Generalize the reader and input interfaces #1533
This was referenced May 9, 2022
This was referenced May 14, 2022
zhejiangxiaomai pushed a commit to zhejiangxiaomai/velox that referenced this issue Jun 21, 2022
…ookincubator#1526)
Summary: SeekableInputStream will be used by the column readers for different file formats in the near future. This PR decouples the ORC RowGroup concept from the SeekableInputStream by renaming seekToRowGroup() to seekToPosition(), so it can be generalized for different file formats that do not support ORC RowGroups. This is the first PR to resolve facebookincubator#1533
Pull Request resolved: facebookincubator#1526
Reviewed By: zzhao0
Differential Revision: D36348871
Pulled By: oerling
fbshipit-source-id: ba0baf16c7951f86a5da6f51471800eed6ba3134
shiyu-bytedance pushed a commit to shiyu-bytedance/velox-1 that referenced this issue Aug 18, 2022
…ookincubator#1526)
Summary: SeekableInputStream will be used by the column readers for different file formats in the near future. This PR decouples the ORC RowGroup concept from the SeekableInputStream by renaming seekToRowGroup() to seekToPosition(), so it can be generalized for different file formats that do not support ORC RowGroups. This is the first PR to resolve facebookincubator#1533
Pull Request resolved: facebookincubator#1526
Reviewed By: zzhao0
Differential Revision: D36348871
Pulled By: oerling
fbshipit-source-id: ba0baf16c7951f86a5da6f51471800eed6ba3134
Is this still relevant? If so, what is blocking it? Is there anything you can do to help move it forward? This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs.
Velox will support multiple file formats such as Parquet, ORC, and Alpha in the future. Their readers, like the DWRF reader, should share common components such as the InputStream classes and the decoding and decompression utility functions. Furthermore, they may directly inherit from the ColumnReader classes that were originally written for DWRF. To prepare for the upcoming native Parquet reader, we propose the following refactoring:
1. Move SeekableInputStream, BufferedInput, and the (de)compressor-related classes to velox::dwio::common::io
2. Move some DWRF utility headers to dwio::common #1619
3. Compatibility support with the DuckDB Parquet reader
4. Generalize the ColumnReader and SelectiveColumnReader classes #1620
5. Tests
The tests for the classes that move to dwio::common will also move to that namespace and its folders.
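To make the direction of this refactoring concrete, here is a minimal sketch of what a format-neutral seekable stream could look like once the ORC row-group concept is removed from it. Everything in it is an assumption made for illustration: the `sketch` namespace, the byte-offset `seekToPosition(uint64_t)` signature, `InMemoryInputStream`, and `readAt` are hypothetical and are not the actual Velox declarations.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <stdexcept>
#include <string>
#include <utility>

namespace sketch {

// Hypothetical format-neutral interface: a position is a plain byte offset,
// so nothing here depends on ORC row groups.
class SeekableInputStream {
 public:
  virtual ~SeekableInputStream() = default;

  // Moves the read cursor to an absolute byte offset (assumed signature).
  virtual void seekToPosition(uint64_t offset) = 0;

  // Copies up to `length` bytes into `out` and returns the number copied.
  virtual size_t read(char* out, size_t length) = 0;
};

// Toy implementation backed by an in-memory buffer, for illustration only.
class InMemoryInputStream : public SeekableInputStream {
 public:
  explicit InMemoryInputStream(std::string data) : data_(std::move(data)) {}

  void seekToPosition(uint64_t offset) override {
    if (offset > data_.size()) {
      throw std::out_of_range("seek past end of stream");
    }
    position_ = static_cast<size_t>(offset);
  }

  size_t read(char* out, size_t length) override {
    const size_t n = std::min(length, data_.size() - position_);
    std::memcpy(out, data_.data() + position_, n);
    position_ += n;
    return n;
  }

 private:
  std::string data_;
  size_t position_ = 0;
};

// A format-specific reader (DWRF, ORC, Parquet, ...) would translate its own
// unit (stripe, row group, page) into a byte offset before calling this.
inline std::string readAt(
    SeekableInputStream& stream, uint64_t offset, size_t length) {
  std::string buffer(length, '\0');
  stream.seekToPosition(offset);
  const size_t n = stream.read(&buffer[0], length);
  buffer.resize(n);
  return buffer;
}

} // namespace sketch
```

With an interface of this shape, each file format keeps its own logic for mapping row groups or pages to offsets, and the shared stream code in dwio::common only ever sees positions.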