Closed
Description
We can build on the API that fastparquet defines for RowGroup filters (https://github.com/dask/fastparquet/blob/master/fastparquet/api.py#L296-L300) and translate those filters into the C++ enums we will define in https://issues.apache.org/jira/browse/PARQUET-1158. This should let us give users a simple predicate pushdown API that we can later extend internally from the RowGroup level to the Page level.
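To make the idea concrete, here is a minimal pure-Python sketch of RowGroup-level filtering driven by min/max statistics. It assumes the fastparquet-style filter syntax of `(column, op, value)` tuples combined with AND. The function name `row_group_may_match` and the `stats` dictionary layout are hypothetical and exist only for illustration; they are not part of fastparquet, pyarrow, or the planned C++ API.

```python
# Hypothetical sketch: these names are illustrative only and are not
# part of fastparquet, pyarrow, or parquet-cpp.

def row_group_may_match(stats, filters):
    """Return False only when min/max statistics prove no row can match.

    stats:   {column: (min_value, max_value)} for a single row group
    filters: list of (column, op, value) tuples, combined with AND
    """
    for column, op, value in filters:
        if column not in stats:
            continue  # no statistics available: cannot rule the row group out
        lo, hi = stats[column]
        if op in ("=", "==") and not (lo <= value <= hi):
            return False
        if op == "<" and not (lo < value):
            return False
        if op == "<=" and not (lo <= value):
            return False
        if op == ">" and not (hi > value):
            return False
        if op == ">=" and not (hi >= value):
            return False
        if op == "!=" and lo == hi == value:
            return False  # every value in the row group equals `value`
        if op == "in" and not any(lo <= v <= hi for v in value):
            return False
    return True


# Three row groups described only by the min/max of column "x".
row_groups = [
    {"x": (0, 9)},
    {"x": (10, 19)},
    {"x": (20, 29)},
]
filters = [("x", ">=", 12), ("x", "<", 20)]

# Indices of the row groups that still have to be read.
kept = [i for i, stats in enumerate(row_groups)
        if row_group_may_match(stats, filters)]
```

Statistics can only exclude a row group, never confirm that every row matches, so rows in the kept row groups would still need to be filtered after reading. Extending the same check from RowGroup to Page statistics would not change the user-facing filter list.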
Reporter: Uwe Korn / @xhochy
Assignee: Joris Van den Bossche / @jorisvandenbossche
Related issues:
- [C++] Basic RowGroup filtering (is related to)
- [Python][Dataset] Support using dataset API in pyarrow.parquet with a minimal ParquetDataset shim (depends upon)
Note: This issue was originally created as ARROW-1796. Please see the migration documentation for further details.