Skip to content

[Python] RowGroup filtering on file level #17793

@asfimport

Description

@asfimport

We can build upon the API defined in fastparquet for defining RowGroup filters: https://github.com/dask/fastparquet/blob/master/fastparquet/api.py#L296-L300 and translate them into the C++ enums we will define in https://issues.apache.org/jira/browse/PARQUET-1158 . This should enable us to provide the user with a simple predicate pushdown API that we can extend in the background from RowGroup to Page level later on.

Reporter: Uwe Korn / @xhochy
Assignee: Joris Van den Bossche / @jorisvandenbossche

Related issues:

PRs and other links:

Note: This issue was originally created as ARROW-1796. Please see the migration documentation for further details.

Metadata

Metadata

Type

No type

Projects

No projects

Milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions