Push additional parquet filtering into the parquet scan [EPIC]  #3147

@alamb

Is your feature request related to a problem or challenge? Please describe what you are trying to do.
The more filtering that can be pushed into the parquet reader, the faster a query will generally run, because less work is spent decoding and processing data that would later be filtered out of the plan.
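
To make the benefit concrete, here is a minimal conceptual sketch in plain Python. It is not DataFusion or parquet-rs code: `RowGroup`, `scan_with_pushdown`, and `scan_then_filter` are hypothetical names invented for illustration, and the only assumption is that each row group carries min/max statistics for the filtered column. It compares how many rows have to be decoded when a `value > lower_bound` predicate is checked against those statistics before decoding versus applied after a full scan.

```python
from dataclasses import dataclass


@dataclass
class RowGroup:
    """Hypothetical stand-in for a parquet row group: min/max statistics plus rows."""
    min_value: int
    max_value: int
    rows: list


def scan_with_pushdown(row_groups, lower_bound):
    """Evaluate `value > lower_bound`, skipping row groups the statistics rule out."""
    matched, decoded = [], 0
    for rg in row_groups:
        if rg.max_value <= lower_bound:
            continue  # no row in this group can match, so it is never decoded
        decoded += len(rg.rows)
        matched.extend(v for v in rg.rows if v > lower_bound)
    return matched, decoded


def scan_then_filter(row_groups, lower_bound):
    """Baseline: decode every row first, then apply the filter afterwards."""
    decoded_rows = [v for rg in row_groups for v in rg.rows]
    return [v for v in decoded_rows if v > lower_bound], len(decoded_rows)


# Three row groups of ten rows each, covering the values 1 through 30.
row_groups = [
    RowGroup(1, 10, list(range(1, 11))),
    RowGroup(11, 20, list(range(11, 21))),
    RowGroup(21, 30, list(range(21, 31))),
]

pushed_rows, pushed_decoded = scan_with_pushdown(row_groups, 25)
plain_rows, plain_decoded = scan_then_filter(row_groups, 25)
```

Both functions return the same matching rows, but in this toy data the pushdown variant decodes only one of the three row groups. Real scans can prune at finer granularity than row groups, which this sketch does not model.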

Several ongoing workstreams will eventually push substantial additional filtering into the parquet scan, which should significantly improve DataFusion performance. I wanted to capture them here to provide more visibility.

cc @Ted-Jiang @tustvold @thinkharderdev

Describe the solution you'd like
Here are some of the tasks I have collected. There are likely more -- please add them (either directly or via comments)
