feat: comet native scan improvements - Dynamic Partition Pruning #3546
Open
Shekharrajak wants to merge 14 commits into apache:main
Force-pushed from 5099c01 to 00fc8ce
Force-pushed from cb700d9 to f8dd8c8
Shekharrajak commented on Feb 23, 2026
: : : +- BroadcastHashJoin
: : : :- Filter
: : : : +- ColumnarToRow
: : : : +- Scan parquet spark_catalog.default.store_returns [COMET: Native DataFusion scan does not support subqueries/dynamic pruning]
Contributor
Author
The CI checks were failing, so these needed to be updated.
Member
I'd suggest not changing the fallback message in this PR and having a follow-on PR to improve the message, so that this PR is smaller and focuses only on the functionality.
Another option is to add a new config to feature-gate the DPP support and disable it for now in the stability suite.
Contributor
Author
Thanks @andygrove for the review. The unit tests were failing, so I had to make the update in this PR itself to get all checks green.
andygrove reviewed on Mar 16, 2026
val scanImpl = COMET_NATIVE_SCAN_IMPL.get()

// native_datafusion + DPP requires AQE. Without AQE, DPP subqueries aren't prepared
// before the scan tries to use their results, causing "has not finished" errors.
Contributor
Author
Please trigger the CI checks.
Contributor
Author
The CI check failure is due to a network issue. How can we re-trigger those 2 failing CI workflows?
Which issue does this PR close?
Ref #3510
Rationale for this change
CometNativeScanExec currently falls back to Spark when Dynamic Partition Pruning (DPP) is present. This limits performance for star-schema queries that rely on DPP to prune partitions at runtime based on dimension table filters.
What changes are included in this PR?
Added DPP support to CometNativeScanExec for V1 native scans
Implemented partition filter evaluation from DPP subqueries
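For readers unfamiliar with DPP, the idea behind partition filter evaluation can be sketched in plain Scala. This is a simplified model and not the implementation in this PR: `PartitionDir`, `DynamicPruningFilter`, and `DppSketch.prune` are hypothetical names, and the set of allowed values stands in for the result of a DPP subquery over the dimension table.

```scala
// Hypothetical model of one partition directory of a fact table;
// these are not Comet's or Spark's actual types.
final case class PartitionDir(values: Map[String, String], files: Seq[String])

// Hypothetical stand-in for a DPP filter: a partition column plus the set of
// join-key values produced at runtime by the dimension-side subquery.
final case class DynamicPruningFilter(column: String, allowedValues: Set[String])

object DppSketch {
  // Keep only the partitions whose value for every filtered column is in the
  // allowed set. In this sketch, a partition that lacks a filtered column is dropped.
  def prune(
      partitions: Seq[PartitionDir],
      filters: Seq[DynamicPruningFilter]): Seq[PartitionDir] =
    partitions.filter { p =>
      filters.forall(f => p.values.get(f.column).exists(f.allowedValues.contains))
    }
}
```

With two partitions keyed by a hypothetical `sold_date` column and a filter that allows only one date, `prune` returns just the matching partition, so the scan reads only that partition's files.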
How are these changes tested?
Added DPP benchmark comparing Spark vs Comet native scan performance
Unit tests