-
Notifications
You must be signed in to change notification settings - Fork 3.7k
[opt](ditributed-plan) bucket shuffle down grade only check not shuffle side #59506
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
Thank you for your contribution to Apache Doris. Please clearly describe your PR:
|
|
run buildall |
TPC-H: Total hot run time: 35730 ms |
TPC-DS: Total hot run time: 174649 ms |
ClickBench: Total hot run time: 27.1 s |
FE UT Coverage ReportIncrement line coverage |
2e31cce to
eb05d6b
Compare
|
run buildall |
TPC-H: Total hot run time: 31194 ms |
TPC-DS: Total hot run time: 172753 ms |
FE UT Coverage ReportIncrement line coverage |
ClickBench: Total hot run time: 26.97 s |
FE Regression Coverage ReportIncrement line coverage |
1 similar comment
FE Regression Coverage ReportIncrement line coverage |
|
run NonConcurrent |
|
PR approved by at least one committer and no changes requested. |
|
PR approved by anyone and no changes requested. |
|
run buildall |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
This pull request refactors the bucket shuffle join strategy logic in the Nereids query planner to improve how bucket shuffle downgrades are detected and handled. The changes introduce explicit control flags to determine when to check for bucket shuffle downgrades, replacing the previous scattered downgrade checks with a more structured approach.
Changes:
- Introduced
shouldCheckLeftBucketDownGradeandshouldCheckRightBucketDownGradeflags to control when downgrade checks occur - Refactored conditional logic to consolidate bucket shuffle downgrade checks at the end of the method instead of inline
- Updated test output files to reflect improved bucket shuffle join selection in TPC-DS queries
Reviewed changes
Copilot reviewed 39 out of 39 changed files in this pull request and generated 5 comments.
Show a summary per file
| File | Description |
|---|---|
| ChildrenPropertiesRegulator.java | Refactors the visitPhysicalHashJoin method to use control flags for bucket shuffle downgrade checks, improving code organization and correctness |
| query72.out | Updated query plan output showing improved join strategy selection with bucketShuffle joins |
| query54.out | Updated query plan output showing improved join strategy selection with shuffleBucket joins |
| query82.out (multiple variants) | Updated query plan output showing broadcast to bucketShuffle join strategy improvement |
| query78.out (multiple variants) | Updated query plan output showing shuffle to bucketShuffle join strategy improvement |
| query5.out (multiple variants) | Updated query plan output showing shuffle to bucketShuffle join strategy improvement |
| query39.out (multiple variants) | Updated query plan output showing broadcast to bucketShuffle join strategy improvement |
| query37.out (multiple variants) | Updated query plan output showing broadcast to bucketShuffle join strategy improvement |
| query22.out (multiple variants) | Updated query plan output showing broadcast to bucketShuffle join strategy improvement |
| query21.out (multiple variants) | Updated query plan output showing broadcast to bucketShuffle join strategy improvement |
| query80.out (multiple variants) | Updated query plan output showing shuffle to bucketShuffle join strategy improvement |
| query75.out (multiple variants) | Updated query plan output showing shuffle to bucketShuffle join strategy improvement |
| query49.out (multiple variants) | Updated query plan output showing shuffle to bucketShuffle join strategy improvement |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| } else if ((leftHashSpec.getShuffleType() == ShuffleType.STORAGE_BUCKETED | ||
| && rightHashSpec.getShuffleType() == ShuffleType.EXECUTION_BUCKETED)) { | ||
| if (children.get(0).getPlan() instanceof PhysicalDistribute) { | ||
| shouldCheckrightBucketDownGrade = true; |
Copilot
AI
Jan 12, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is a typo in the variable name. The variable should be named "shouldCheckRightBucketDownGrade" (capital 'R'), not "shouldCheckrightBucketDownGrade" (lowercase 'r'). This inconsistency with the naming of "shouldCheckLeftBucketDownGrade" makes the code harder to read and maintain.
| (DistributionSpecHash) requiredProperties.get(0).getDistributionSpec(), | ||
| (DistributionSpecHash) requiredProperties.get(1).getDistributionSpec())); | ||
| if (children.get(0).getPlan() instanceof PhysicalDistribute) { | ||
| shouldCheckrightBucketDownGrade = true; |
Copilot
AI
Jan 12, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is a typo in the variable name. The variable should be named "shouldCheckRightBucketDownGrade" (capital 'R'), not "shouldCheckrightBucketDownGrade" (lowercase 'r'). This inconsistency with the naming of "shouldCheckLeftBucketDownGrade" makes the code harder to read and maintain.
| (DistributionSpecHash) requiredProperties.get(0).getDistributionSpec(), | ||
| (DistributionSpecHash) requiredProperties.get(1).getDistributionSpec())); | ||
| } | ||
| if (shouldCheckrightBucketDownGrade && isBucketShuffleDownGrade(rightChild)) { |
Copilot
AI
Jan 12, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is a typo in the variable name. The variable should be named "shouldCheckRightBucketDownGrade" (capital 'R'), not "shouldCheckrightBucketDownGrade" (lowercase 'r'). This inconsistency with the naming of "shouldCheckLeftBucketDownGrade" makes the code harder to read and maintain.
| Optional<PhysicalProperties> updatedForRight = Optional.empty(); | ||
|
|
||
| boolean shouldCheckLeftBucketDownGrade = false; | ||
| boolean shouldCheckrightBucketDownGrade = false; |
Copilot
AI
Jan 12, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is a typo in the variable name. The variable should be named "shouldCheckRightBucketDownGrade" (capital 'R'), not "shouldCheckrightBucketDownGrade" (lowercase 'r'). This inconsistency with the naming of "shouldCheckLeftBucketDownGrade" makes the code harder to read and maintain.
| boolean shouldCheckrightBucketDownGrade = false; | |
| boolean shouldCheckRightBucketDownGrade = false; |
| } else if ((leftHashSpec.getShuffleType() == ShuffleType.EXECUTION_BUCKETED | ||
| && rightHashSpec.getShuffleType() == ShuffleType.STORAGE_BUCKETED)) { | ||
| if (children.get(0).getPlan() instanceof PhysicalDistribute) { | ||
| shouldCheckrightBucketDownGrade = true; |
Copilot
AI
Jan 12, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is a typo in the variable name. The variable should be named "shouldCheckRightBucketDownGrade" (capital 'R'), not "shouldCheckrightBucketDownGrade" (lowercase 'r'). This inconsistency with the naming of "shouldCheckLeftBucketDownGrade" makes the code harder to read and maintain.
TPC-H: Total hot run time: 31248 ms |
TPC-DS: Total hot run time: 174030 ms |
FE Regression Coverage ReportIncrement line coverage |
FE UT Coverage ReportIncrement line coverage |
FE Regression Coverage ReportIncrement line coverage |
2 similar comments
FE Regression Coverage ReportIncrement line coverage |
FE Regression Coverage ReportIncrement line coverage |
FE Regression Coverage ReportIncrement line coverage |
…le side (apache#59506) Related PR: apache#36784 This pull request refactors the logic in the `visitPhysicalHashJoin` method to improve the handling and enforcement of bucket shuffle join strategies in the Nereids planner. It introduces new checks and flags to more accurately determine when bucket shuffle downgrades are needed, and updates the plan output to reflect the use of `bucketShuffle` joins in various TPC-DS queries. Key changes include: ### Planner logic improvements * Added `shouldCheckLeftBucketDownGrade` and `shouldCheckrightBucketDownGrade` flags to control when to check for bucket shuffle downgrades, leading to more precise enforcement of shuffle strategies. * Refactored conditional logic to use these new flags, replacing previous direct downgrade checks, and improved the handling of cases where shuffle key orders do not match. These changes enhance the accuracy and efficiency of join planning in distributed query execution.
What problem does this PR solve?
Related PR: #36784
Problem Summary:
This pull request refactors the logic in the
visitPhysicalHashJoinmethod to improve the handling and enforcement of bucket shuffle join strategies in the Nereids planner. It introduces new checks and flags to more accurately determine when bucket shuffle downgrades are needed, and updates the plan output to reflect the use ofbucketShufflejoins in various TPC-DS queries.Key changes include:
Planner logic improvements
shouldCheckLeftBucketDownGradeandshouldCheckrightBucketDownGradeflags to control when to check for bucket shuffle downgrades, leading to more precise enforcement of shuffle strategies.These changes enhance the accuracy and efficiency of join planning in distributed query execution.
Release note
None
Check List (For Author)
Test
Behavior changed:
Does this need documentation?
Check List (For Reviewer who merge this PR)