[enhance](nereids) add rewrite rule SplitJoinForNullSkew #44357
Thank you for your contribution to Apache Doris. Please clearly describe your PR:
|
3985e79 to 131b44a
run buildall
7c0ed99 to 82e7176
run buildall
c96d557 to 1b40cc5
run buidall |
run buildall
run p0
TPC-H: Total hot run time: 39963 ms
TPC-DS: Total hot run time: 198243 ms
ClickBench: Total hot run time: 32.02 s
public Rule build() {
    return logicalJoin(any(), any())
            .when(join -> join.getJoinType().isLeftJoin())
            .when(join -> join.getHashJoinConjuncts().size() == 1)
should check mark join conjuncts
            .when(join -> join.getJoinType().isLeftJoin())
            .when(join -> join.getHashJoinConjuncts().size() == 1)
            .thenApply(ctx -> {
                Set<Integer> enableNereidsRules = ctx.cascadesContext.getConnectContext()
Do we need to add this? enableRules was removed in PR #44769.
@Override
public Rule build() {
    return logicalJoin(any(), any())
            .when(join -> join.getJoinType().isLeftJoin())
Why not also process right outer joins?
Plan deepCopyJoin = LogicalPlanDeepCopier.INSTANCE.deepCopy(newJoin, new DeepCopierContext());

// avoid duplicate application of rules
if (left instanceof LogicalFilter) {
Why not just check whether the filter's conjuncts contain an IS NOT NULL predicate?
You are right, I made it too complicated.
            .toRule(RuleType.JOIN_SPLIT_FOR_NULL_SKEW);
}

private Plan splitJoin(LogicalJoin<Plan, Plan> join) {
Should not rewrite if leftExpr is already not nullable.
done
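The preconditions raised in this review thread can be collected into one standalone predicate. This is only a sketch: `SplitJoinGuardSketch`, its `JoinType` enum, and the `shouldSplit` parameters are hypothetical names and are not Doris classes. It also assumes that "check mark join conjuncts" means the rule should be skipped when any mark join conjunct is present.

```java
public class SplitJoinGuardSketch {
    // Hypothetical stand-in for the planner's join type; not the Doris enum.
    public enum JoinType { INNER, LEFT_OUTER, RIGHT_OUTER, FULL_OUTER }

    /**
     * Returns true only when every condition discussed in the review holds:
     * a left outer join, exactly one hash join conjunct, no other join
     * conjuncts, no mark join conjuncts (assumption), and a nullable
     * left-side join key (otherwise there is no null skew to split off).
     */
    public static boolean shouldSplit(JoinType type, int hashConjuncts, int otherConjuncts,
            int markConjuncts, boolean leftKeyNullable) {
        return type == JoinType.LEFT_OUTER
                && hashConjuncts == 1
                && otherConjuncts == 0
                && markConjuncts == 0
                && leftKeyNullable;
    }

    public static void main(String[] args) {
        System.out.println(shouldSplit(JoinType.LEFT_OUTER, 1, 0, 0, true));
        System.out.println(shouldSplit(JoinType.LEFT_OUTER, 1, 0, 0, false));
    }
}
```

Keeping the nullability check in the guard means a join whose left key can never be null is left untouched, which is the case the reviewer asked to exclude.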
ae357a1 to ff5921c
run buildall
ff5921c to 94c526a
run buildall
TPC-H: Total hot run time: 39859 ms
TPC-DS: Total hot run time: 198205 ms
ClickBench: Total hot run time: 32.28 s
b688cb2 to 9bcb488
run buildall
TPC-H: Total hot run time: 31583 ms
TPC-DS: Total hot run time: 185766 ms
ClickBench: Total hot run time: 30.46 s
861da34 to 6bb80b0
run buildall
TPC-H: Total hot run time: 31676 ms
TPC-DS: Total hot run time: 190308 ms
6bb80b0 to 5c1c87b
ClickBench: Total hot run time: 30.13 s
5c1c87b to 2b18302
run buildall
TPC-H: Total hot run time: 31933 ms
TPC-DS: Total hot run time: 184228 ms
ClickBench: Total hot run time: 30.22 s
rule can execute
Avoid duplicate application of rules
add comment
add test
remove unrelated code
add test
modify code by comments
fix compile
2b18302 to 2068444
What problem does this PR solve?
Add a transform rule, SplitJoinForNullSkew.
It only supports left joins that have exactly one hash join conjunct and no other join conjuncts.
Null values are sometimes heavily skewed on the join key, which can lead to prolonged execution times. The rule splits the join into two parts based on whether the join key is null, which can accelerate such queries.
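Why the split preserves results can be illustrated with a small in-memory model. The sketch below is not the Nereids implementation: `NullSkewSplitSketch`, `leftJoin`, and `splitLeftJoin` are hypothetical names, rows are reduced to a nullable `Integer` key, and the null-key part is modeled as direct null padding, which is an assumption about the rewritten plan shape. It relies only on the SQL rule that a null key never satisfies an equality join condition.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class NullSkewSplitSketch {
    /** Plain LEFT JOIN on one nullable key; each output row is rendered as "key -> rightValue". */
    public static List<String> leftJoin(List<Integer> leftKeys, Map<Integer, List<String>> right) {
        List<String> out = new ArrayList<>();
        for (Integer key : leftKeys) {
            // A null key can never equal anything, so it never finds a match.
            List<String> matches = (key == null) ? null : right.get(key);
            if (matches == null || matches.isEmpty()) {
                out.add(key + " -> null"); // unmatched left row, right side padded with null
            } else {
                for (String match : matches) {
                    out.add(key + " -> " + match);
                }
            }
        }
        return out;
    }

    /** Same result computed in two parts: non-null keys are joined, null keys are only padded. */
    public static List<String> splitLeftJoin(List<Integer> leftKeys, Map<Integer, List<String>> right) {
        List<Integer> nonNullKeys = new ArrayList<>();
        int nullRows = 0;
        for (Integer key : leftKeys) {
            if (key == null) {
                nullRows++;
            } else {
                nonNullKeys.add(key);
            }
        }
        // Part 1: only rows with a non-null key take part in the hash join.
        List<String> out = new ArrayList<>(leftJoin(nonNullKeys, right));
        // Part 2: rows with a null key skip the join entirely.
        for (int i = 0; i < nullRows; i++) {
            out.add("null -> null");
        }
        return out;
    }

    public static void main(String[] args) {
        List<Integer> left = Arrays.asList(1, null, 2, null, null);
        Map<Integer, List<String>> right = new HashMap<>();
        right.put(1, Arrays.asList("a", "b"));

        List<String> direct = leftJoin(left, right);
        List<String> split = splitLeftJoin(left, right);
        // Row order is not significant for a join result, so compare after sorting.
        Collections.sort(direct);
        Collections.sort(split);
        System.out.println(direct.equals(split)); // prints true
    }
}
```

In a distributed plan the benefit would come from the second part: rows with a null key no longer need to be shuffled to a single hash partition, so they cannot overload one instance.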
Release note
None
Check List (For Author)
Test
Behavior changed:
Does this need documentation?
Check List (For Reviewer who merge this PR)