[opt](nereids)prune unused column after push down common column from agg #46627
run buildall
TPC-H: Total hot run time: 30745 ms
morrySnow
left a comment
Please add a test case.
run buildall
TPC-H: Total hot run time: 28707 ms
run buildall
TPC-H: Total hot run time: 26186 ms
TPC-DS: Total hot run time: 182387 ms
run buildall
Expr result = ExpressionTranslator.translate(e, context);
if (result == null) {
    throw new RuntimeException("translate " + e + " failed");
}
return result;
Could we get more information about why the translation failed and print it in the log?
This message is logged by NereidsPlanner.
        .flatMap(expr -> expr.getInputSlots().stream())
        .collect(Collectors.toSet());
for (NamedExpression expr : project.getProjects()) {
    if (!(expr instanceof SlotReference) || newInputSlots.contains(expr)) {
Why is !(expr instanceof SlotReference) needed?
If it is not a slot, it is an alias. An alias must be used by the Aggregate in one of two ways:
- directly used by the Aggregate after the rewrite
- used by an inferred common expression
In both cases, the alias should not be pruned.
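The keep-or-prune rule described in this reply can be sketched in isolation. This is a minimal illustration, not the Doris implementation: the nested types below are hypothetical stand-ins for the Nereids classes, slots are compared by name for simplicity, and it assumes the condition in the diff selects the expressions to keep.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class AliasPruningSketch {
    // Hypothetical stand-ins for the Nereids expression types; not the Doris classes.
    interface NamedExpression {
        String name();
    }

    static final class SlotReference implements NamedExpression {
        private final String name;

        SlotReference(String name) {
            this.name = name;
        }

        public String name() {
            return name;
        }
    }

    static final class Alias implements NamedExpression {
        private final String definition;
        private final String name;

        Alias(String definition, String name) {
            this.definition = definition;
            this.name = name;
        }

        public String definition() {
            return definition;
        }

        public String name() {
            return name;
        }
    }

    // An alias is always kept; a plain slot is kept only if the aggregate still reads it.
    static List<String> keptOutputs(List<NamedExpression> projects, Set<String> newInputSlots) {
        List<String> kept = new ArrayList<>();
        for (NamedExpression expr : projects) {
            if (!(expr instanceof SlotReference) || newInputSlots.contains(expr.name())) {
                kept.add(expr.name());
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<NamedExpression> projects = Arrays.asList(
                new SlotReference("A"), new SlotReference("B"), new Alias("A+B", "C"));
        // The aggregate reads only C, so A and B are pruned.
        System.out.println(keptOutputs(projects, new HashSet<>(Arrays.asList("C"))));
        // The aggregate also reads A directly, so A survives alongside C.
        System.out.println(keptOutputs(projects, new HashSet<>(Arrays.asList("C", "A"))));
    }
}
```

The second call shows why the membership check is still needed for slots: a plain column survives whenever the rewritten aggregate continues to reference it.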
TPC-H: Total hot run time: 31992 ms
TPC-DS: Total hot run time: 193426 ms
ClickBench: Total hot run time: 30.76 s
run buildall
TPC-H: Total hot run time: 32550 ms
TPC-DS: Total hot run time: 195265 ms
ClickBench: Total hot run time: 31 s
PR approved by at least one committer and no changes requested.
PR approved by anyone and no changes requested.
What problem does this PR solve?
After common expressions are extracted for an aggregate, the underlying projection may project redundant columns.
For example:
Original plan:
Agg(groupKey=[A+B, A+B+1])
--> project(A, B)
After extraction, "A+B as C" is detected as a common expression, and the plan becomes:
Agg(groupKey=[C, C+1])
--> project(A, B, A+B as C)
Here A and B should not be projected, since they are no longer used. The optimal plan is therefore:
Agg(groupKey=[C, C+1])
--> project(A+B as C)
Related PR: #40473
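To make the example plan concrete, the sketch below recomputes the slots an aggregate reads before and after the rewrite, using a toy expression tree. All type and method names here are hypothetical and only loosely modeled on the getInputSlots call visible in the diff; this is an illustration under those assumptions, not the Doris code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;
import java.util.stream.Collectors;

public class InputSlotSketch {
    // Toy expression tree; hypothetical names, not the Doris classes.
    interface Expr {
        Set<String> inputSlots();
    }

    static final class Slot implements Expr {
        final String name;

        Slot(String name) {
            this.name = name;
        }

        public Set<String> inputSlots() {
            Set<String> slots = new TreeSet<>();
            slots.add(name);
            return slots;
        }
    }

    static final class IntLiteral implements Expr {
        final int value;

        IntLiteral(int value) {
            this.value = value;
        }

        public Set<String> inputSlots() {
            return new TreeSet<>(); // a literal reads no column
        }
    }

    static final class Add implements Expr {
        final Expr left;
        final Expr right;

        Add(Expr left, Expr right) {
            this.left = left;
            this.right = right;
        }

        public Set<String> inputSlots() {
            Set<String> slots = new TreeSet<>(left.inputSlots());
            slots.addAll(right.inputSlots());
            return slots;
        }
    }

    // Union of the slots read by all aggregate expressions.
    static Set<String> slotsUsedBy(List<Expr> exprs) {
        return exprs.stream()
                .flatMap(e -> e.inputSlots().stream())
                .collect(Collectors.toCollection(TreeSet::new));
    }

    // Plain projected columns that the aggregate no longer reads.
    static List<String> unusedColumns(List<String> projectedColumns, Set<String> used) {
        List<String> unused = new ArrayList<>(projectedColumns);
        unused.removeAll(used);
        return unused;
    }

    public static void main(String[] args) {
        Expr a = new Slot("A");
        Expr b = new Slot("B");
        Expr c = new Slot("C");
        // Before extraction: groupKey=[A+B, A+B+1]
        List<Expr> before = Arrays.asList(new Add(a, b), new Add(new Add(a, b), new IntLiteral(1)));
        // After extraction: groupKey=[C, C+1]
        List<Expr> after = Arrays.asList(c, new Add(c, new IntLiteral(1)));
        System.out.println(slotsUsedBy(before)); // [A, B]
        System.out.println(slotsUsedBy(after));  // [C]
        System.out.println(unusedColumns(Arrays.asList("A", "B"), slotsUsedBy(after))); // [A, B]
    }
}
```

After the rewrite the aggregate reads only C, so both plain columns of the projection are reported as unused, which matches the optimal plan in the description.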
Release note
None