-
Notifications
You must be signed in to change notification settings - Fork 3.7k
[feat](unique function) add project for unique function #48449
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
Thank you for your contribution to Apache Doris. Please clearly describe your PR:
|
1a14013 to
3e77ec9
Compare
3e77ec9 to
3d89334
Compare
|
run buildall |
TPC-H: Total hot run time: 34297 ms |
TPC-DS: Total hot run time: 187103 ms |
ClickBench: Total hot run time: 33.03 s |
FE Regression Coverage ReportIncrement line coverage |
|
run buildall |
TPC-H: Total hot run time: 34135 ms |
TPC-DS: Total hot run time: 186740 ms |
ClickBench: Total hot run time: 32.24 s |
FE UT Coverage ReportIncrement line coverage |
FE Regression Coverage ReportIncrement line coverage |
| // extract unique function which exist multiple times from targets, | ||
| // then alias the unique function and put them into a child project, | ||
| // then rewrite targets with the alias names. | ||
| private <T extends Expression> Optional<Pair<List<T>, LogicalProject<Plan>>> rewriteExpressions( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
add UT for these functions
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
have added
| // if a unique function exists multiple times in the targets, then add a project to alias it. | ||
| private List<NamedExpression> tryGenUniqueFunctionAlias(Collection<? extends Expression> targets) { | ||
| Map<UniqueFunction, Integer> unqiueFunctionCounter = Maps.newLinkedHashMap(); | ||
| targets.forEach(target -> target.foreach(e -> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
use for to instead forEach
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fix
|
run buildall |
1 similar comment
|
run buildall |
TPC-H: Total hot run time: 34441 ms |
TPC-DS: Total hot run time: 186731 ms |
ClickBench: Total hot run time: 32.55 s |
FE Regression Coverage ReportIncrement line coverage |
6cd8006 to
4b43e8f
Compare
|
run buildall |
TPC-H: Total hot run time: 34170 ms |
TPC-DS: Total hot run time: 187515 ms |
ClickBench: Total hot run time: 32.9 s |
FE UT Coverage ReportIncrement line coverage |
|
run external |
|
PR approved by at least one committer and no changes requested. |
|
PR approved by anyone and no changes requested. |
### What problem does this PR solve? If an unique function exists multiple times, then be will calculate it multiple times for each row, so it will be error. for example: `filter(random() between 10 and 20)`, after rewrite the `between`, it will get `filter(random() >= 10 and random() <= 20)`, this will contains two random in one expression, the two RANDOM should be one, so we add a project, then we can get `filter(k >= 10 and k <= 20) -> project(random() as k)` this PR also fix BETWEEN expression bug introduced by apache#55407 - between shouldn't be PropagateNullable because FoldConstantRuleOnFE will rewrite a propagate nullable expression to null if any children is NULL, but for sql `10 between null and 5` should be `FALSE`, not `NULL`; - after analyzed between expression, it will get an AND expression, then anlyzed join other conjunctions, need to extract conjuncts of each analyzed other conjunction (that is flattern AND).
…`betweenAnd` expression class and included some work related to unique functions. This time, the pick only included the changes related to `betweenAnd`.
What problem does this PR solve?
If an unique function exists multiple times, then be will calculate it multiple times for each row, so it will be error.
for example:
filter(random() between 10 and 20), after rewrite thebetween, it will getfilter(random() >= 10 and random() <= 20), this will contains two random in one expression, the two RANDOM should be one, so we add a project, then we can getfilter(k >= 10 and k <= 20) -> project(random() as k)this PR also fix BETWEEN expression bug introduced by #55407
10 between null and 5should beFALSE, notNULL;Check List (For Author)
Test
Behavior changed:
Does this need documentation?
Check List (For Reviewer who merge this PR)