Skip to content

Conversation

@YangShaw
Copy link
Contributor

@YangShaw YangShaw commented Nov 18, 2022

Proposed changes

Issue Number: close #xxx

Problem summary

support WindowFunction in nereids;

Checklist(Required)

  1. Does it affect the original behavior:
    • Yes
    • No
    • I don't know
  2. Has unit tests been added:
    • Yes
    • No
    • No Need
  3. Has document been added or modified:
    • Yes
    • No
    • No Need
  4. Does it need to update dependencies:
    • Yes
    • No
  5. Are there any changes that cannot be rolled back:
    • Yes (If Yes, please explain WHY)
    • No

Further comments

If this is a relatively large or complex change, kick off the discussion at dev@doris.apache.org by explaining why you chose the solution you did and what alternatives you considered, etc...

@github-actions github-actions bot added area/nereids area/planner Issues or PRs related to the query planner labels Nov 18, 2022
@YangShaw YangShaw force-pushed the analytic_node branch 3 times, most recently from 2deb44f to 04696e9 Compare November 21, 2022 17:20
@hello-stephen
Copy link
Contributor

hello-stephen commented Dec 28, 2022

TeamCity pipeline, clickbench performance test result:
the sum of best hot time: 35.21 seconds
stream load tsv: 462 seconds loaded 74807831229 Bytes, about 154 MB/s
stream load json: 39 seconds loaded 2358488459 Bytes, about 57 MB/s
stream load orc: 68 seconds loaded 1101869774 Bytes, about 15 MB/s
stream load parquet: 28 seconds loaded 861443392 Bytes, about 29 MB/s
https://doris-community-test-1308700295.cos.ap-hongkong.myqcloud.com/tmp/20230213131555_clickbench_pr_95395.html

@YangShaw YangShaw marked this pull request as ready for review January 18, 2023 05:49
@YangShaw YangShaw force-pushed the analytic_node branch 3 times, most recently from 09c9cda to f81180b Compare January 31, 2023 03:33
Comment on lines 51 to 58
/**
* translate WindowFrame to AnalyticWindow
*/
default AnalyticWindow translateWindowFrame(WindowFrame windowFrame, PlanTranslatorContext context) {
FrameUnitsType frameUnits = windowFrame.getFrameUnits();
FrameBoundary leftBoundary = windowFrame.getLeftBoundary();
FrameBoundary rightBoundary = windowFrame.getRightBoundary();

AnalyticWindow.Type type = frameUnits == FrameUnitsType.ROWS
? AnalyticWindow.Type.ROWS : AnalyticWindow.Type.RANGE;

AnalyticWindow.Boundary left = withFrameBoundary(leftBoundary, context);
AnalyticWindow.Boundary right = withFrameBoundary(rightBoundary, context);

return new AnalyticWindow(type, left, right);
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i think it it better that put translate functions in translator

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

due to that AnalyticWindow doesn't extend Expr, it is hard to use visitWindowFrame() in ExpressionTranslator. And in consider of that the code related of translate windowFrame is long, I put these functions in Window interface rather than PhysicalPlanTranslator.

Comment on lines 1533 to 1563
if (containsWindowExpressions(projects)) {
return new LogicalWindow<>(projects, input);
}
return new LogicalProject<>(projects, Collections.emptyList(), input, isDistinct);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe we should generate LogicalWindow after we do bind.
we chould just treat windowExpression as a scalar expression.

  • if we use window function without agg. Then window function will in Projections.
    we could generate LogicalProject(LogicalWindow(LogicalProject)) to handle all case.
  • if we use window function in agg. Then window function will in Agg's output expressions. After normalize agg it will present in the Project topping on Aggregate. and then we could do the same thing as first scene.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nice idea~

@YangShaw YangShaw force-pushed the analytic_node branch 2 times, most recently from 780f3c9 to 3c3cd0f Compare February 10, 2023 01:35
fuyu29 added 2 commits February 11, 2023 13:23
* which is an UnboundFunction at first and will be analyzed as relevant BoundFunction
* (can be a WindowFunction or AggregateFunction) after BindFunction.
*/
public class WindowExpression extends Expression implements PropagateNullable {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
public class WindowExpression extends Expression implements PropagateNullable {
public class WindowExpression extends Expression {
@Override
public boolean nullable() {
return function.nullable();
}

Comment on lines 78 to 81
@Override
public DataType getDataType() {
return IntegerType.INSTANCE;
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

remove this function

Comment on lines 64 to 67
@Override
public DataType getDataType() {
return IntegerType.INSTANCE;
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

remove this

Comment on lines 93 to 96
SELECT *, row_number() over(partition by c1)
FROM (
SELECT *, row_number() over(partition by c2)
FROM window_test
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

need order by in over to get a stable result

FunctionSignature.ret(LargeIntType.INSTANCE).args(LargeIntType.INSTANCE)
);

private Expression buckets;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

bucket is not use anymore

@github-actions github-actions bot added the approved Indicates a PR has been approved by one committer. label Feb 13, 2023
@github-actions
Copy link
Contributor

PR approved by at least one committer and no changes requested.

@github-actions
Copy link
Contributor

PR approved by anyone and no changes requested.

@morrySnow morrySnow merged commit 77a3288 into apache:master Feb 13, 2023
YangShaw added a commit to YangShaw/doris that referenced this pull request Feb 17, 2023
924060929 added a commit to 924060929/incubator-doris that referenced this pull request Mar 13, 2025
fix distinct window compute wrong result, introduced by apache#14397
```sql
select distinct sum(value) over(partition by id)
from (
  select 100 value, 1 id
  union all
  select 100, 2
)a;

+----------------------------------+
| sum(value) over(partition by id) |
+----------------------------------+
| 100                              |
| 100                              |
+----------------------------------+
```
924060929 added a commit to 924060929/incubator-doris that referenced this pull request Mar 13, 2025
fix distinct window compute wrong result, introduced by apache#14397
```sql
select distinct sum(value) over(partition by id)
from (
  select 100 value, 1 id
  union all
  select 100, 2
)a;

+----------------------------------+
| sum(value) over(partition by id) |
+----------------------------------+
| 100                              |
| 100                              |
+----------------------------------+
```
morrySnow added a commit that referenced this pull request Mar 13, 2025
### What problem does this PR solve?

Related PR: #21727 #14397

Problem Summary:

1. forgot to copy isChecked flag in LogicalWindow when do deep copy
2. implement LogicalWindow To PhyscialWindow should not check isChecked
flag

This PR:
1. check deep copy for all plan node
2. remove check isChecked in LogicalWindow To PhyscialWindow
github-actions bot pushed a commit that referenced this pull request Mar 13, 2025
### What problem does this PR solve?

Related PR: #21727 #14397

Problem Summary:

1. forgot to copy isChecked flag in LogicalWindow when do deep copy
2. implement LogicalWindow To PhyscialWindow should not check isChecked
flag

This PR:
1. check deep copy for all plan node
2. remove check isChecked in LogicalWindow To PhyscialWindow
github-actions bot pushed a commit that referenced this pull request Mar 13, 2025
### What problem does this PR solve?

Related PR: #21727 #14397

Problem Summary:

1. forgot to copy isChecked flag in LogicalWindow when do deep copy
2. implement LogicalWindow To PhyscialWindow should not check isChecked
flag

This PR:
1. check deep copy for all plan node
2. remove check isChecked in LogicalWindow To PhyscialWindow
morrySnow added a commit that referenced this pull request Mar 13, 2025
Related PR: #21727 #14397

Problem Summary:

1. forgot to copy isChecked flag in LogicalWindow when do deep copy
2. implement LogicalWindow To PhyscialWindow should not check isChecked
flag

This PR:
1. check deep copy for all plan node
2. remove check isChecked in LogicalWindow To PhyscialWindow
morrySnow added a commit that referenced this pull request Mar 13, 2025
Related PR: #21727 #14397

Problem Summary:

1. forgot to copy isChecked flag in LogicalWindow when do deep copy
2. implement LogicalWindow To PhyscialWindow should not check isChecked
flag

This PR:
1. check deep copy for all plan node
2. remove check isChecked in LogicalWindow To PhyscialWindow
924060929 added a commit to 924060929/incubator-doris that referenced this pull request Mar 13, 2025
fix distinct window compute wrong result, introduced by apache#14397
```sql
select distinct sum(value) over(partition by id)
from (
  select 100 value, 1 id
  union all
  select 100, 2
)a;

+----------------------------------+
| sum(value) over(partition by id) |
+----------------------------------+
| 100                              |
| 100                              |
+----------------------------------+
```
924060929 added a commit to 924060929/incubator-doris that referenced this pull request Mar 13, 2025
fix distinct window compute wrong result, introduced by apache#14397
```sql
select distinct sum(value) over(partition by id)
from (
  select 100 value, 1 id
  union all
  select 100, 2
)a;

+----------------------------------+
| sum(value) over(partition by id) |
+----------------------------------+
| 100                              |
| 100                              |
+----------------------------------+
```
924060929 added a commit to 924060929/incubator-doris that referenced this pull request Mar 13, 2025
fix distinct window compute wrong result, introduced by apache#14397
```sql
select distinct sum(value) over(partition by id)
from (
  select 100 value, 1 id
  union all
  select 100, 2
)a;

+----------------------------------+
| sum(value) over(partition by id) |
+----------------------------------+
| 100                              |
| 100                              |
+----------------------------------+
```
koarz pushed a commit to koarz/doris that referenced this pull request Jun 4, 2025
### What problem does this PR solve?

Related PR: apache#21727 apache#14397

Problem Summary:

1. forgot to copy isChecked flag in LogicalWindow when do deep copy
2. implement LogicalWindow To PhyscialWindow should not check isChecked
flag

This PR:
1. check deep copy for all plan node
2. remove check isChecked in LogicalWindow To PhyscialWindow
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

approved Indicates a PR has been approved by one committer. area/nereids area/planner Issues or PRs related to the query planner kind/test reviewed

Projects

Status: Done

Development

Successfully merging this pull request may close these issues.

3 participants