@zhengshiJ zhengshiJ commented Apr 11, 2022

This PR changes how statistics are derived.
It makes two modifications:

  1. Statistics derivation is moved out of the node classes.
  2. Cardinality is no longer computed the current way; it is obtained from statistics instead.
    For details on the approach, see: https://shimo.im/docs/473QyOzWBjHMlZ3w
    The feature is implemented across multiple PRs; this PR implements statistics derivation for scan_node.

Proposed changes

Issue Number: close #9241

Problem Summary:

Statistics derivation is currently coupled to the node classes, cardinality is calculated by a mock method, and the NDV (number of distinct values) is not obtained correctly.
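
To illustrate what obtaining cardinality from statistics can look like, here is a minimal sketch. It is not the code in this PR. It assumes the textbook estimate that an equality predicate on a column keeps roughly 1/NDV of the rows, and the names `ScanCardinalitySketch` and `estimateEqualityCardinality` are hypothetical.

```java
public class ScanCardinalitySketch {
    /**
     * Estimates how many rows survive a predicate of the form `col = constant`.
     * Assumption: values are uniformly distributed, so selectivity is 1 / NDV.
     *
     * @param tableRowCount row count taken from table statistics
     * @param ndv           number of distinct values of the column, or <= 0 if unknown
     */
    public static long estimateEqualityCardinality(long tableRowCount, long ndv) {
        if (tableRowCount <= 0) {
            return 0L;
        }
        if (ndv <= 0) {
            // Unknown NDV: this sketch applies no reduction (an assumption).
            return tableRowCount;
        }
        double selectivity = 1.0 / ndv;
        // Keep at least one row so that later estimates do not collapse to zero.
        return Math.max(1L, Math.round(tableRowCount * selectivity));
    }

    public static void main(String[] args) {
        System.out.println(estimateEqualityCardinality(1_000_000L, 200L));
    }
}
```

With a correct NDV the estimate follows the data; with a mocked or wrong NDV the same formula can be off by orders of magnitude, which is the problem described above.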

Checklist(Required)

  1. Does it affect the original behavior: (No)
  2. Has unit tests been added: (No)
  3. Has document been added or modified: (No Need)
  4. Does it need to update dependencies: (No)
  5. Are there any changes that cannot be rolled back: (No)

Further comments

If this is a relatively large or complex change, kick off the discussion at dev@doris.apache.org by explaining why you chose the solution you did and what alternatives you considered, etc...

@github-actions github-actions bot added the area/planner Issues or PRs related to the query planner label Apr 11, 2022
@EmmyMiao87 EmmyMiao87 added the area/statistics Issues or PRS related to statistics label Apr 13, 2022
@EmmyMiao87 EmmyMiao87 self-assigned this Apr 13, 2022
@zhengshiJ zhengshiJ force-pushed the master branch 2 times, most recently from 66972cf to 9c2ba3a Compare April 20, 2022 06:38
@github-actions github-actions bot added the area/load Issues or PRs related to all kinds of load label Apr 20, 2022

// Currently it simply adds the number of rows of children
protected long deriveRowCount() {
applyConjunctsSelectivity();

Suggested change
- applyConjunctsSelectivity();
+ rowcount = children max(rowcount)
+ applyConjunctsSelectivity();
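
The suggestion above can be sketched as follows: start from the largest child row count, then apply the selectivity of the node's conjuncts. This is a hypothetical, self-contained illustration. `RowCountSketch`, its parameters, and modelling the conjuncts as a single selectivity factor are assumptions, not the Doris planner API.

```java
import java.util.List;

public class RowCountSketch {
    /**
     * Derives a row count as max(child row counts) * conjunct selectivity.
     *
     * @param childRowCounts      row counts of the child nodes (hypothetical input)
     * @param conjunctSelectivity combined selectivity of the conjuncts, clamped to [0, 1]
     */
    public static long deriveRowCount(List<Long> childRowCounts, double conjunctSelectivity) {
        // Step 1: rowcount = max row count over all children (0 if there are none).
        long rowCount = 0L;
        for (long childRowCount : childRowCounts) {
            rowCount = Math.max(rowCount, childRowCount);
        }
        // Step 2: apply the conjuncts' selectivity to that row count.
        double selectivity = Math.max(0.0, Math.min(1.0, conjunctSelectivity));
        return Math.round(rowCount * selectivity);
    }

    public static void main(String[] args) {
        System.out.println(deriveRowCount(List.of(100L, 40L), 0.5));
    }
}
```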

EmmyMiao87 previously approved these changes Apr 25, 2022

@EmmyMiao87 EmmyMiao87 left a comment

LGTM

@github-actions github-actions bot added the approved Indicates a PR has been approved by one committer. label Apr 25, 2022
@github-actions

PR approved by at least one committer and no changes requested.

@github-actions

PR approved by anyone and no changes requested.

@EmmyMiao87

The failing FE unit test cases are:

  1. AlterViewStmtTest.testNormal
  2. TableQueryPlanActionTest.testQueryPlanAction

These cases will be fixed in PR #9158.

The BE unit test problem is unrelated to this PR.

@github-actions github-actions bot removed the approved Indicates a PR has been approved by one committer. label Apr 26, 2022

@EmmyMiao87 EmmyMiao87 left a comment

LGTM

@github-actions github-actions bot added the approved Indicates a PR has been approved by one committer. label Apr 26, 2022
@github-actions

PR approved by at least one committer and no changes requested.

@EmmyMiao87 EmmyMiao87 left a comment

The code baseline is incorrect.

@github-actions github-actions bot removed the approved Indicates a PR has been approved by one committer. label Apr 26, 2022
@github-actions github-actions bot added area/routine load area/spark-load Issues or PRs related to the spark load area/sql/function Issues or PRs related to the SQL functions area/vectorization kind/docs Categorizes issue or PR as related to documentation. kind/test labels Apr 27, 2022

@EmmyMiao87 EmmyMiao87 left a comment

LGTM

@github-actions github-actions bot added the approved Indicates a PR has been approved by one committer. label Apr 28, 2022
@github-actions

PR approved by at least one committer and no changes requested.

@yiguolei yiguolei merged commit 9ef09b8 into apache:master Apr 29, 2022
@morningman morningman added the dev/backlog waiting to be merged in future dev branch label Apr 29, 2022
zhengshiJ added a commit to zhengshiJ/incubator-doris that referenced this pull request May 9, 2022
apache#8947)

* [feature](statistics) Statistics derivation.Step 1:ScanNode implementation

Co-authored-by: jianghaochen <jianghaochen@meituan.com>
Kikyou1997 pushed a commit to Kikyou1997/incubator-doris that referenced this pull request May 9, 2022
apache#8947)

* [feature](statistics) Statistics derivation.Step 1:ScanNode implementation

Co-authored-by: jianghaochen <jianghaochen@meituan.com>
starocean999 pushed a commit to starocean999/incubator-doris that referenced this pull request May 19, 2022
apache#8947)

* [feature](statistics) Statistics derivation.Step 1:ScanNode implementation

Co-authored-by: jianghaochen <jianghaochen@meituan.com>
englefly pushed a commit to englefly/incubator-doris that referenced this pull request May 23, 2022
apache#8947)

* [feature](statistics) Statistics derivation.Step 1:ScanNode implementation

Co-authored-by: jianghaochen <jianghaochen@meituan.com>
Development

Successfully merging this pull request may close these issues.

[Feature] Statistics derivation

5 participants