branch-3.0: [test](mv) Insert into more data when first insert into to make sure using sync mv #43055

github-actions · 2024-11-01T06:10:24Z

PR Body: ## Proposed changes

Root Cause Analysis:
Currently, the row-count statistics reported by BE (Backend) nodes take priority over those collected by ANALYZE statements. On the first INSERT INTO operation, the system waits for row-count reports from all tablets before updating the table statistics.
Subsequent INSERT INTO operations cannot obtain the status of all tablets, so the system keeps using the statistics from the first INSERT INTO operation. The estimated cost of the query plan on the original table is therefore too low, and the optimizer selects that plan instead of the one that uses the materialized view.

Conclusion:
The test case should insert a larger dataset in the first INSERT INTO operation, which makes it more likely that the materialized view is chosen. The cost estimate then better reflects the actual data size and distribution, leading to more accurate plan selection.
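
To illustrate the reasoning above, here is a minimal toy model in Python. It is not Doris code: the function names, the tablet count, the "one cost unit per estimated row" rule, and the fixed `MV_COST` are all assumptions made up for this sketch. It only shows how a row count frozen at the first insert can tip a cost comparison toward the original table, and how a larger first insert avoids that.

```python
# Toy model only: none of these names or numbers come from Doris.

TABLETS = 3    # hypothetical number of tablets in the table
MV_COST = 10   # hypothetical fixed cost of the plan that uses the sync MV


def visible_row_count(previous, tablet_reports, tablet_count):
    """Return the row count the planner sees.

    Assumed rule: the estimate is replaced only when every tablet has
    reported; otherwise the previous estimate is kept.
    """
    if len(tablet_reports) < tablet_count:
        return previous
    return sum(tablet_reports.values())


def pick_plan(base_rows_estimate, mv_cost):
    """Assumed cost model: scanning the base table costs one unit per row."""
    base_cost = base_rows_estimate
    return "base_table" if base_cost <= mv_cost else "sync_mv"


def simulate(first_insert_rows_per_tablet):
    # First insert: all tablets report, so the estimate is updated.
    first_reports = {t: first_insert_rows_per_tablet for t in range(TABLETS)}
    estimate = visible_row_count(0, first_reports, TABLETS)

    # Later insert: only two of three tablets report, so the estimate
    # stays at the value produced by the first insert.
    later_reports = {0: 1000, 1: 1000}
    estimate = visible_row_count(estimate, later_reports, TABLETS)

    return estimate, pick_plan(estimate, MV_COST)


small_first_insert = simulate(1)
large_first_insert = simulate(100)
```

In this sketch, `simulate(1)` leaves the planner with an estimate of 3 rows and picks the base table, while `simulate(100)` leaves it with 300 rows and picks the sync MV, which mirrors the change made to the test case.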

Cherry-picked from #43010


github-actions · 2024-11-01T06:10:25Z

run buildall

doris-robot · 2024-11-01T06:10:28Z

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR

Since 2024-03-18, the Document has been moved to doris-website.
See Doris Document.

dataroaring · 2024-11-07T02:55:32Z

run buildall

dataroaring closed this Nov 7, 2024

dataroaring reopened this Nov 7, 2024

dataroaring merged commit 804c129 into branch-3.0 Nov 14, 2024

dataroaring added the dev/3.0.3-merged label Nov 14, 2024

dataroaring deleted the auto-pick-43010-branch-3.0 branch December 27, 2024 07:14
