@seawinde
Contributor

What problem does this PR solve?

If the materialized view (MV) is defined as follows:

        CREATE MATERIALIZED VIEW mv_11
        BUILD IMMEDIATE REFRESH COMPLETE ON MANUAL 
        DISTRIBUTED BY RANDOM BUCKETS 2 
        PROPERTIES ('replication_num' = '1')  
        AS
            select o_orderstatus, o_orderdate, o_orderpriority,
            sum(o_totalprice) as sum_total,
            max(o_totalprice) as max_total,
            min(o_totalprice) as min_total,
            count(*) as count_all,
            bitmap_union(to_bitmap(case when o_shippriority > 1 and o_orderkey IN (1, 3) then o_custkey else null end)) as bitmap_union_basic
            from orders
            where o_custkey > 1
            group by
            o_orderstatus, o_orderdate, o_orderpriority;

The MV contains the filter where o_custkey > 1. The following query has the same filter and should be rewritten successfully, but the rewrite fails because the filter
o_custkey > 1 is lost during predicate comparison and cannot be compensated. This PR fixes that.

            select o_orderstatus, o_orderpriority,
            grouping_id(o_orderstatus, o_orderpriority),
            grouping_id(o_orderstatus),
            grouping(o_orderstatus),
            sum(o_totalprice),
            max(o_totalprice),
            min(o_totalprice),
            count(*),
            count(distinct case when o_shippriority > 1 and o_orderkey IN (1, 3) then o_custkey else null end)
            from orders
            where o_custkey > 1
            group by
            ROLLUP (o_orderstatus, o_orderpriority);
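
To make the failure mode concrete, here is a minimal sketch of predicate compensation modeled as set containment. It is not Doris code: the class `FilterCompensationSketch`, the method `compensate`, and the use of plain strings for predicates are all assumptions made for illustration, and exact string matching stands in for whatever expression comparison the real rewriter performs.

```java
import java.util.LinkedHashSet;
import java.util.Optional;
import java.util.Set;

/** Toy model of predicate compensation; hypothetical, not Doris code. */
public class FilterCompensationSketch {

    /**
     * Returns the predicates that would still have to be applied on top of the MV,
     * or Optional.empty() when the MV filters on something the query does not.
     */
    public static Optional<Set<String>> compensate(Set<String> queryPredicates,
                                                   Set<String> mvPredicates) {
        // Every MV predicate must also appear in the query; otherwise the MV
        // may have dropped rows that the query needs.
        if (!queryPredicates.containsAll(mvPredicates)) {
            return Optional.empty();
        }
        // Whatever the query filters on beyond the MV is the residual to re-apply.
        Set<String> residual = new LinkedHashSet<>(queryPredicates);
        residual.removeAll(mvPredicates);
        return Optional.of(residual);
    }

    public static void main(String[] args) {
        Set<String> mv = Set.of("o_custkey > 1");
        // Query filter collected correctly: rewrite allowed, nothing left to compensate.
        System.out.println(compensate(Set.of("o_custkey > 1"), mv));
        // Query filter lost during collection: the MV looks too restrictive, rewrite rejected.
        System.out.println(compensate(Set.of(), mv));
    }
}
```

Under this simplified model, a query whose filter is never collected looks like an unfiltered query, so an MV that filters on o_custkey > 1 can no longer be shown to contain the rows the query needs.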

Issue Number: close #xxx

Related PR: #36056

Problem Summary:

Release note

None

Check List (For Author)

  • Test

    • Regression test
    • Unit Test
    • Manual test (add detailed scripts or steps below)
    • No need to test or manual test. Explain why:
      • This is a refactor/code format and no logic has been changed.
      • Previous test can cover this change.
      • No code files have been changed.
      • Other reason
  • Behavior changed:

    • No.
    • Yes.
  • Does this need documentation?

    • No.
    • Yes.

Check List (For Reviewer who merge this PR)

  • Confirm the release note
  • Confirm test cases
  • Confirm document
  • Add branch pick label

@Thearas
Contributor

Thearas commented Oct 27, 2025

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

@seawinde
Contributor Author

run buildall

@doris-robot

ClickBench: Total hot run time: 28.26 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit f274dd17c7eaf8086c90e3c3fae207967ab6f621, data reload: false

query	cold run (s)	hot run 1 (s)	hot run 2 (s)
query1	0.06	0.05	0.06
query2	0.11	0.05	0.06
query3	0.26	0.08	0.08
query4	1.62	0.13	0.13
query5	0.30	0.26	0.26
query6	1.19	0.66	0.66
query7	0.04	0.03	0.03
query8	0.07	0.05	0.05
query9	0.62	0.53	0.52
query10	0.60	0.58	0.58
query11	0.17	0.15	0.12
query12	0.16	0.13	0.13
query13	0.63	0.61	0.60
query14	1.03	1.03	1.03
query15	0.87	0.85	0.85
query16	0.40	0.40	0.40
query17	1.05	1.06	1.05
query18	0.22	0.20	0.21
query19	1.95	1.85	1.89
query20	0.03	0.02	0.02
query21	15.45	0.17	0.13
query22	5.10	0.08	0.05
query23	15.66	0.28	0.11
query24	2.92	0.78	1.21
query25	0.08	0.07	0.07
query26	0.14	0.13	0.15
query27	0.07	0.06	0.06
query28	4.95	1.17	0.95
query29	12.56	3.97	3.32
query30	0.33	0.18	0.13
query31	2.82	0.61	0.40
query32	3.24	0.56	0.48
query33	3.06	3.03	3.02
query34	15.77	5.12	4.51
query35	4.56	4.58	4.59
query36	0.70	0.51	0.50
query37	0.11	0.07	0.07
query38	0.08	0.05	0.04
query39	0.04	0.03	0.03
query40	0.18	0.15	0.14
query41	0.09	0.04	0.04
query42	0.05	0.04	0.03
query43	0.05	0.04	0.03
Total cold run time: 99.39 s
Total hot run time: 28.26 s

@seawinde
Contributor Author

run buildall

@doris-robot

TPC-DS: Total hot run time: 189302 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 44ef9697cd046f439744a497db7e48227b7ec8f8, data reload: false

query	cold run (ms)	hot run 1 (ms)	hot run 2 (ms)	best hot run (ms)
query1	1067	437	410	410
query2	6568	1739	1706	1706
query3	6762	227	224	224
query4	26247	23771	23032	23032
query5	4466	673	485	485
query6	347	249	234	234
query7	4651	497	306	306
query8	313	273	268	268
query9	8739	2670	2597	2597
query10	512	366	284	284
query11	15808	15056	14844	14844
query12	189	119	117	117
query13	1693	595	441	441
query14	12843	9405	9362	9362
query15	257	192	183	183
query16	7812	685	526	526
query17	1656	793	657	657
query18	2762	432	350	350
query19	306	233	182	182
query20	146	132	123	123
query21	212	142	117	117
query22	4506	4768	4454	4454
query23	35052	34371	33765	33765
query24	8093	2541	2501	2501
query25	604	524	478	478
query26	795	297	163	163
query27	2934	524	382	382
query28	4286	2230	2210	2210
query29	768	654	494	494
query30	305	231	214	214
query31	949	841	795	795
query32	81	76	83	76
query33	597	382	333	333
query34	806	849	513	513
query35	830	827	728	728
query36	935	1021	893	893
query37	138	125	90	90
query38	3490	3571	3487	3487
query39	1485	1404	1423	1404
query40	219	135	119	119
query41	60	58	58	58
query42	128	108	105	105
query43	491	504	467	467
query44	1218	734	731	731
query45	179	182	175	175
query46	891	990	649	649
query47	1737	1802	1694	1694
query48	400	420	331	331
query49	724	506	405	405
query50	653	690	409	409
query51	3868	3975	3830	3830
query52	118	112	96	96
query53	240	283	196	196
query54	605	598	525	525
query55	88	86	93	86
query56	340	329	312	312
query57	1148	1187	1110	1110
query58	288	272	279	272
query59	2510	2657	2527	2527
query60	342	343	329	329
query61	156	160	153	153
query62	794	731	654	654
query63	243	197	190	190
query64	3535	1156	853	853
query65	4020	3952	3960	3952
query66	907	439	345	345
query67	15629	14953	15022	14953
query68	8391	952	589	589
query69	494	394	301	301
query70	1340	1182	1252	1182
query71	514	350	315	315
query72	6001	4970	4961	4961
query73	722	626	362	362
query74	9201	9239	8736	8736
query75	4115	3390	2782	2782
query76	3731	1174	740	740
query77	863	409	306	306
query78	9460	9655	8858	8858
query79	1826	860	594	594
query80	715	575	518	518
query81	500	260	225	225
query82	440	181	139	139
query83	316	270	252	252
query84	323	111	92	92
query85	899	495	440	440
query86	338	325	323	323
query87	3692	3719	3707	3707
query88	2878	2260	2205	2205
query89	397	329	295	295
query90	2047	219	221	219
query91	167	169	141	141
query92	87	67	69	67
query93	1174	1004	644	644
query94	698	429	341	341
query95	427	321	311	311
query96	496	601	287	287
query97	2936	2922	2875	2875
query98	228	212	209	209
query99	1432	1388	1332	1332
Total cold run time: 278389 ms
Total hot run time: 189302 ms

@doris-robot

ClickBench: Total hot run time: 27.68 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit 44ef9697cd046f439744a497db7e48227b7ec8f8, data reload: false

query	cold run (s)	hot run 1 (s)	hot run 2 (s)
query1	0.06	0.05	0.05
query2	0.09	0.05	0.05
query3	0.26	0.09	0.08
query4	1.61	0.11	0.12
query5	0.27	0.26	0.25
query6	1.20	0.67	0.64
query7	0.03	0.03	0.02
query8	0.06	0.05	0.04
query9	0.62	0.53	0.53
query10	0.58	0.57	0.57
query11	0.17	0.11	0.11
query12	0.15	0.16	0.12
query13	0.62	0.61	0.61
query14	1.00	1.02	0.99
query15	0.86	0.84	0.87
query16	0.40	0.39	0.40
query17	1.03	1.01	1.01
query18	0.23	0.20	0.20
query19	1.89	1.72	1.79
query20	0.02	0.01	0.01
query21	15.44	0.19	0.12
query22	5.12	0.08	0.05
query23	15.69	0.26	0.11
query24	2.33	0.63	0.85
query25	0.09	0.06	0.05
query26	0.14	0.13	0.13
query27	0.06	0.06	0.06
query28	5.18	1.14	0.93
query29	12.58	4.08	3.33
query30	0.28	0.14	0.11
query31	2.81	0.58	0.38
query32	3.24	0.54	0.47
query33	3.00	3.04	3.04
query34	15.87	5.17	4.54
query35	4.51	4.53	4.58
query36	0.70	0.50	0.49
query37	0.09	0.07	0.07
query38	0.06	0.04	0.04
query39	0.03	0.03	0.03
query40	0.17	0.14	0.15
query41	0.09	0.04	0.03
query42	0.04	0.03	0.03
query43	0.05	0.04	0.03
Total cold run time: 98.72 s
Total hot run time: 27.68 s

@hello-stephen
Contributor

FE UT Coverage Report

Increment line coverage 100% (0/0) 🎉
Increment coverage report
Complete coverage report

Contributor

@zddr zddr left a comment

LGTM

@github-actions
Contributor

PR approved by anyone and no changes requested.

private static class PredicateCollector extends DefaultPlanVisitor<Void, Set<Expression>> {
    @Override
    public Void visit(Plan plan, Set<Expression> predicates) {
        // Just collect the filter in top plan, if meet other node except project and filter, return
Contributor


nit: update this comment
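
For readers without the diff at hand, the code comment under review describes a traversal that stops at the first node that is neither a project nor a filter. The sketch below shows, on a toy plan tree, how such a top-only walk would miss a filter sitting beneath an aggregate. It does not reproduce the actual change in this PR: `PredicateCollectionSketch`, `Node`, `Kind`, `collectTopOnly`, `collectAll`, and the plan shape are all hypothetical names and assumptions, not Doris classes.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Toy plan tree; node kinds and names are hypothetical, not Doris classes. */
public class PredicateCollectionSketch {

    public enum Kind { PROJECT, FILTER, AGGREGATE, SCAN }

    public static final class Node {
        final Kind kind;
        final String predicate; // only set for FILTER nodes
        final List<Node> children = new ArrayList<>();

        Node(Kind kind, String predicate, Node... children) {
            this.kind = kind;
            this.predicate = predicate;
            for (Node child : children) {
                this.children.add(child);
            }
        }
    }

    /** Collects filters only while the plan is a chain of PROJECT/FILTER nodes from the top. */
    public static Set<String> collectTopOnly(Node plan) {
        Set<String> out = new LinkedHashSet<>();
        Node current = plan;
        while (current != null && (current.kind == Kind.PROJECT || current.kind == Kind.FILTER)) {
            if (current.kind == Kind.FILTER) {
                out.add(current.predicate);
            }
            current = current.children.isEmpty() ? null : current.children.get(0);
        }
        return out;
    }

    /** Collects every filter in the tree, including one directly above the scan. */
    public static Set<String> collectAll(Node plan) {
        Set<String> out = new LinkedHashSet<>();
        collectAll(plan, out);
        return out;
    }

    private static void collectAll(Node plan, Set<String> out) {
        if (plan.kind == Kind.FILTER) {
            out.add(plan.predicate);
        }
        for (Node child : plan.children) {
            collectAll(child, out);
        }
    }

    /** Project -> Aggregate -> Filter(o_custkey > 1) -> Scan: an assumed shape for the ROLLUP query. */
    public static Node samplePlan() {
        Node scan = new Node(Kind.SCAN, null);
        Node filter = new Node(Kind.FILTER, "o_custkey > 1", scan);
        Node aggregate = new Node(Kind.AGGREGATE, null, filter);
        return new Node(Kind.PROJECT, null, aggregate);
    }

    public static void main(String[] args) {
        System.out.println(collectTopOnly(samplePlan())); // walk stops at the aggregate
        System.out.println(collectAll(samplePlan()));     // reaches the filter above the scan
    }
}
```

In this toy model the top-only walk returns no predicates for the sample plan, while the full walk finds o_custkey > 1.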

github-actions bot added the approved label (Indicates a PR has been approved by one committer.) on Nov 3, 2025
@github-actions
Contributor

github-actions bot commented Nov 3, 2025

PR approved by at least one committer and no changes requested.

@morrySnow morrySnow merged commit 4cef3b1 into apache:master Nov 3, 2025
28 of 29 checks passed
github-actions bot pushed a commit that referenced this pull request Nov 3, 2025
…r above scan (#57343)

### What problem does this PR solve?

Related PR: #36056 

Problem Summary:

If the materialized view (MV) is defined as follows:

        CREATE MATERIALIZED VIEW mv_11
        BUILD IMMEDIATE REFRESH COMPLETE ON MANUAL 
        DISTRIBUTED BY RANDOM BUCKETS 2 
        PROPERTIES ('replication_num' = '1')  
        AS
            select o_orderstatus, o_orderdate, o_orderpriority,
            sum(o_totalprice) as sum_total,
            max(o_totalprice) as max_total,
            min(o_totalprice) as min_total,
            count(*) as count_all,
            bitmap_union(to_bitmap(case when o_shippriority > 1 and o_orderkey IN (1, 3) then o_custkey else null end)) as bitmap_union_basic
            from orders
            where o_custkey > 1
            group by
            o_orderstatus, o_orderdate, o_orderpriority;
 
The MV contains the filter `where o_custkey > 1`. The following query has
the same filter and should be rewritten successfully, but the rewrite fails
because the filter `o_custkey > 1` is lost during predicate comparison and
cannot be compensated. This PR fixes that.

            select o_orderstatus, o_orderpriority,
            grouping_id(o_orderstatus, o_orderpriority),
            grouping_id(o_orderstatus),
            grouping(o_orderstatus),
            sum(o_totalprice),
            max(o_totalprice),
            min(o_totalprice),
            count(*),
            count(distinct case when o_shippriority > 1 and o_orderkey IN (1, 3) then o_custkey else null end)
            from orders
            where o_custkey > 1
            group by
            ROLLUP (o_orderstatus, o_orderpriority);
github-actions bot pushed a commit that referenced this pull request Nov 3, 2025
…r above scan (#57343)

seawinde added a commit to seawinde/doris that referenced this pull request Nov 4, 2025
…r above scan (apache#57343)

yiguolei pushed a commit that referenced this pull request Nov 5, 2025
…ts and filter above scan #57343 (#57617)

Cherry-picked from #57343

Co-authored-by: seawinde <wusi@selectdb.com>
morrySnow pushed a commit that referenced this pull request Nov 6, 2025
@yiguolei yiguolei mentioned this pull request Dec 2, 2025

Labels

approved (Indicates a PR has been approved by one committer.), dev/3.1.3-merged, dev/4.0.2-merged, reviewed


7 participants