@seawinde
Contributor

What problem does this PR solve?

Suppose a materialized view is defined as follows:

    CREATE MATERIALIZED VIEW mv_1_name
    BUILD IMMEDIATE REFRESH AUTO ON MANUAL
    PARTITION BY (l_shipdate)
    DISTRIBUTED BY RANDOM BUCKETS 2
    PROPERTIES (
        'replication_num' = '1',
        'grace_period' = '31536000'
    )
    AS
    SELECT l_shipdate, o_orderdate, l_partkey, l_suppkey,
           sum(o_totalprice) AS sum_total
    FROM lineitem
    LEFT JOIN orders
        ON lineitem.l_orderkey = orders.o_orderkey AND l_shipdate = o_orderdate
    GROUP BY
        l_shipdate,
        o_orderdate,
        l_partkey,
        l_suppkey;

After mv_1_name is refreshed, we insert new data into lineitem:

    insert into lineitem values
    (1, 2, 3, 4, 5.5, 6.5, 7.5, 8.5, 'o', 'k', '2023-10-17', '2023-10-17', '2023-10-17', 'a', 'b', 'yyyyyyyyy');

If another materialized view, mv_2_name, has the same definition as mv_1_name, then refreshing mv_2_name should pick up the newly inserted data. Currently it does not; this PR fixes that.
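
The scenario above can be sketched as the following reproduction. This is a minimal sketch, not the author's test: it assumes mv_2_name copies the definition of mv_1_name verbatim and is created after the insert into lineitem. The explicit refresh and the final check query are illustrative additions that are not part of the original description.

```sql
-- Assumption: mv_2_name uses exactly the same definition as mv_1_name.
-- BUILD IMMEDIATE starts a refresh as soon as the view is created.
CREATE MATERIALIZED VIEW mv_2_name
BUILD IMMEDIATE REFRESH AUTO ON MANUAL
PARTITION BY (l_shipdate)
DISTRIBUTED BY RANDOM BUCKETS 2
PROPERTIES (
    'replication_num' = '1',
    'grace_period' = '31536000'
)
AS
SELECT l_shipdate, o_orderdate, l_partkey, l_suppkey,
       sum(o_totalprice) AS sum_total
FROM lineitem
LEFT JOIN orders
    ON lineitem.l_orderkey = orders.o_orderkey AND l_shipdate = o_orderdate
GROUP BY
    l_shipdate,
    o_orderdate,
    l_partkey,
    l_suppkey;

-- Optional manual trigger; wait for the refresh task to finish before checking.
REFRESH MATERIALIZED VIEW mv_2_name COMPLETE;

-- Illustrative check: with the fix, the row inserted for 2023-10-17
-- is expected to be visible in mv_2_name.
SELECT * FROM mv_2_name WHERE l_shipdate = '2023-10-17';
```

If the check returns no rows after the refresh has completed, that matches the incorrect behavior this PR describes.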

Issue Number: close #xxx

Related PR: #38115

Problem Summary:

Release note

Fix incorrect UNION ALL rewrite during materialized view refresh when grace_period is large.

Check List (For Author)

  • Test

    • Regression test
    • Unit Test
    • Manual test (add detailed scripts or steps below)
    • No need to test or manual test. Explain why:
      • This is a refactor/code format and no logic has been changed.
      • Previous test can cover this change.
      • No code files have been changed.
      • Other reason
  • Behavior changed:

    • No.
    • Yes.
  • Does this need documentation?

    • No.
    • Yes.

Check List (For Reviewer who merge this PR)

  • Confirm the release note
  • Confirm test cases
  • Confirm document
  • Add branch pick label

@Thearas
Contributor

Thearas commented May 12, 2025

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

@seawinde
Contributor Author

run buildall

@morrySnow added the usercase (important user case), dev/2.1.x, and dev/3.0.x labels on May 12, 2025
@morrySnow requested a review from Copilot on May 12, 2025 09:48

Copilot AI left a comment

Pull Request Overview

This PR fixes an issue with union all rewriting during materialized view refresh when the grace_period is large. Key changes include updating test outputs for union rewriting, improving logging to use sqlHash instead of queryId, caching valid MV partitions in the StatementContext, and adding a check on the materialized view property use_for_rewrite in the MV relation manager.

Reviewed Changes

Copilot reviewed 7 out of 7 changed files in this pull request and generated no comments.

| File | Description |
| --- | --- |
| regression-test/data/nereids_rules_p0/mv/union_rewrite_grace_big/unioin_rewrite_grace_big.out | Added expected output for union rewriting tests. |
| fe/fe-core/src/test/java/org/apache/doris/nereids/mv/OptimizeGetAvailableMvsTest.java | Added tests for both partition prune and non-partition prune scenarios. |
| fe/fe-core/src/main/java/org/apache/doris/nereids/rules/exploration/mv/InitMaterializationContextHook.java | Updated log messages to use sqlHash instead of queryId. |
| fe/fe-core/src/main/java/org/apache/doris/nereids/rules/exploration/mv/AbstractMaterializedViewRule.java | Changed MV partition retrieval to use the cached map from StatementContext. |
| fe/fe-core/src/main/java/org/apache/doris/nereids/StatementContext.java | Added a new map to cache MV available rewrite partitions. |
| fe/fe-core/src/main/java/org/apache/doris/mtmv/MTMVRelationManager.java | Added a check for the use_for_rewrite property and updated partition validation accordingly. |

Comments suppressed due to low confidence (1)

regression-test/data/nereids_rules_p0/mv/union_rewrite_grace_big/unioin_rewrite_grace_big.out:1

  • The filename 'unioin_rewrite_grace_big.out' appears to be misspelled; consider renaming it to 'union_rewrite_grace_big.out' for clarity.
+-- This file is automatically generated. You should know what you did if you want to edit this

@doris-robot

TPC-H: Total hot run time: 34182 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 3333d8229496c7a532b8a7fcc3facdad4d22d213, data reload: false

------ Round 1 ----------------------------------
query	cold (ms)	hot run 1 (ms)	hot run 2 (ms)	best hot (ms)
q1	26347	5059	5025	5025
q2	2063	276	184	184
q3	10402	1250	703	703
q4	10222	990	510	510
q5	7550	2368	2387	2368
q6	182	167	132	132
q7	897	747	617	617
q8	9322	1297	1214	1214
q9	6821	5178	5103	5103
q10	6825	2342	1882	1882
q11	479	291	268	268
q12	336	350	207	207
q13	17761	3697	3048	3048
q14	230	231	220	220
q15	549	497	475	475
q16	422	431	381	381
q17	621	876	390	390
q18	7629	7201	7203	7201
q19	1218	952	542	542
q20	346	338	237	237
q21	4130	3353	2470	2470
q22	1102	1058	1005	1005
Total cold run time: 115454 ms
Total hot run time: 34182 ms

----- Round 2, with runtime_filter_mode=off -----
query	cold (ms)	hot run 1 (ms)	hot run 2 (ms)	best hot (ms)
q1	5122	5099	5051	5051
q2	249	326	234	234
q3	2179	2682	2213	2213
q4	1380	1775	1306	1306
q5	4588	4467	4385	4385
q6	233	165	125	125
q7	1987	1915	1744	1744
q8	2588	2538	2489	2489
q9	7173	7139	7160	7139
q10	3054	3222	2750	2750
q11	579	501	489	489
q12	662	780	621	621
q13	3485	3873	3336	3336
q14	282	285	274	274
q15	523	482	473	473
q16	447	502	460	460
q17	1161	1515	1376	1376
q18	7897	7574	7387	7387
q19	776	778	891	778
q20	1951	2108	1810	1810
q21	5013	4689	4428	4428
q22	1067	1064	996	996
Total cold run time: 52396 ms
Total hot run time: 49864 ms

@doris-robot

TPC-DS: Total hot run time: 187050 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 3333d8229496c7a532b8a7fcc3facdad4d22d213, data reload: false

query	cold (ms)	hot run 1 (ms)	hot run 2 (ms)	best hot (ms)
query1	1027	476	478	476
query2	6574	1840	1774	1774
query3	6750	223	237	223
query4	26135	23298	23221	23221
query5	4356	636	458	458
query6	312	219	199	199
query7	4622	486	292	292
query8	319	250	244	244
query9	8611	2664	2678	2664
query10	443	336	299	299
query11	15363	15163	14804	14804
query12	168	112	109	109
query13	1653	513	404	404
query14	8720	6380	6183	6183
query15	220	188	170	170
query16	7137	640	522	522
query17	968	705	553	553
query18	1958	413	318	318
query19	184	182	156	156
query20	124	112	117	112
query21	210	119	107	107
query22	4175	4242	4153	4153
query23	33997	33220	33185	33185
query24	8255	2413	2422	2413
query25	542	461	393	393
query26	1229	267	154	154
query27	2740	485	333	333
query28	4347	2129	2114	2114
query29	766	542	429	429
query30	294	216	189	189
query31	909	876	761	761
query32	73	67	69	67
query33	556	385	319	319
query34	804	851	525	525
query35	790	812	755	755
query36	955	999	907	907
query37	115	98	79	79
query38	4226	4196	4085	4085
query39	1514	1410	1437	1410
query40	218	126	113	113
query41	62	59	57	57
query42	122	113	110	110
query43	523	506	474	474
query44	1316	825	821	821
query45	184	175	172	172
query46	854	1022	658	658
query47	1781	1812	1735	1735
query48	388	437	313	313
query49	798	524	457	457
query50	668	680	402	402
query51	4145	4197	4061	4061
query52	113	107	104	104
query53	225	252	188	188
query54	598	590	524	524
query55	90	88	91	88
query56	374	311	290	290
query57	1151	1148	1074	1074
query58	270	257	255	255
query59	2653	2581	2623	2581
query60	318	319	307	307
query61	129	125	124	124
query62	814	729	659	659
query63	225	193	197	193
query64	4372	1004	692	692
query65	4343	4286	4260	4260
query66	1145	412	316	316
query67	15903	15679	15531	15531
query68	8070	870	516	516
query69	480	314	268	268
query70	1161	1161	1106	1106
query71	446	318	297	297
query72	5834	4780	5051	4780
query73	765	752	357	357
query74	8812	9250	8801	8801
query75	3781	3219	2843	2843
query76	3621	1191	763	763
query77	795	391	281	281
query78	10269	10158	9334	9334
query79	3910	810	557	557
query80	637	504	456	456
query81	493	254	225	225
query82	581	127	94	94
query83	285	251	239	239
query84	301	104	82	82
query85	794	352	311	311
query86	404	311	282	282
query87	4442	4443	4312	4312
query88	3348	2288	2273	2273
query89	444	318	295	295
query90	1829	204	208	204
query91	142	143	113	113
query92	73	63	58	58
query93	3128	946	574	574
query94	670	412	300	300
query95	381	297	286	286
query96	481	577	280	280
query97	3199	3187	3137	3137
query98	231	217	209	209
query99	1462	1396	1299	1299
Total cold run time: 277192 ms
Total hot run time: 187050 ms

@doris-robot

ClickBench: Total hot run time: 29.02 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit 3333d8229496c7a532b8a7fcc3facdad4d22d213, data reload: false

query	cold (s)	hot run 1 (s)	hot run 2 (s)
query1	0.03	0.04	0.03
query2	0.12	0.10	0.10
query3	0.25	0.19	0.19
query4	1.59	0.19	0.11
query5	0.56	0.57	0.56
query6	1.17	0.73	0.71
query7	0.03	0.02	0.02
query8	0.04	0.04	0.04
query9	0.57	0.51	0.50
query10	0.58	0.57	0.56
query11	0.16	0.11	0.11
query12	0.15	0.11	0.12
query13	0.61	0.60	0.60
query14	0.79	0.80	0.80
query15	0.88	0.86	0.85
query16	0.39	0.38	0.38
query17	1.06	1.03	1.02
query18	0.23	0.23	0.22
query19	1.93	1.82	1.81
query20	0.02	0.01	0.01
query21	15.40	0.94	0.54
query22	0.76	1.18	0.64
query23	14.94	1.37	0.62
query24	7.26	1.33	0.60
query25	0.52	0.19	0.14
query26	0.67	0.16	0.14
query27	0.05	0.05	0.05
query28	9.72	0.92	0.43
query29	12.56	4.00	3.30
query30	0.26	0.10	0.07
query31	2.81	0.60	0.38
query32	3.24	0.57	0.49
query33	3.07	3.04	3.06
query34	15.94	5.11	4.52
query35	4.56	4.62	4.53
query36	0.68	0.50	0.48
query37	0.08	0.06	0.06
query38	0.04	0.04	0.04
query39	0.03	0.02	0.02
query40	0.17	0.14	0.13
query41	0.07	0.02	0.03
query42	0.04	0.02	0.03
query43	0.04	0.03	0.03
Total cold run time: 104.07 s
Total hot run time: 29.02 s
