@seawinde
Contributor

Proposed changes

This fixes a problem introduced by #35436.

`MaterializationContext#getPlanStatistics` returns the statistics of the materialization context's original plan, so the expressions in `expressionToColumnStats` are slots of the original plan.
We do want the statistics of the original plan, but the expressions in `expressionToColumnStats` should be based on the MV scan plan instead.
This PR therefore adds the method `MaterializationContext#normalizeStatisticsColumnExpression`, which should be called after the PlanStatistics is generated in `MaterializationContext`.
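
To illustrate the idea, here is a minimal, self-contained sketch of re-keying column statistics from original-plan slots to MV-scan slots. It is not the Doris implementation: the class `StatsSlotRemapSketch`, the method `normalizeColumnExpressions`, the use of plain strings for slots and a single number for column statistics, and the choice to drop entries that have no MV-scan counterpart are all assumptions made for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch: slots are modelled as strings and column statistics as a
// single double (e.g. a distinct-value count). Doris uses richer types.
public class StatsSlotRemapSketch {

    /**
     * Re-keys statistics from original-plan slots to MV-scan slots.
     * Assumption: entries without a mapped MV-scan slot are dropped.
     */
    public static Map<String, Double> normalizeColumnExpressions(
            Map<String, Double> originalSlotToStats,
            Map<String, String> originalSlotToMvScanSlot) {
        Map<String, Double> normalized = new LinkedHashMap<>();
        for (Map.Entry<String, Double> entry : originalSlotToStats.entrySet()) {
            String mvScanSlot = originalSlotToMvScanSlot.get(entry.getKey());
            if (mvScanSlot != null) {
                // Same statistics value, but keyed by the slot the MV scan exposes.
                normalized.put(mvScanSlot, entry.getValue());
            }
        }
        return normalized;
    }

    public static void main(String[] args) {
        // Statistics computed on the MV's original (definition) plan.
        Map<String, Double> originalStats = new LinkedHashMap<>();
        originalStats.put("orders.o_orderkey#1", 1500.0);
        originalStats.put("orders.o_totalprice#2", 42.0);

        // Mapping from original-plan slots to the slots of the MV scan plan.
        Map<String, String> slotMapping = new LinkedHashMap<>();
        slotMapping.put("orders.o_orderkey#1", "mv.o_orderkey#10");
        slotMapping.put("orders.o_totalprice#2", "mv.o_totalprice#11");

        System.out.println(normalizeColumnExpressions(originalStats, slotMapping));
    }
}
```

The point of the sketch is that the statistics values stay those of the original plan, while the lookup keys become the slots that a rewritten query would actually reference through the MV scan.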

@doris-robot

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR

Since 2024-03-18, the Document has been moved to doris-website.
See Doris Document.

@seawinde seawinde changed the title from "[fix](mtmv)Materialization statistics slot mapping" to "[fix](mtmv) Mapping materialization statistics's expressionToColumnStats to mv scan plan based" on May 31, 2024
@seawinde
Contributor Author

run buildall

@github-actions
Contributor

github-actions bot commented Jun 3, 2024

PR approved by at least one committer and no changes requested.

@github-actions github-actions bot added the approved and reviewed labels Jun 3, 2024
@github-actions
Contributor

github-actions bot commented Jun 3, 2024

PR approved by anyone and no changes requested.

@doris-robot

TPC-H: Total hot run time: 41537 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit d93dc934f64b2362618141e7716ad4cdfafbdbd0, data reload: false
Columns (tab-separated): query, cold run, hot run 1, hot run 2, best hot run; all times in ms.

------ Round 1 ----------------------------------
q1	17610	4275	4236	4236
q2	2031	193	206	193
q3	10426	1231	1182	1182
q4	10191	818	789	789
q5	7516	2704	2689	2689
q6	225	138	145	138
q7	956	631	613	613
q8	9223	2146	2102	2102
q9	9293	6714	6659	6659
q10	9213	3858	3955	3858
q11	463	251	244	244
q12	494	230	230	230
q13	17548	3268	3157	3157
q14	274	224	223	223
q15	511	495	485	485
q16	519	397	393	393
q17	1002	582	629	582
q18	8377	7874	7852	7852
q19	5485	1307	1209	1209
q20	682	309	332	309
q21	5240	4044	4347	4044
q22	412	350	352	350
Total cold run time: 117691 ms
Total hot run time: 41537 ms

----- Round 2, with runtime_filter_mode=off -----
q1	4559	4384	4451	4384
q2	375	283	274	274
q3	3144	2955	2845	2845
q4	2042	1711	1570	1570
q5	5309	5503	5486	5486
q6	220	130	130	130
q7	2183	1864	1795	1795
q8	3191	3362	3352	3352
q9	8627	8621	8685	8621
q10	4038	3869	3870	3869
q11	607	511	526	511
q12	800	655	613	613
q13	15860	3042	3158	3042
q14	313	273	307	273
q15	523	476	485	476
q16	499	436	435	435
q17	1791	1510	1502	1502
q18	8062	8096	7398	7398
q19	1750	1603	1611	1603
q20	3030	1806	1779	1779
q21	4865	4704	4756	4704
q22	814	541	575	541
Total cold run time: 72602 ms
Total hot run time: 55203 ms

@doris-robot

TPC-DS: Total hot run time: 172459 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit d93dc934f64b2362618141e7716ad4cdfafbdbd0, data reload: false
Columns (tab-separated): query, cold run, hot run 1, hot run 2, best hot run; all times in ms.

query1	932	371	361	361
query2	6452	2383	2317	2317
query3	6641	204	209	204
query4	19419	17418	17165	17165
query5	4117	442	438	438
query6	239	162	152	152
query7	4586	307	282	282
query8	325	298	278	278
query9	8494	2337	2336	2336
query10	450	289	276	276
query11	10616	10008	9939	9939
query12	133	84	90	84
query13	1627	378	369	369
query14	8592	7681	7655	7655
query15	228	215	186	186
query16	7650	271	260	260
query17	1342	511	527	511
query18	1932	267	269	267
query19	196	177	149	149
query20	92	88	86	86
query21	202	130	131	130
query22	4352	4260	3969	3969
query23	33497	32888	32988	32888
query24	11185	2877	2790	2790
query25	632	352	363	352
query26	1185	156	150	150
query27	3083	323	315	315
query28	7656	2038	2047	2038
query29	891	607	585	585
query30	271	151	149	149
query31	954	743	709	709
query32	94	54	52	52
query33	756	282	275	275
query34	972	461	503	461
query35	743	601	592	592
query36	1096	939	954	939
query37	162	67	70	67
query38	2923	2769	2703	2703
query39	845	777	803	777
query40	208	122	123	122
query41	53	51	52	51
query42	120	93	96	93
query43	597	503	546	503
query44	1221	736	751	736
query45	200	163	164	163
query46	1071	738	711	711
query47	1861	1752	1786	1752
query48	378	300	302	300
query49	1072	392	399	392
query50	776	380	391	380
query51	6915	6798	6821	6798
query52	99	96	88	88
query53	348	293	295	293
query54	874	469	436	436
query55	76	70	69	69
query56	266	248	257	248
query57	1152	1043	1051	1043
query58	248	275	233	233
query59	3311	3111	3321	3111
query60	290	267	283	267
query61	91	86	92	86
query62	636	441	433	433
query63	324	291	283	283
query64	8951	2242	1752	1752
query65	3190	3122	3109	3109
query66	809	323	332	323
query67	15596	15095	14961	14961
query68	4441	531	528	528
query69	460	301	294	294
query70	1168	1118	1132	1118
query71	419	283	276	276
query72	7152	5726	5419	5419
query73	750	323	323	323
query74	5924	5604	5463	5463
query75	3441	2596	2656	2596
query76	2736	928	912	912
query77	484	287	280	280
query78	10178	9634	9807	9634
query79	1999	503	505	503
query80	826	454	446	446
query81	593	225	218	218
query82	1063	102	99	99
query83	264	168	160	160
query84	247	83	88	83
query85	1570	274	277	274
query86	487	344	326	326
query87	3321	3090	3073	3073
query88	4525	2438	2437	2437
query89	466	383	376	376
query90	1815	187	189	187
query91	141	192	99	99
query92	58	48	48	48
query93	2241	514	489	489
query94	1224	187	185	185
query95	403	307	310	307
query96	600	264	275	264
query97	3196	3050	2994	2994
query98	247	216	218	216
query99	1214	840	880	840
Total cold run time: 269948 ms
Total hot run time: 172459 ms

@morrySnow
Contributor

run p0

@doris-robot

ClickBench: Total hot run time: 30.25 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit d93dc934f64b2362618141e7716ad4cdfafbdbd0, data reload: false
Columns (tab-separated): query, cold run, hot run 1, hot run 2; all times in seconds.

query1	0.04	0.04	0.03
query2	0.08	0.04	0.04
query3	0.23	0.06	0.06
query4	1.64	0.08	0.07
query5	0.52	0.49	0.51
query6	1.12	0.72	0.71
query7	0.02	0.01	0.02
query8	0.06	0.04	0.04
query9	0.53	0.50	0.48
query10	0.54	0.55	0.53
query11	0.16	0.11	0.10
query12	0.14	0.12	0.12
query13	0.59	0.59	0.60
query14	0.77	0.77	0.79
query15	0.84	0.81	0.81
query16	0.35	0.36	0.38
query17	0.94	1.03	1.00
query18	0.21	0.23	0.25
query19	1.78	1.65	1.66
query20	0.01	0.01	0.01
query21	15.65	0.67	0.66
query22	4.74	6.88	1.70
query23	18.31	1.41	1.25
query24	1.35	0.39	0.24
query25	0.14	0.08	0.08
query26	0.27	0.17	0.17
query27	0.08	0.08	0.08
query28	13.30	1.02	1.00
query29	13.71	3.32	3.26
query30	0.23	0.06	0.05
query31	2.89	0.37	0.38
query32	3.30	0.46	0.47
query33	2.97	2.95	2.84
query34	17.19	4.43	4.46
query35	4.51	4.51	4.59
query36	0.68	0.46	0.46
query37	0.17	0.16	0.16
query38	0.15	0.15	0.15
query39	0.04	0.03	0.03
query40	0.17	0.14	0.15
query41	0.09	0.04	0.05
query42	0.05	0.04	0.04
query43	0.04	0.04	0.03
Total cold run time: 110.6 s
Total hot run time: 30.25 s

@morrySnow morrySnow merged commit f744276 into apache:master Jun 3, 2024
dataroaring pushed a commit that referenced this pull request Jun 4, 2024
…ats to mv scan plan based (#35749)
seawinde added a commit to seawinde/doris that referenced this pull request Jun 5, 2024
…ats to mv scan plan based (apache#35749)
seawinde added a commit to seawinde/doris that referenced this pull request Jun 7, 2024
…ats to mv scan plan based (apache#35749)
morningman pushed a commit to seawinde/doris that referenced this pull request Jun 8, 2024
…ats to mv scan plan based (apache#35749)
seawinde added a commit to seawinde/doris that referenced this pull request Jun 14, 2024
…ats to mv scan plan based (apache#35749)
morningman pushed a commit that referenced this pull request Jun 19, 2024
Labels

approved, dev/2.1.4-merged, dev/3.0.0-merged, reviewed

6 participants