Contributor

@yujun777 yujun777 commented Feb 25, 2025

What problem does this PR solve?

When merging two projections, the merge must be skipped if a slot appears more than once in the parent projection, that slot is an alias in the child projection, and the alias's original expression contains a non-foldable expression. For example:

project(a as b, a as c) -> project(k + random() as a)

Merging these two projections would yield project(k + random() as b, k + random() as c), which evaluates random() twice and causes an error: b and c should hold the same value but would be computed independently.

If the slot occurs only once, the two projections can still be merged. For example:

project(a + 100 as b) -> project(k + random() as a) merges into project(k + random() + 100 as b).
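
The rule above can be illustrated with a minimal Python sketch. This is a toy model, not the Doris implementation: projections are plain dicts from output alias to expression string, the names `NON_FOLDABLE_CALLS`, `can_merge`, and `merge` are hypothetical, and a slot that occurs twice inside a single parent expression is assumed to count as occurring multiple times.

```python
import re
from collections import Counter

# Hypothetical stand-in for "non-foldable": calls whose value changes per evaluation.
NON_FOLDABLE_CALLS = ("random()", "uuid()")

IDENTIFIER = re.compile(r"\b[A-Za-z_]\w*\b")


def can_merge(parent, child):
    """Return False if a child alias with a non-foldable origin is used more than once."""
    refs = Counter()
    for expr in parent.values():
        for name in IDENTIFIER.findall(expr):
            if name in child:
                refs[name] += 1
    return not any(
        count > 1 and any(call in child[name] for call in NON_FOLDABLE_CALLS)
        for name, count in refs.items()
    )


def merge(parent, child):
    """Inline child aliases into the parent, or return None when merging is unsafe."""
    if not can_merge(parent, child):
        return None

    def inline(expr):
        # Parenthesize the inlined expression so operator precedence is preserved.
        return IDENTIFIER.sub(
            lambda m: f"({child[m.group(0)]})" if m.group(0) in child else m.group(0),
            expr,
        )

    return {alias: inline(expr) for alias, expr in parent.items()}


child = {"a": "k + random()"}
print(merge({"b": "a", "c": "a"}, child))   # None: random() would be evaluated twice
print(merge({"b": "a + 100"}, child))       # {'b': '(k + random()) + 100'}
```

The sketch adds parentheses around the inlined expression, so its output differs cosmetically from the project(k + random() + 100 as b) form shown above.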

Check List (For Author)

  • Test

    • Regression test
    • Unit Test
    • Manual test (add detailed scripts or steps below)
    • No need to test or manual test. Explain why:
      • This is a refactor/code format and no logic has been changed.
      • Previous test can cover this change.
      • No code files have been changed.
      • Other reason
  • Behavior changed:

    • No.
    • Yes.
  • Does this need documentation?

    • No.
    • Yes.

Check List (For Reviewer who merge this PR)

  • Confirm the release note
  • Confirm test cases
  • Confirm document
  • Add branch pick label

@hello-stephen
Contributor

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

@yujun777
Contributor Author

run buildall

@doris-robot

TPC-H: Total hot run time: 31766 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 08ac27749db33072b65189af21bf967997c3115b, data reload: false

------ Round 1 ----------------------------------
query	cold (ms)	hot run 1 (ms)	hot run 2 (ms)	best hot (ms)
q1	17594	5258	5098	5098
q2	2046	300	177	177
q3	10410	1218	767	767
q4	10246	1031	545	545
q5	7863	2334	2352	2334
q6	193	166	136	136
q7	910	777	620	620
q8	9305	1298	1121	1121
q9	4905	4768	4624	4624
q10	6808	2285	1871	1871
q11	483	279	255	255
q12	345	367	231	231
q13	17762	3703	3064	3064
q14	225	227	207	207
q15	520	461	455	455
q16	614	618	581	581
q17	582	865	348	348
q18	6537	6309	6280	6280
q19	1347	946	564	564
q20	323	324	199	199
q21	2814	2206	1982	1982
q22	364	335	307	307
Total cold run time: 102196 ms
Total hot run time: 31766 ms

----- Round 2, with runtime_filter_mode=off -----
query	cold (ms)	hot run 1 (ms)	hot run 2 (ms)	best hot (ms)
q1	5194	5179	5227	5179
q2	236	326	240	240
q3	2181	2687	2224	2224
q4	1404	1855	1341	1341
q5	4240	4158	4134	4134
q6	203	165	127	127
q7	1854	1804	1670	1670
q8	2602	2732	2629	2629
q9	7275	7137	7106	7106
q10	2998	3184	2807	2807
q11	580	530	488	488
q12	680	775	644	644
q13	3395	4023	3329	3329
q14	277	295	270	270
q15	493	462	449	449
q16	625	695	661	661
q17	1143	1584	1352	1352
q18	7508	7310	7311	7310
q19	803	830	975	830
q20	1957	2027	1868	1868
q21	5446	5029	4814	4814
q22	644	572	551	551
Total cold run time: 51738 ms
Total hot run time: 50023 ms

@doris-robot

TPC-DS: Total hot run time: 191615 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 08ac27749db33072b65189af21bf967997c3115b, data reload: false

query	cold (ms)	hot run 1 (ms)	hot run 2 (ms)	best hot (ms)
query1	1303	963	947	947
query2	6237	1895	1831	1831
query3	10941	4543	4549	4543
query4	52205	24327	23105	23105
query5	5290	587	512	512
query6	337	201	189	189
query7	4928	506	301	301
query8	330	247	237	237
query9	5799	2534	2524	2524
query10	438	317	261	261
query11	15073	14976	14898	14898
query12	160	109	115	109
query13	1069	525	378	378
query14	10071	7248	6987	6987
query15	204	196	190	190
query16	7043	652	494	494
query17	1092	749	599	599
query18	1506	420	334	334
query19	220	196	167	167
query20	129	134	123	123
query21	206	148	106	106
query22	4374	4583	4504	4504
query23	33779	33307	33366	33307
query24	5863	2436	2413	2413
query25	460	488	412	412
query26	836	283	154	154
query27	2108	496	344	344
query28	2959	2463	2435	2435
query29	613	568	429	429
query30	218	199	161	161
query31	944	917	818	818
query32	74	65	68	65
query33	459	358	305	305
query34	824	883	502	502
query35	807	869	749	749
query36	961	993	897	897
query37	124	104	79	79
query38	4259	4339	4148	4148
query39	1628	1468	1428	1428
query40	203	118	100	100
query41	53	49	53	49
query42	123	116	105	105
query43	496	504	478	478
query44	1381	826	827	826
query45	180	176	160	160
query46	887	1066	655	655
query47	1846	1871	1780	1780
query48	385	417	313	313
query49	717	546	418	418
query50	720	773	432	432
query51	4278	4345	4202	4202
query52	106	111	101	101
query53	238	262	191	191
query54	489	504	428	428
query55	85	80	80	80
query56	274	268	272	268
query57	1183	1192	1148	1148
query58	253	245	268	245
query59	2811	2885	2928	2885
query60	301	286	271	271
query61	127	121	127	121
query62	800	753	666	666
query63	240	193	196	193
query64	2310	1068	677	677
query65	3280	3259	3242	3242
query66	815	398	306	306
query67	16358	15594	15513	15513
query68	7008	876	508	508
query69	530	296	263	263
query70	1221	1114	1108	1108
query71	484	306	261	261
query72	5762	3611	3842	3611
query73	1384	745	359	359
query74	9206	9176	9143	9143
query75	3871	3180	2684	2684
query76	4333	1183	738	738
query77	726	383	278	278
query78	9881	10234	9334	9334
query79	2347	875	607	607
query80	805	530	457	457
query81	498	282	241	241
query82	653	123	95	95
query83	250	166	154	154
query84	286	103	72	72
query85	791	438	299	299
query86	332	290	290	290
query87	4588	4474	4301	4301
query88	3260	2211	2194	2194
query89	416	322	282	282
query90	1964	192	194	192
query91	152	147	108	108
query92	79	58	54	54
query93	1217	1039	587	587
query94	669	401	303	303
query95	354	275	261	261
query96	483	561	271	271
query97	3350	3398	3261	3261
query98	251	204	205	204
query99	1440	1381	1257	1257
Total cold run time: 295401 ms
Total hot run time: 191615 ms

@doris-robot

ClickBench: Total hot run time: 30.59 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit 08ac27749db33072b65189af21bf967997c3115b, data reload: false

query	cold (s)	hot run 1 (s)	hot run 2 (s)
query1	0.04	0.03	0.03
query2	0.07	0.04	0.03
query3	0.23	0.07	0.07
query4	1.61	0.12	0.11
query5	0.56	0.54	0.54
query6	1.19	0.71	0.72
query7	0.02	0.02	0.01
query8	0.04	0.04	0.04
query9	0.59	0.53	0.53
query10	0.57	0.58	0.57
query11	0.15	0.10	0.10
query12	0.14	0.10	0.11
query13	0.61	0.60	0.60
query14	2.87	2.78	2.80
query15	0.93	0.86	0.85
query16	0.37	0.39	0.38
query17	1.03	1.04	0.99
query18	0.21	0.20	0.19
query19	1.89	1.81	2.02
query20	0.01	0.01	0.01
query21	15.36	0.90	0.54
query22	0.76	1.20	0.64
query23	15.37	1.40	0.66
query24	7.00	2.13	0.62
query25	0.46	0.22	0.18
query26	0.61	0.16	0.13
query27	0.06	0.05	0.05
query28	9.40	0.85	0.41
query29	12.56	3.92	3.27
query30	0.25	0.09	0.07
query31	2.84	0.59	0.39
query32	3.22	0.53	0.46
query33	2.99	2.98	3.07
query34	15.84	5.11	4.49
query35	4.51	4.50	4.46
query36	0.66	0.51	0.48
query37	0.09	0.06	0.06
query38	0.06	0.04	0.04
query39	0.03	0.02	0.02
query40	0.18	0.13	0.12
query41	0.07	0.02	0.03
query42	0.03	0.02	0.02
query43	0.04	0.03	0.03
Total cold run time: 105.52 s
Total hot run time: 30.59 s

@yujun777 yujun777 force-pushed the fix-merge-project-with-non-foldable branch from 08ac277 to 9314ef1 on February 26, 2025 02:49
@yujun777
Contributor Author

run buildall

@doris-robot

TPC-H: Total hot run time: 31542 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 01c112b95c32e6f3ff1bc9fd9013b65766ccb9f8, data reload: false

------ Round 1 ----------------------------------
query	cold (ms)	hot run 1 (ms)	hot run 2 (ms)	best hot (ms)
q1	17627	5310	5029	5029
q2	2055	315	173	173
q3	10397	1248	750	750
q4	10283	1017	535	535
q5	8064	2337	2362	2337
q6	192	168	132	132
q7	892	738	602	602
q8	9311	1252	1124	1124
q9	5021	4756	4827	4756
q10	6813	2297	1881	1881
q11	467	273	249	249
q12	344	345	213	213
q13	17756	3749	3098	3098
q14	227	224	208	208
q15	512	479	454	454
q16	644	632	577	577
q17	574	856	339	339
q18	6517	6239	6110	6110
q19	2041	956	548	548
q20	307	312	192	192
q21	2734	2228	1930	1930
q22	370	341	305	305
Total cold run time: 103148 ms
Total hot run time: 31542 ms

----- Round 2, with runtime_filter_mode=off -----
query	cold (ms)	hot run 1 (ms)	hot run 2 (ms)	best hot (ms)
q1	5269	5149	5153	5149
q2	245	344	242	242
q3	2240	2760	2325	2325
q4	1477	1882	1418	1418
q5	4341	4215	4234	4215
q6	204	164	132	132
q7	1834	1830	1794	1794
q8	2622	2602	2591	2591
q9	7212	7124	7141	7124
q10	3000	3205	2791	2791
q11	576	517	482	482
q12	677	780	632	632
q13	3475	3873	3284	3284
q14	288	287	267	267
q15	510	472	451	451
q16	629	700	648	648
q17	1162	1632	1304	1304
q18	7626	7328	7328	7328
q19	800	900	1005	900
q20	1981	2039	1899	1899
q21	5441	5067	4697	4697
q22	627	569	558	558
Total cold run time: 52236 ms
Total hot run time: 50231 ms

@doris-robot

TPC-DS: Total hot run time: 190664 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 01c112b95c32e6f3ff1bc9fd9013b65766ccb9f8, data reload: false

query	cold (ms)	hot run 1 (ms)	hot run 2 (ms)	best hot (ms)
query1	1296	964	939	939
query2	6220	1858	1874	1858
query3	11033	4648	4455	4455
query4	55332	26059	23335	23335
query5	5042	558	484	484
query6	352	198	191	191
query7	4898	515	294	294
query8	318	249	239	239
query9	5552	2609	2604	2604
query10	413	320	266	266
query11	15130	16511	15049	15049
query12	170	111	111	111
query13	1033	543	385	385
query14	10541	6434	6405	6405
query15	208	208	180	180
query16	7115	667	471	471
query17	1093	715	550	550
query18	1514	409	308	308
query19	195	190	161	161
query20	126	118	126	118
query21	219	123	115	115
query22	4543	4396	4128	4128
query23	34293	33390	33379	33379
query24	6103	2460	2447	2447
query25	458	496	412	412
query26	686	277	158	158
query27	2033	490	333	333
query28	3174	2475	2466	2466
query29	570	596	422	422
query30	215	228	166	166
query31	892	883	803	803
query32	70	66	60	60
query33	451	369	306	306
query34	759	866	520	520
query35	820	817	791	791
query36	959	1015	901	901
query37	117	101	73	73
query38	4176	4192	4277	4192
query39	1499	1459	1430	1430
query40	204	117	104	104
query41	53	55	52	52
query42	133	106	108	106
query43	515	520	458	458
query44	1343	802	816	802
query45	186	175	167	167
query46	893	1081	666	666
query47	1829	1844	1759	1759
query48	401	436	326	326
query49	690	530	431	431
query50	747	757	446	446
query51	4275	4341	4286	4286
query52	118	113	100	100
query53	234	267	193	193
query54	505	511	437	437
query55	91	84	89	84
query56	277	298	270	270
query57	1156	1178	1139	1139
query58	248	257	245	245
query59	2770	2827	2636	2636
query60	293	276	270	270
query61	123	124	126	124
query62	744	734	702	702
query63	229	194	186	186
query64	1491	1072	702	702
query65	3327	3211	3124	3124
query66	737	403	296	296
query67	15848	15368	15325	15325
query68	5699	845	511	511
query69	524	301	276	276
query70	1179	1127	1134	1127
query71	444	309	267	267
query72	5805	3607	3736	3607
query73	1390	732	348	348
query74	8914	9126	8803	8803
query75	3372	3145	2758	2758
query76	3769	1184	753	753
query77	534	386	294	294
query78	9921	10224	9370	9370
query79	1391	883	593	593
query80	605	552	440	440
query81	531	276	242	242
query82	193	122	100	100
query83	174	174	159	159
query84	293	92	76	76
query85	732	359	312	312
query86	341	313	287	287
query87	4424	4410	4361	4361
query88	2872	2229	2213	2213
query89	462	320	286	286
query90	1811	195	194	194
query91	131	140	110	110
query92	77	60	56	56
query93	1720	1053	581	581
query94	682	401	281	281
query95	379	265	260	260
query96	483	554	276	276
query97	3272	3356	3309	3309
query98	225	210	204	204
query99	1330	1392	1269	1269
Total cold run time: 292757 ms
Total hot run time: 190664 ms

@doris-robot

ClickBench: Total hot run time: 30.68 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit 01c112b95c32e6f3ff1bc9fd9013b65766ccb9f8, data reload: false

query	cold (s)	hot run 1 (s)	hot run 2 (s)
query1	0.03	0.04	0.02
query2	0.07	0.03	0.04
query3	0.24	0.07	0.06
query4	1.63	0.11	0.11
query5	0.57	0.54	0.55
query6	1.20	0.72	0.72
query7	0.02	0.02	0.01
query8	0.04	0.03	0.03
query9	0.59	0.52	0.54
query10	0.57	0.58	0.57
query11	0.16	0.11	0.11
query12	0.15	0.11	0.12
query13	0.62	0.60	0.59
query14	2.70	2.67	2.81
query15	0.94	0.85	0.86
query16	0.40	0.38	0.38
query17	1.04	1.05	1.03
query18	0.21	0.20	0.19
query19	1.91	1.87	1.94
query20	0.01	0.01	0.02
query21	15.39	0.92	0.56
query22	0.76	1.17	0.69
query23	14.91	1.40	0.65
query24	6.66	2.80	0.59
query25	0.51	0.19	0.15
query26	0.59	0.16	0.13
query27	0.06	0.05	0.04
query28	10.07	0.78	0.43
query29	12.55	3.90	3.26
query30	0.26	0.09	0.07
query31	2.81	0.59	0.38
query32	3.23	0.54	0.47
query33	3.02	3.07	3.01
query34	15.79	5.13	4.54
query35	4.51	4.49	4.58
query36	0.68	0.49	0.48
query37	0.09	0.06	0.06
query38	0.06	0.04	0.04
query39	0.04	0.03	0.02
query40	0.17	0.14	0.13
query41	0.09	0.03	0.02
query42	0.04	0.02	0.02
query43	0.03	0.03	0.03
Total cold run time: 105.42 s
Total hot run time: 30.68 s

@github-actions github-actions bot added the approved label on Feb 26, 2025
@github-actions
Contributor

PR approved by at least one committer and no changes requested.

@github-actions
Contributor

PR approved by anyone and no changes requested.

@starocean999 starocean999 merged commit 2ac6779 into apache:master Feb 26, 2025
27 of 29 checks passed
yiguolei pushed a commit that referenced this pull request Feb 27, 2025
dataroaring pushed a commit that referenced this pull request Feb 27, 2025
zhiqiang-hhhh pushed a commit to zhiqiang-hhhh/doris that referenced this pull request Feb 27, 2025
…ache#48321)

When merging two projections, the merge must be skipped if a slot
appears more than once in the parent projection, that slot is an alias
in the child projection, and the alias's original expression contains a
non-foldable expression. For example:

`project(a as b, a as c) -> project(k + random() as a)`

Merging these two projections would yield `project(k + random() as b, k +
random() as c)`, which evaluates random() twice and causes an error.

If the slot occurs only once, the two projections can still be merged.
For example:

`project(a + 100 as b) -> project(k + random() as a)` merges into
`project(k + random() + 100 as b)`.
seawinde pushed a commit to seawinde/doris that referenced this pull request Feb 28, 2025
mymeiyi pushed a commit to mymeiyi/doris that referenced this pull request Mar 4, 2025
@yiguolei yiguolei mentioned this pull request Mar 25, 2025
koarz pushed a commit to koarz/doris that referenced this pull request Jun 4, 2025

Labels

approved Indicates a PR has been approved by one committer. dev/2.1.9-merged dev/3.0.5-merged p0_w reviewed

7 participants