@kaka11chen
Contributor

kaka11chen commented Apr 7, 2025

What problem does this PR solve?

Problem Summary:
Currently, ORC predicate pushdown and late materialization are coupled: the conditions that can be pushed down are the ones that must be used as late materialization conditions. This coupling is unreasonable; the two decisions should be orthogonal.
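
To make the distinction concrete, here is a toy Python model that treats the two decisions independently. It is not Doris's C++ reader and not the ORC library API. Everything in it is hypothetical and for illustration only: the `Predicate` class, the `scan` function, the in-memory "stripes", and the min/max values computed on the fly as a stand-in for stored statistics. Pushdown uses only the predicates that have a min/max form to skip whole stripes, while late materialization uses every predicate to decide which rows need their remaining columns decoded.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple

Row = Dict[str, int]


@dataclass(frozen=True)
class Predicate:
    """Hypothetical predicate on a single integer column."""
    column: str
    matches: Callable[[int], bool]  # row-level check
    # Check against (min, max) statistics; None means "cannot be pushed down".
    may_match_range: Optional[Callable[[int, int], bool]] = None


def scan(stripes: List[List[Row]], projected: List[str],
         predicates: List[Predicate]) -> Tuple[List[Row], Dict[str, int]]:
    """Toy scan; assumes non-empty stripes and all-integer columns."""
    counters = {"stripes_skipped": 0, "cells_decoded": 0}
    pushdown = [p for p in predicates if p.may_match_range is not None]
    filter_cols = sorted({p.column for p in predicates})
    other_cols = [c for c in projected if c not in filter_cols]
    out: List[Row] = []
    for stripe in stripes:
        # Decision 1 (pushdown): skip the stripe if statistics rule it out.
        # min/max are computed here only to imitate stored statistics.
        skip = False
        for p in pushdown:
            values = [row[p.column] for row in stripe]
            if not p.may_match_range(min(values), max(values)):
                skip = True
                break
        if skip:
            counters["stripes_skipped"] += 1
            continue
        # Decision 2 (late materialization): decode filter columns first,
        # using every predicate, whether or not it was pushed down.
        for row in stripe:
            counters["cells_decoded"] += len(filter_cols)
            if all(p.matches(row[p.column]) for p in predicates):
                # Only surviving rows pay for the remaining columns.
                counters["cells_decoded"] += len(other_cols)
                out.append({c: row[c] for c in projected})
    return out, counters


stripes = [
    [{"a": 1, "b": 2, "c": 100}, {"a": 5, "b": 3, "c": 101}],
    [{"a": 11, "b": 4, "c": 102}, {"a": 12, "b": 5, "c": 103},
     {"a": 20, "b": 6, "c": 104}],
]
predicates = [
    # Has a min/max form, so it can take part in pushdown.
    Predicate("a", lambda v: v > 10, lambda lo, hi: hi > 10),
    # No min/max form, but still usable for late materialization.
    Predicate("b", lambda v: v % 2 == 0),
]
rows, counters = scan(stripes, ["a", "b", "c"], predicates)
print(rows, counters)
```

In this sketch the predicate on `b` cannot be pushed down, yet it still prevents column `c` from being decoded for rows that fail it. That is the kind of independence the summary above argues for, under the assumptions stated here.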

Release note

  • Fix: ORC lazy materialization should not be bundled with predicate pushdown.
  • Fix materialization for Hive ACID tables.

@Thearas
Contributor

Thearas commented Apr 7, 2025

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

@kaka11chen
Contributor Author

run buildall

@doris-robot

TPC-H: Total hot run time: 34495 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 1a08075da313e5e804ad338eb92fec7e9e4923d6, data reload: false

------ Round 1 ----------------------------------
query	cold (ms)	hot 1 (ms)	hot 2 (ms)	best hot (ms)
q1	25902	5022	4993	4993
q2	2056	285	180	180
q3	10383	1282	692	692
q4	10229	1016	552	552
q5	7487	2378	2408	2378
q6	181	166	128	128
q7	908	746	594	594
q8	9308	1367	1194	1194
q9	6890	5246	5142	5142
q10	6821	2310	1913	1913
q11	475	288	257	257
q12	360	356	224	224
q13	17774	3752	3123	3123
q14	229	224	207	207
q15	524	490	489	489
q16	620	621	577	577
q17	617	875	360	360
q18	7728	7474	7186	7186
q19	1488	945	601	601
q20	344	332	223	223
q21	4205	3340	2509	2509
q22	1082	1040	973	973
Total cold run time: 115611 ms
Total hot run time: 34495 ms
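
The two totals can be reproduced from the rows: the first numeric column sums to the cold total, and the last column, which equals the smaller of the two middle columns in every row, sums to the hot total. The short Python check below uses the Round 1 rows as reported; the `totals` helper and the reading of the columns as "cold, hot run 1, hot run 2, best hot" are assumptions inferred from the arithmetic, not something the bot output states.

```python
ROUND1 = """\
q1 25902 5022 4993 4993
q2 2056 285 180 180
q3 10383 1282 692 692
q4 10229 1016 552 552
q5 7487 2378 2408 2378
q6 181 166 128 128
q7 908 746 594 594
q8 9308 1367 1194 1194
q9 6890 5246 5142 5142
q10 6821 2310 1913 1913
q11 475 288 257 257
q12 360 356 224 224
q13 17774 3752 3123 3123
q14 229 224 207 207
q15 524 490 489 489
q16 620 621 577 577
q17 617 875 360 360
q18 7728 7474 7186 7186
q19 1488 945 601 601
q20 344 332 223 223
q21 4205 3340 2509 2509
q22 1082 1040 973 973
"""


def totals(table: str) -> tuple:
    """Return (cold_total, hot_total) in ms for a whitespace-separated table."""
    cold_total = 0
    hot_total = 0
    for line in table.strip().splitlines():
        _name, cold, hot1, hot2, best = line.split()
        # Assumed column meaning: the last column is the faster hot run.
        assert int(best) == min(int(hot1), int(hot2))
        cold_total += int(cold)
        hot_total += int(best)
    return cold_total, hot_total


cold_ms, hot_ms = totals(ROUND1)
print(cold_ms, hot_ms)
```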

----- Round 2, with runtime_filter_mode=off -----
query	cold (ms)	hot 1 (ms)	hot 2 (ms)	best hot (ms)
q1	5159	5095	5077	5077
q2	252	331	236	236
q3	2164	2682	2277	2277
q4	1450	1849	1514	1514
q5	4604	4478	4404	4404
q6	216	166	122	122
q7	1990	1920	1743	1743
q8	2604	2584	2506	2506
q9	7247	7190	7213	7190
q10	2952	3179	2747	2747
q11	571	508	481	481
q12	711	784	634	634
q13	3416	3969	3324	3324
q14	280	283	293	283
q15	516	477	466	466
q16	670	672	642	642
q17	1159	1544	1378	1378
q18	7741	7608	7391	7391
q19	822	816	864	816
q20	1964	2013	1865	1865
q21	5209	4716	4660	4660
q22	1064	1043	962	962
Total cold run time: 52761 ms
Total hot run time: 50718 ms

@doris-robot

TPC-DS: Total hot run time: 186432 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 1a08075da313e5e804ad338eb92fec7e9e4923d6, data reload: false

query	cold (ms)	hot 1 (ms)	hot 2 (ms)	best hot (ms)
query1	1040	504	476	476
query2	6541	1951	1904	1904
query3	6750	222	217	217
query4	25953	23285	23196	23196
query5	4393	639	468	468
query6	315	215	197	197
query7	4621	492	278	278
query8	300	255	244	244
query9	8640	2614	2584	2584
query10	494	325	255	255
query11	15607	15217	14948	14948
query12	153	112	109	109
query13	1656	533	410	410
query14	8771	6105	6194	6105
query15	202	193	173	173
query16	7257	646	463	463
query17	1224	709	583	583
query18	1986	408	305	305
query19	196	191	161	161
query20	121	119	118	118
query21	217	123	104	104
query22	4299	4269	4233	4233
query23	34302	33099	32953	32953
query24	8478	2364	2391	2364
query25	532	444	393	393
query26	1230	265	151	151
query27	2722	513	324	324
query28	4303	2443	2387	2387
query29	763	568	431	431
query30	283	219	187	187
query31	927	867	754	754
query32	73	61	68	61
query33	559	373	307	307
query34	842	860	516	516
query35	795	815	722	722
query36	936	1025	917	917
query37	133	97	76	76
query38	4212	4207	3960	3960
query39	1445	1392	1408	1392
query40	216	120	111	111
query41	57	53	52	52
query42	132	105	108	105
query43	489	509	481	481
query44	1281	785	781	781
query45	180	177	166	166
query46	847	1024	613	613
query47	1754	1804	1732	1732
query48	369	411	306	306
query49	770	492	412	412
query50	656	679	408	408
query51	4166	4219	4123	4123
query52	106	105	97	97
query53	227	256	181	181
query54	576	577	503	503
query55	82	82	81	81
query56	295	320	288	288
query57	1138	1135	1084	1084
query58	261	262	273	262
query59	2611	2689	2652	2652
query60	315	322	305	305
query61	131	125	126	125
query62	765	754	631	631
query63	225	184	182	182
query64	4314	1002	675	675
query65	4287	4247	4183	4183
query66	1167	428	311	311
query67	15832	15553	15493	15493
query68	8115	891	518	518
query69	477	294	262	262
query70	1220	1141	1124	1124
query71	490	316	299	299
query72	5832	4798	4903	4798
query73	730	692	347	347
query74	8903	8840	8626	8626
query75	3943	3190	2665	2665
query76	3723	1191	769	769
query77	777	374	379	374
query78	10128	9978	9297	9297
query79	2899	815	584	584
query80	797	499	443	443
query81	493	257	225	225
query82	484	128	98	98
query83	296	248	235	235
query84	301	111	88	88
query85	779	348	378	348
query86	384	296	275	275
query87	4441	4433	4323	4323
query88	3657	2253	2254	2253
query89	396	317	286	286
query90	1843	220	211	211
query91	136	144	112	112
query92	82	68	59	59
query93	2157	960	585	585
query94	675	410	313	313
query95	376	292	286	286
query96	499	563	281	281
query97	3195	3226	3099	3099
query98	244	217	219	217
query99	1420	1392	1263	1263
Total cold run time: 276412 ms
Total hot run time: 186432 ms

@doris-robot

ClickBench: Total hot run time: 30.84 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit 1a08075da313e5e804ad338eb92fec7e9e4923d6, data reload: false

query	cold (s)	hot 1 (s)	hot 2 (s)
query1	0.04	0.03	0.03
query2	0.12	0.10	0.11
query3	0.25	0.19	0.19
query4	1.59	0.19	0.20
query5	0.62	0.58	0.59
query6	1.21	0.71	0.71
query7	0.03	0.02	0.01
query8	0.04	0.04	0.03
query9	0.57	0.55	0.52
query10	0.58	0.59	0.57
query11	0.16	0.11	0.10
query12	0.15	0.11	0.11
query13	0.61	0.59	0.60
query14	2.69	2.68	2.70
query15	0.95	0.87	0.84
query16	0.39	0.40	0.38
query17	1.05	1.06	0.98
query18	0.21	0.19	0.19
query19	1.87	1.87	1.87
query20	0.01	0.00	0.01
query21	15.36	0.91	0.55
query22	0.76	1.15	0.60
query23	15.05	1.40	0.58
query24	7.01	1.36	0.52
query25	0.47	0.15	0.20
query26	0.56	0.17	0.13
query27	0.06	0.06	0.05
query28	9.74	0.89	0.44
query29	12.60	4.00	3.30
query30	0.25	0.09	0.07
query31	2.82	0.61	0.39
query32	3.22	0.54	0.46
query33	3.03	3.10	3.08
query34	15.68	5.09	4.54
query35	4.56	4.56	4.53
query36	0.66	0.50	0.48
query37	0.08	0.06	0.06
query38	0.05	0.04	0.04
query39	0.03	0.03	0.02
query40	0.17	0.13	0.12
query41	0.07	0.03	0.02
query42	0.03	0.03	0.02
query43	0.04	0.03	0.02
Total cold run time: 105.44 s
Total hot run time: 30.84 s

@doris-robot

BE UT Coverage Report

Increment line coverage 0.00% (0/10) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 52.25% (14019/26831)
Line Coverage 41.04% (121046/294950)
Region Coverage 39.77% (61591/154880)
Branch Coverage 34.44% (30836/89542)

@kaka11chen force-pushed the orc_late_mat_without_sarg branch from 1a08075 to 63dacd1 on April 7, 2025 13:03
@kaka11chen
Contributor Author

run buildall

@doris-robot

TPC-H: Total hot run time: 33979 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 63dacd1e6cc2b0901241e668562fe5bcfeec9f1b, data reload: false

------ Round 1 ----------------------------------
query	cold (ms)	hot 1 (ms)	hot 2 (ms)	best hot (ms)
q1	25868	5057	4961	4961
q2	2064	280	183	183
q3	10384	1248	689	689
q4	10216	1006	538	538
q5	7519	2373	2322	2322
q6	189	159	130	130
q7	924	755	610	610
q8	9305	1269	1078	1078
q9	6845	5156	5055	5055
q10	6850	2310	1917	1917
q11	482	280	261	261
q12	348	351	216	216
q13	17777	3651	3107	3107
q14	217	222	208	208
q15	523	479	486	479
q16	627	610	574	574
q17	579	845	353	353
q18	7430	7123	7134	7123
q19	1705	936	553	553
q20	336	330	217	217
q21	4072	3382	2435	2435
q22	1063	1018	970	970
Total cold run time: 115323 ms
Total hot run time: 33979 ms

----- Round 2, with runtime_filter_mode=off -----
query	cold (ms)	hot 1 (ms)	hot 2 (ms)	best hot (ms)
q1	5141	5085	5056	5056
q2	241	323	239	239
q3	2185	2613	2282	2282
q4	1391	1809	1375	1375
q5	4406	4355	4417	4355
q6	216	165	130	130
q7	1980	1891	1739	1739
q8	2580	2558	2556	2556
q9	7250	7241	6988	6988
q10	2988	3171	2716	2716
q11	581	508	503	503
q12	667	750	646	646
q13	3472	3904	3262	3262
q14	282	301	294	294
q15	520	481	456	456
q16	649	692	655	655
q17	1133	1581	1342	1342
q18	7680	7589	7475	7475
q19	789	811	862	811
q20	1931	1968	1828	1828
q21	5170	4819	4807	4807
q22	1100	1090	1040	1040
Total cold run time: 52352 ms
Total hot run time: 50555 ms

@doris-robot

TPC-DS: Total hot run time: 193223 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 63dacd1e6cc2b0901241e668562fe5bcfeec9f1b, data reload: false

query	cold (ms)	hot 1 (ms)	hot 2 (ms)	best hot (ms)
query1	1408	1059	1033	1033
query2	6045	1891	1911	1891
query3	11016	4620	4559	4559
query4	54692	25117	23229	23229
query5	5063	553	446	446
query6	323	211	192	192
query7	4905	492	269	269
query8	308	239	227	227
query9	5291	2582	2570	2570
query10	449	308	259	259
query11	15019	15002	15134	15002
query12	161	109	105	105
query13	1009	502	387	387
query14	10063	6271	6254	6254
query15	200	192	186	186
query16	7053	652	526	526
query17	1112	756	607	607
query18	1565	421	333	333
query19	207	201	173	173
query20	127	132	130	130
query21	215	128	107	107
query22	4576	4622	4420	4420
query23	34148	33434	33747	33434
query24	6590	2419	2384	2384
query25	457	480	406	406
query26	709	277	150	150
query27	2288	509	341	341
query28	3200	2449	2458	2449
query29	556	561	413	413
query30	282	233	195	195
query31	861	864	785	785
query32	77	64	66	64
query33	496	362	315	315
query34	772	880	513	513
query35	814	825	781	781
query36	962	996	910	910
query37	117	100	76	76
query38	4133	4166	4112	4112
query39	1518	1450	1479	1450
query40	205	125	108	108
query41	52	52	50	50
query42	125	105	106	105
query43	504	517	482	482
query44	1295	798	791	791
query45	180	186	169	169
query46	863	1034	678	678
query47	1887	1918	1847	1847
query48	393	432	306	306
query49	680	500	430	430
query50	637	706	396	396
query51	4253	4298	4181	4181
query52	115	111	103	103
query53	227	259	188	188
query54	591	599	523	523
query55	92	87	88	87
query56	311	291	297	291
query57	1200	1216	1135	1135
query58	282	264	255	255
query59	2631	2844	2729	2729
query60	324	329	314	314
query61	166	131	126	126
query62	732	729	696	696
query63	230	191	193	191
query64	1825	1114	701	701
query65	4445	4403	4347	4347
query66	751	398	304	304
query67	16197	15684	15521	15521
query68	6841	900	513	513
query69	539	316	268	268
query70	1166	1114	1106	1106
query71	503	315	300	300
query72	6044	4740	4864	4740
query73	1525	658	344	344
query74	9009	8840	8870	8840
query75	3965	3197	2742	2742
query76	4223	1203	732	732
query77	668	382	289	289
query78	10083	10118	9272	9272
query79	2417	878	574	574
query80	632	500	428	428
query81	489	256	225	225
query82	457	132	97	97
query83	336	255	228	228
query84	290	106	83	83
query85	779	344	378	344
query86	379	299	276	276
query87	4316	4650	4249	4249
query88	3469	2198	2170	2170
query89	408	316	272	272
query90	1896	203	208	203
query91	144	143	113	113
query92	76	59	58	58
query93	1903	934	572	572
query94	658	416	299	299
query95	374	284	304	284
query96	485	575	274	274
query97	3190	3291	3084	3084
query98	231	206	202	202
query99	1444	1410	1305	1305
Total cold run time: 298919 ms
Total hot run time: 193223 ms

@doris-robot

ClickBench: Total hot run time: 31.54 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit 63dacd1e6cc2b0901241e668562fe5bcfeec9f1b, data reload: false

query	cold (s)	hot 1 (s)	hot 2 (s)
query1	0.04	0.03	0.03
query2	0.12	0.11	0.11
query3	0.25	0.19	0.20
query4	1.60	0.19	0.20
query5	0.59	0.56	0.59
query6	1.17	0.73	0.72
query7	0.02	0.02	0.02
query8	0.04	0.04	0.03
query9	0.57	0.52	0.52
query10	0.57	0.58	0.56
query11	0.16	0.11	0.11
query12	0.14	0.11	0.11
query13	0.61	0.59	0.60
query14	2.74	2.67	2.82
query15	0.91	0.85	0.85
query16	0.39	0.38	0.38
query17	1.04	1.02	1.06
query18	0.21	0.20	0.19
query19	1.85	1.92	1.83
query20	0.02	0.01	0.01
query21	15.35	0.89	0.53
query22	0.75	1.14	1.05
query23	14.68	1.35	0.60
query24	7.43	0.94	1.25
query25	0.48	0.21	0.08
query26	0.61	0.17	0.13
query27	0.05	0.05	0.05
query28	9.04	0.87	0.42
query29	12.74	3.97	3.27
query30	0.25	0.09	0.06
query31	2.83	0.58	0.39
query32	3.23	0.55	0.47
query33	3.04	3.06	3.06
query34	15.80	5.06	4.47
query35	4.50	4.51	4.55
query36	0.67	0.49	0.48
query37	0.09	0.07	0.06
query38	0.05	0.04	0.04
query39	0.03	0.03	0.02
query40	0.17	0.14	0.14
query41	0.07	0.03	0.03
query42	0.04	0.02	0.03
query43	0.04	0.03	0.03
Total cold run time: 104.98 s
Total hot run time: 31.54 s
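
For readers comparing the two benchmark rounds in this thread (commit 1a08075 versus the force-pushed 63dacd1), the relative change in the reported hot totals is simple arithmetic. The sketch below uses only the totals quoted in the bot comments; the `relative_change` helper is hypothetical, and single benchmark runs are noisy, so the percentages are descriptive rather than evidence of a regression or an improvement.

```python
def relative_change(before: float, after: float) -> float:
    """Percentage change from `before` to `after`."""
    return (after - before) / before * 100.0


# (first run on 1a08075, second run on 63dacd1), as reported in this thread.
HOT_TOTALS = {
    "TPC-H round 1 (ms)": (34495, 33979),
    "TPC-DS (ms)": (186432, 193223),
    "ClickBench (s)": (30.84, 31.54),
}

for name, (first, second) in HOT_TOTALS.items():
    change = relative_change(first, second)
    print(f"{name}: {first} -> {second} ({change:+.2f}%)")
```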

@hello-stephen
Contributor

BE UT Coverage Report

Increment line coverage 0.00% (0/16) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 52.26% (14022/26831)
Line Coverage 41.04% (121037/294957)
Region Coverage 39.77% (61604/154882)
Branch Coverage 34.44% (30841/89544)

@kaka11chen marked this pull request as ready for review on April 15, 2025 07:09
@github-actions bot added the "approved" label (Indicates a PR has been approved by one committer.) on Apr 15, 2025
@github-actions
Contributor

PR approved by at least one committer and no changes requested.

@github-actions
Contributor

PR approved by anyone and no changes requested.

@morningman merged commit d55ece3 into apache:master on Apr 18, 2025
26 of 29 checks passed
kaka11chen added a commit to kaka11chen/doris that referenced this pull request May 27, 2025
…with pushdown. (apache#49835)

koarz pushed a commit to koarz/doris that referenced this pull request Jun 4, 2025
…with pushdown. (apache#49835)

suxiaogang223 pushed a commit to suxiaogang223/doris that referenced this pull request Jun 26, 2025
…with pushdown. (apache#49835)

suxiaogang223 pushed a commit to suxiaogang223/doris that referenced this pull request Jun 26, 2025
…with pushdown. (apache#49835)

morrySnow pushed a commit that referenced this pull request Jun 27, 2025

Labels

approved (Indicates a PR has been approved by one committer.), dev/3.1.0-merged, reviewed

7 participants