@hubgeter hubgeter commented Jun 22, 2025

What problem does this PR solve?

Related PR: #51329

Problem Summary:
Top-N lazy materialization was introduced in PR #51329, but its implementation performed poorly when reading external tables. This PR optimizes it.

  1. Previously, the materialization phase read one row from the file at a time. This PR groups the requested rows by scan_range and reads multiple rows from the file in a single pass.
  2. Previously, the materialization phase read files on a single thread. This PR creates scan tasks and submits them to the workload group to speed up reading.
  3. Previously, the runtime profile was transmitted through Thrift. This PR adds a protobuf implementation and attaches the RowIDFetcher profile information to MATERIALIZATION_OPERATOR.
    An example follows:
    Cluster: 1 FE, 2 BE
    SQL: select * from ali_hive.tpch100_orc.lineitem order by l_partkey limit 10;
MATERIALIZATION_OPERATOR  (id=3):(ExecTime:  2.645ms)
        -  BlocksProduced:  5
        -  CloseTime:  0ns
        -  ExecTime:  2.645ms
        -  InitTime:  0ns
        -  MemoryUsage:  0.00  
        -  MemoryUsagePeak:  0.00  
        -  OpenTime:  0ns
        -  ProjectionTime:  528.913us
        -  RowsProduced:  10
        -  WaitForDependency[MATERIALIZATION_COUNTER_DEPENDENCY]Time:  12sec874ms
    RowIDFetcher:  BackendId:1750838859134:
            -  FileReadBytes:  {[2.89  MB,  ],  [9.51  MB,  ],  [6.81  MB,  ],  [4.74  MB,  ],  [22.33  MB,  ],  }
            -  FileReadLines:  {[1,  ],  [1,  ],  [1,  ],  [1,  ],  [1,  ],  }
            -  FileReadTime:  {[102.960ms,],  [104.028ms,],  [99.817ms,],  [98.260ms,],  [120.129ms,],  }
            -  GetBlockAvgTime:  {14ms,  2ms,  2ms,  1ms,  3ms,  }
            -  InitReaderAvgTime:  {14ms,  2ms,  2ms,  1ms,  3ms,  }
            -  ScannersRunningTime:  {130ms,  124ms,  116ms,  113ms,  151ms,  }
    RowIDFetcher:  BackendId:1750936290862:
            -  FileReadBytes:  {[13.80  MB,  ],  [21.28  MB,  ],  [8.18  MB,  ],  [16.69  MB,  ],  [19.16  MB,  ],  }
            -  FileReadLines:  {[1,  ],  [1,  ],  [1,  ],  [1,  ],  [1,  ],  }
            -  FileReadTime:  {[113.031ms,],  [132.087ms,],  [105.361ms,],  [117.245ms,],  [125.535ms,],  }
            -  GetBlockAvgTime:  {2ms,  2ms,  2ms,  1ms,  3ms,  }
            -  InitReaderAvgTime:  {2ms,  2ms,  2ms,  1ms,  3ms,  }
            -  ScannersRunningTime:  {144ms,  160ms,  127ms,  142ms,  159ms,  }
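
The batching idea in point 1 can be pictured with a minimal sketch. This is not the code from this PR: `RowLocation`, `group_by_scan_range`, `reads_row_at_a_time`, and `reads_batched` are hypothetical names, and a scan range is reduced to an integer ID. It only shows that grouping requested rows by scan range turns one read per row into one read per scan range; the parallel task submission from point 2 is not modeled.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <vector>

// Hypothetical stand-in for a row ID that points into an external file.
struct RowLocation {
    int scan_range_id;    // which scan range (file split) holds the row (assumed)
    std::int64_t row_id;  // row position inside that scan range
};

// Bucket the requested rows by scan range so each range is visited once.
std::map<int, std::vector<std::int64_t>> group_by_scan_range(
        const std::vector<RowLocation>& rows) {
    std::map<int, std::vector<std::int64_t>> groups;
    for (const RowLocation& row : rows) {
        assert(row.row_id >= 0);
        groups[row.scan_range_id].push_back(row.row_id);
    }
    return groups;
}

// Reader invocations when every row is fetched on its own.
std::size_t reads_row_at_a_time(const std::vector<RowLocation>& rows) {
    return rows.size();
}

// Reader invocations when rows are batched per scan range.
std::size_t reads_batched(const std::vector<RowLocation>& rows) {
    return group_by_scan_range(rows).size();
}
```

With five requested rows spread over two scan ranges, the batched variant issues two reads instead of five.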

Release note

None

Check List (For Author)

  • Test

    • Regression test
    • Unit Test
    • Manual test (add detailed scripts or steps below)
    • No need to test or manual test. Explain why:
      • This is a refactor/code format and no logic has been changed.
      • Previous test can cover this change.
      • No code files have been changed.
      • Other reason
  • Behavior changed:

    • No.
    • Yes.
  • Does this need documentation?

    • No.
    • Yes.

Check List (For Reviewer who merge this PR)

  • Confirm the release note
  • Confirm test cases
  • Confirm document
  • Add branch pick label

@hello-stephen
Contributor

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

@hubgeter
Contributor Author

run buildall

@hubgeter
Contributor Author

run buildall

@doris-robot

TPC-H: Total hot run time: 35153 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit ed22d5d68314f572ddb107abf4be272ab327ca08, data reload: false

------ Round 1 ----------------------------------
q1	17581	5250	5123	5123
q2	1990	331	190	190
q3	10255	1329	745	745
q4	10240	1033	534	534
q5	7539	2398	2384	2384
q6	180	164	134	134
q7	927	742	605	605
q8	9329	1310	1255	1255
q9	6838	5117	5151	5117
q10	6921	2384	1988	1988
q11	500	313	296	296
q12	345	365	226	226
q13	17765	3748	3091	3091
q14	242	236	215	215
q15	550	482	485	482
q16	433	424	373	373
q17	636	900	406	406
q18	7832	7292	7132	7132
q19	1369	955	578	578
q20	367	346	230	230
q21	4216	3389	3056	3056
q22	1064	1022	993	993
Total cold run time: 107119 ms
Total hot run time: 35153 ms

----- Round 2, with runtime_filter_mode=off -----
q1	5258	5147	5249	5147
q2	248	331	232	232
q3	2154	2700	2340	2340
q4	1367	1816	1367	1367
q5	4388	4413	4428	4413
q6	210	174	129	129
q7	1944	1923	1780	1780
q8	2632	2683	2539	2539
q9	7199	7228	7180	7180
q10	3042	3241	2916	2916
q11	589	514	515	514
q12	683	770	619	619
q13	3601	3925	3395	3395
q14	281	294	284	284
q15	519	479	462	462
q16	429	467	434	434
q17	1123	1454	1361	1361
q18	7550	7180	7139	7139
q19	846	830	953	830
q20	1899	1973	1845	1845
q21	4889	4384	4313	4313
q22	1076	1064	994	994
Total cold run time: 51927 ms
Total hot run time: 50233 ms

@doris-robot

TPC-DS: Total hot run time: 186100 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit ed22d5d68314f572ddb107abf4be272ab327ca08, data reload: false

query1	979	385	395	385
query2	6553	1835	1810	1810
query3	6745	224	220	220
query4	25912	23267	22990	22990
query5	4342	656	469	469
query6	322	219	214	214
query7	4626	501	291	291
query8	262	224	218	218
query9	8616	2619	2628	2619
query10	501	337	290	290
query11	15455	15114	14830	14830
query12	163	115	108	108
query13	1665	548	420	420
query14	9000	6203	6222	6203
query15	210	204	183	183
query16	7351	608	441	441
query17	1212	753	588	588
query18	1997	432	322	322
query19	206	195	167	167
query20	126	118	117	117
query21	218	133	112	112
query22	4006	4197	4074	4074
query23	34010	33114	33044	33044
query24	8451	2386	2390	2386
query25	541	488	422	422
query26	1240	267	150	150
query27	2745	498	339	339
query28	4308	2137	2124	2124
query29	745	557	441	441
query30	281	223	215	215
query31	920	849	749	749
query32	71	68	63	63
query33	555	368	315	315
query34	797	868	535	535
query35	781	819	726	726
query36	930	981	890	890
query37	115	100	80	80
query38	4050	4237	4084	4084
query39	1474	1427	1453	1427
query40	206	117	104	104
query41	63	59	58	58
query42	123	114	110	110
query43	492	502	470	470
query44	1305	831	831	831
query45	179	177	168	168
query46	881	1025	637	637
query47	1765	1778	1718	1718
query48	385	422	314	314
query49	754	497	406	406
query50	622	689	411	411
query51	4104	4101	4113	4101
query52	111	107	101	101
query53	226	266	194	194
query54	574	579	511	511
query55	87	85	89	85
query56	318	291	288	288
query57	1201	1201	1127	1127
query58	265	269	254	254
query59	2645	2720	2598	2598
query60	319	319	310	310
query61	130	127	125	125
query62	816	722	662	662
query63	222	189	191	189
query64	4384	1000	686	686
query65	4257	4187	4213	4187
query66	1153	419	325	325
query67	15687	15441	15293	15293
query68	8587	896	550	550
query69	475	300	286	286
query70	1191	1174	1136	1136
query71	475	331	290	290
query72	5327	4730	4734	4730
query73	708	605	354	354
query74	9208	9091	8985	8985
query75	3927	3203	2686	2686
query76	3640	1177	763	763
query77	781	366	289	289
query78	10321	10227	9360	9360
query79	2204	844	583	583
query80	590	526	440	440
query81	484	282	226	226
query82	490	133	107	107
query83	261	250	237	237
query84	255	104	88	88
query85	818	391	311	311
query86	388	306	303	303
query87	4454	4450	4468	4450
query88	3624	2324	2294	2294
query89	391	320	294	294
query90	1897	220	220	220
query91	144	144	115	115
query92	79	64	60	60
query93	1690	985	596	596
query94	628	417	306	306
query95	372	298	282	282
query96	503	575	284	284
query97	2710	2763	2663	2663
query98	240	213	200	200
query99	1332	1387	1288	1288
Total cold run time: 273949 ms
Total hot run time: 186100 ms

@doris-robot

ClickBench: Total hot run time: 29.16 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit ed22d5d68314f572ddb107abf4be272ab327ca08, data reload: false

query1	0.03	0.03	0.02
query2	0.08	0.03	0.04
query3	0.24	0.07	0.06
query4	1.60	0.11	0.10
query5	0.44	0.42	0.42
query6	1.15	0.65	0.67
query7	0.02	0.02	0.01
query8	0.04	0.03	0.04
query9	0.58	0.51	0.52
query10	0.56	0.58	0.57
query11	0.15	0.11	0.11
query12	0.15	0.12	0.12
query13	0.64	0.61	0.61
query14	0.80	0.81	0.80
query15	0.90	0.86	0.87
query16	0.37	0.38	0.40
query17	1.04	1.09	1.05
query18	0.23	0.21	0.21
query19	1.94	1.83	1.85
query20	0.02	0.02	0.01
query21	15.37	0.89	0.54
query22	0.77	1.18	0.64
query23	15.00	1.39	0.63
query24	7.09	0.86	0.56
query25	0.47	0.25	0.06
query26	0.54	0.17	0.14
query27	0.07	0.06	0.05
query28	9.89	0.92	0.45
query29	12.58	4.03	3.36
query30	0.26	0.10	0.06
query31	2.83	0.61	0.41
query32	3.24	0.56	0.47
query33	3.04	3.17	3.09
query34	16.11	5.39	4.72
query35	4.84	4.77	4.84
query36	0.69	0.53	0.49
query37	0.10	0.07	0.07
query38	0.05	0.04	0.04
query39	0.03	0.03	0.02
query40	0.18	0.15	0.14
query41	0.08	0.03	0.03
query42	0.04	0.03	0.02
query43	0.04	0.03	0.02
Total cold run time: 104.29 s
Total hot run time: 29.16 s

@hubgeter
Contributor Author

run buildall

@doris-robot

TPC-H: Total hot run time: 33815 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 0a38597368c5347fa335644d05cd4c8d8f8c86cd, data reload: false

------ Round 1 ----------------------------------
q1	17593	5390	5041	5041
q2	1932	309	201	201
q3	10283	1433	721	721
q4	10216	1011	514	514
q5	7523	2374	2320	2320
q6	175	161	131	131
q7	910	727	613	613
q8	9313	1248	1153	1153
q9	6811	5104	5104	5104
q10	6881	2397	1944	1944
q11	484	295	285	285
q12	338	346	212	212
q13	17761	3663	3058	3058
q14	232	236	216	216
q15	565	492	492	492
q16	425	423	367	367
q17	605	889	369	369
q18	7670	7203	6997	6997
q19	2259	952	545	545
q20	340	332	225	225
q21	3731	3192	2353	2353
q22	1001	1005	954	954
Total cold run time: 107048 ms
Total hot run time: 33815 ms

----- Round 2, with runtime_filter_mode=off -----
q1	5221	5068	5091	5068
q2	246	324	222	222
q3	2225	2667	2320	2320
q4	1360	1789	1354	1354
q5	4215	4428	4421	4421
q6	210	173	126	126
q7	2005	1936	1743	1743
q8	2583	2503	2533	2503
q9	7138	7147	7177	7147
q10	3050	3270	2862	2862
q11	587	514	487	487
q12	706	779	636	636
q13	3483	3788	3243	3243
q14	311	298	290	290
q15	543	469	474	469
q16	449	505	452	452
q17	1154	1544	1322	1322
q18	7342	7214	7149	7149
q19	771	747	812	747
q20	1890	2028	1807	1807
q21	4768	4401	4162	4162
q22	1041	1007	995	995
Total cold run time: 51298 ms
Total hot run time: 49525 ms

@doris-robot

TPC-DS: Total hot run time: 184906 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 0a38597368c5347fa335644d05cd4c8d8f8c86cd, data reload: false

query1	977	392	404	392
query2	6524	1950	1895	1895
query3	6738	214	213	213
query4	26293	23491	23085	23085
query5	4307	613	458	458
query6	313	207	206	206
query7	4623	513	296	296
query8	284	235	221	221
query9	8669	2623	2635	2623
query10	436	315	268	268
query11	15674	15147	14926	14926
query12	160	115	111	111
query13	1660	546	417	417
query14	9315	5962	5895	5895
query15	216	206	178	178
query16	7701	641	469	469
query17	1203	725	591	591
query18	2036	413	312	312
query19	209	192	162	162
query20	124	119	120	119
query21	217	129	111	111
query22	4146	4159	4056	4056
query23	33742	32837	32917	32837
query24	8420	2361	2391	2361
query25	507	456	410	410
query26	822	261	151	151
query27	2709	507	341	341
query28	4279	2136	2122	2122
query29	670	535	425	425
query30	279	220	203	203
query31	951	795	777	777
query32	77	63	63	63
query33	555	403	344	344
query34	774	847	524	524
query35	774	803	736	736
query36	948	956	880	880
query37	109	103	76	76
query38	4115	4117	3982	3982
query39	1475	1417	1402	1402
query40	214	118	115	115
query41	66	59	58	58
query42	130	113	107	107
query43	510	511	479	479
query44	1305	825	810	810
query45	179	175	163	163
query46	833	1009	622	622
query47	1708	1777	1710	1710
query48	382	412	329	329
query49	691	479	413	413
query50	635	702	401	401
query51	4124	4221	4082	4082
query52	114	114	106	106
query53	235	249	191	191
query54	568	569	503	503
query55	86	93	86	86
query56	315	291	300	291
query57	1196	1185	1118	1118
query58	266	262	258	258
query59	2714	2747	2728	2728
query60	341	314	297	297
query61	126	126	127	126
query62	804	719	662	662
query63	222	183	191	183
query64	3458	1051	679	679
query65	4253	4156	4176	4156
query66	882	468	327	327
query67	16037	15609	15149	15149
query68	8688	880	520	520
query69	470	299	269	269
query70	1194	1139	1089	1089
query71	463	330	294	294
query72	5544	4788	4793	4788
query73	721	632	352	352
query74	9268	9114	8682	8682
query75	4055	3206	2693	2693
query76	3593	1185	748	748
query77	786	433	297	297
query78	10177	10165	9323	9323
query79	2115	808	574	574
query80	598	504	445	445
query81	486	265	229	229
query82	458	128	101	101
query83	256	259	230	230
query84	251	98	91	91
query85	804	361	323	323
query86	384	302	281	281
query87	4476	4371	4325	4325
query88	3420	2275	2266	2266
query89	378	332	287	287
query90	1851	210	206	206
query91	141	139	111	111
query92	74	105	58	58
query93	1608	950	578	578
query94	662	412	312	312
query95	368	299	282	282
query96	483	573	276	276
query97	2741	2765	2630	2630
query98	238	210	204	204
query99	1423	1372	1255	1255
Total cold run time: 273599 ms
Total hot run time: 184906 ms

@doris-robot

ClickBench: Total hot run time: 29.38 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit 0a38597368c5347fa335644d05cd4c8d8f8c86cd, data reload: false

query1	0.04	0.04	0.02
query2	0.08	0.03	0.04
query3	0.23	0.07	0.06
query4	1.61	0.10	0.11
query5	0.43	0.43	0.41
query6	1.18	0.67	0.67
query7	0.03	0.02	0.02
query8	0.05	0.03	0.04
query9	0.57	0.52	0.51
query10	0.57	0.57	0.56
query11	0.16	0.11	0.12
query12	0.14	0.11	0.11
query13	0.63	0.61	0.60
query14	0.81	0.81	0.83
query15	0.92	0.87	0.89
query16	0.38	0.38	0.39
query17	1.08	1.09	1.09
query18	0.22	0.22	0.21
query19	1.94	1.88	1.94
query20	0.02	0.01	0.01
query21	15.42	0.94	0.58
query22	0.76	1.18	0.66
query23	14.97	1.41	0.59
query24	7.24	1.28	0.50
query25	0.52	0.17	0.20
query26	0.64	0.17	0.13
query27	0.07	0.06	0.05
query28	9.06	0.93	0.47
query29	12.61	4.04	3.37
query30	0.25	0.11	0.08
query31	2.84	0.61	0.40
query32	3.22	0.54	0.47
query33	3.11	3.05	3.04
query34	16.12	5.39	4.74
query35	4.89	4.82	4.86
query36	0.70	0.51	0.51
query37	0.08	0.06	0.06
query38	0.06	0.04	0.03
query39	0.03	0.02	0.03
query40	0.18	0.14	0.14
query41	0.09	0.02	0.02
query42	0.03	0.02	0.02
query43	0.04	0.03	0.03
Total cold run time: 104.02 s
Total hot run time: 29.38 s

@doris-robot

BE UT Coverage Report

Increment line coverage 6.29% (9/143) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 57.04% (15361/26930)
Line Coverage 46.12% (139417/302295)
Region Coverage 45.47% (70652/155377)
Branch Coverage 40.23% (37320/92770)

@hello-stephen
Contributor

BE Regression && UT Coverage Report

Increment line coverage 11.19% (16/143) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 69.53% (18419/26492)
Line Coverage 61.55% (185937/302107)
Region Coverage 60.14% (109657/182341)
Branch Coverage 53.75% (56740/105554)

@hubgeter
Contributor Author

run buildall

@doris-robot

Cloud UT Coverage Report

Increment line coverage 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 83.52% (1171/1402)
Line Coverage 67.47% (20646/30601)
Region Coverage 67.03% (10252/15295)
Branch Coverage 56.31% (5365/9528)

@doris-robot

TPC-H: Total hot run time: 33885 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 4fec7f0b8ef82de925838b01ce2d8ca3aa810919, data reload: false

------ Round 1 ----------------------------------
q1	17589	5208	5080	5080
q2	1938	289	188	188
q3	10342	1315	714	714
q4	10237	1049	515	515
q5	7605	2440	2351	2351
q6	185	163	128	128
q7	906	746	598	598
q8	9305	1292	1013	1013
q9	7397	5127	5166	5127
q10	6874	2397	1938	1938
q11	504	300	291	291
q12	334	353	217	217
q13	17780	3697	3038	3038
q14	236	225	211	211
q15	541	475	489	475
q16	426	446	380	380
q17	585	859	376	376
q18	7591	7176	7212	7176
q19	1840	964	546	546
q20	321	340	227	227
q21	3777	2548	2318	2318
q22	1054	1032	978	978
Total cold run time: 107367 ms
Total hot run time: 33885 ms

----- Round 2, with runtime_filter_mode=off -----
q1	5205	5110	5118	5110
q2	243	320	221	221
q3	2160	2699	2302	2302
q4	1370	1812	1360	1360
q5	4205	4395	4492	4395
q6	228	169	124	124
q7	2013	1912	1732	1732
q8	2639	2505	2728	2505
q9	7088	7168	7302	7168
q10	3124	3230	2850	2850
q11	582	503	481	481
q12	670	803	613	613
q13	3596	3921	3278	3278
q14	289	297	264	264
q15	520	481	472	472
q16	435	476	428	428
q17	1146	1532	1334	1334
q18	7392	7065	7065	7065
q19	765	748	806	748
q20	1922	1965	1797	1797
q21	4848	4479	4380	4380
q22	1077	1023	1003	1003
Total cold run time: 51517 ms
Total hot run time: 49630 ms

@doris-robot

TPC-DS: Total hot run time: 184507 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 4fec7f0b8ef82de925838b01ce2d8ca3aa810919, data reload: false

query1	991	385	386	385
query2	6549	1699	1730	1699
query3	6738	218	213	213
query4	26475	23628	23236	23236
query5	4340	575	433	433
query6	322	247	195	195
query7	4618	504	291	291
query8	284	224	213	213
query9	8637	2619	2589	2589
query10	450	315	259	259
query11	15909	15206	14732	14732
query12	149	103	101	101
query13	1621	504	396	396
query14	9294	5667	5638	5638
query15	210	197	176	176
query16	7429	613	463	463
query17	1191	704	567	567
query18	1984	420	321	321
query19	212	192	186	186
query20	123	120	118	118
query21	209	121	106	106
query22	4118	4078	4011	4011
query23	34166	33161	33134	33134
query24	8432	2380	2418	2380
query25	562	525	423	423
query26	725	276	152	152
query27	2672	502	348	348
query28	4331	2130	2116	2116
query29	706	607	463	463
query30	283	219	192	192
query31	920	852	787	787
query32	71	68	68	68
query33	568	379	387	379
query34	781	832	513	513
query35	763	828	723	723
query36	932	952	858	858
query37	107	102	77	77
query38	4132	4176	4137	4137
query39	1491	1415	1445	1415
query40	202	117	101	101
query41	55	55	53	53
query42	127	104	105	104
query43	493	499	477	477
query44	1303	818	870	818
query45	174	168	164	164
query46	828	1013	614	614
query47	1738	1784	1710	1710
query48	397	409	313	313
query49	698	479	386	386
query50	653	678	430	430
query51	4163	4138	4096	4096
query52	108	103	96	96
query53	222	247	184	184
query54	577	575	502	502
query55	82	78	82	78
query56	307	295	287	287
query57	1182	1192	1095	1095
query58	257	251	251	251
query59	2614	2710	2539	2539
query60	323	326	301	301
query61	123	120	136	120
query62	803	702	651	651
query63	218	187	185	185
query64	3108	1019	662	662
query65	4259	4180	4178	4178
query66	1000	415	348	348
query67	15670	15726	15291	15291
query68	8130	897	521	521
query69	477	303	270	270
query70	1185	1071	1111	1071
query71	480	330	291	291
query72	5857	4721	4822	4721
query73	710	631	353	353
query74	9249	9228	8678	8678
query75	3843	3189	2649	2649
query76	3603	1147	703	703
query77	778	384	290	290
query78	10050	10385	9279	9279
query79	2056	881	592	592
query80	809	518	428	428
query81	488	257	217	217
query82	435	121	90	90
query83	280	247	244	244
query84	297	102	83	83
query85	784	353	318	318
query86	335	297	277	277
query87	4566	4464	4347	4347
query88	3379	2322	2258	2258
query89	375	319	288	288
query90	1926	203	205	203
query91	141	136	115	115
query92	74	60	56	56
query93	1473	953	584	584
query94	671	423	299	299
query95	376	297	291	291
query96	488	559	282	282
query97	2678	2708	2663	2663
query98	249	203	204	203
query99	1401	1407	1316	1316
Total cold run time: 272822 ms
Total hot run time: 184507 ms

@doris-robot

ClickBench: Total hot run time: 30.09 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit 4fec7f0b8ef82de925838b01ce2d8ca3aa810919, data reload: false

query1	0.04	0.04	0.03
query2	0.11	0.06	0.05
query3	0.28	0.06	0.08
query4	1.60	0.09	0.08
query5	0.43	0.41	0.41
query6	1.17	0.65	0.65
query7	0.02	0.01	0.02
query8	0.06	0.05	0.06
query9	0.63	0.52	0.51
query10	0.58	0.57	0.58
query11	0.25	0.12	0.13
query12	0.26	0.14	0.14
query13	0.65	0.63	0.64
query14	0.81	0.84	0.81
query15	0.97	0.88	0.89
query16	0.38	0.38	0.38
query17	1.09	1.07	1.10
query18	0.25	0.23	0.24
query19	1.97	1.92	1.86
query20	0.02	0.02	0.02
query21	15.39	0.96	0.67
query22	0.92	0.97	0.78
query23	14.73	1.55	0.76
query24	5.40	0.55	0.28
query25	0.17	0.09	0.09
query26	0.56	0.21	0.18
query27	0.09	0.09	0.09
query28	11.01	1.20	0.58
query29	12.54	4.09	3.41
query30	0.28	0.09	0.07
query31	2.84	0.63	0.43
query32	3.24	0.59	0.50
query33	3.19	3.09	3.14
query34	16.66	5.40	4.69
query35	4.75	4.76	4.82
query36	0.64	0.52	0.49
query37	0.20	0.19	0.17
query38	0.17	0.17	0.15
query39	0.05	0.05	0.04
query40	0.20	0.17	0.18
query41	0.10	0.04	0.05
query42	0.06	0.06	0.05
query43	0.05	0.04	0.05
Total cold run time: 104.81 s
Total hot run time: 30.09 s

@doris-robot

BE UT Coverage Report

Increment line coverage 3.45% (20/580) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 57.07% (15360/26915)
Line Coverage 46.11% (139615/302762)
Region Coverage 45.49% (70794/155618)
Branch Coverage 40.21% (37360/92902)

@hubgeter
Contributor Author

hubgeter commented Jul 1, 2025

run buildall

@doris-robot

Cloud UT Coverage Report

Increment line coverage 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 83.39% (1185/1421)
Line Coverage 67.44% (20705/30702)
Region Coverage 67.05% (10283/15337)
Branch Coverage 56.37% (5378/9540)

@doris-robot

TPC-H: Total hot run time: 34244 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 7defef8185e3e7f14ea5de03c7d56ff7826e799f, data reload: false

------ Round 1 ----------------------------------
q1	17586	5187	5116	5116
q2	1929	282	184	184
q3	10372	1309	734	734
q4	10233	1031	533	533
q5	7547	2340	2440	2340
q6	182	169	133	133
q7	922	757	611	611
q8	9331	1307	1156	1156
q9	6861	5184	5128	5128
q10	6904	2393	1982	1982
q11	479	300	273	273
q12	343	354	228	228
q13	17791	3762	3106	3106
q14	227	227	210	210
q15	557	475	475	475
q16	435	424	395	395
q17	608	888	371	371
q18	7973	7245	7157	7157
q19	1639	982	569	569
q20	321	358	212	212
q21	3808	3193	2375	2375
q22	1007	990	956	956
Total cold run time: 107055 ms
Total hot run time: 34244 ms

----- Round 2, with runtime_filter_mode=off -----
q1	5210	5164	5141	5141
q2	247	331	227	227
q3	2165	2667	2316	2316
q4	1355	1783	1382	1382
q5	4324	4588	4565	4565
q6	217	165	127	127
q7	2093	2016	1838	1838
q8	2692	2680	2608	2608
q9	7281	7240	7238	7238
q10	3184	3298	2941	2941
q11	600	520	495	495
q12	723	769	622	622
q13	3662	4054	3382	3382
q14	285	319	286	286
q15	537	495	476	476
q16	456	494	443	443
q17	1191	1546	1439	1439
q18	8016	7267	7163	7163
q19	843	826	914	826
q20	1904	1979	1813	1813
q21	4792	4391	4325	4325
q22	1055	1042	990	990
Total cold run time: 52832 ms
Total hot run time: 50643 ms

@doris-robot

TPC-DS: Total hot run time: 184964 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 7defef8185e3e7f14ea5de03c7d56ff7826e799f, data reload: false

query1	1042	389	397	389
query2	6527	1801	1849	1801
query3	6739	213	214	213
query4	26463	23751	23616	23616
query5	4325	571	437	437
query6	290	202	188	188
query7	4614	491	278	278
query8	264	213	207	207
query9	8577	2617	2613	2613
query10	488	336	284	284
query11	15262	15089	14767	14767
query12	149	109	111	109
query13	1657	519	405	405
query14	8594	5831	5764	5764
query15	206	191	178	178
query16	7157	633	466	466
query17	1178	734	603	603
query18	1986	407	312	312
query19	201	197	161	161
query20	124	117	115	115
query21	214	121	106	106
query22	4096	4137	3900	3900
query23	34111	33009	32939	32939
query24	8422	2390	2421	2390
query25	542	494	395	395
query26	1246	262	148	148
query27	2772	498	343	343
query28	4322	2135	2108	2108
query29	765	554	431	431
query30	282	220	188	188
query31	938	870	738	738
query32	70	60	58	58
query33	555	382	300	300
query34	797	833	521	521
query35	783	798	769	769
query36	929	987	872	872
query37	107	94	73	73
query38	4189	4064	4083	4064
query39	1470	1395	1405	1395
query40	215	120	105	105
query41	55	57	50	50
query42	124	104	103	103
query43	519	509	491	491
query44	1349	837	814	814
query45	177	170	167	167
query46	861	1006	623	623
query47	1737	1793	1699	1699
query48	393	428	309	309
query49	730	474	399	399
query50	638	688	428	428
query51	4140	4166	4188	4166
query52	110	109	96	96
query53	239	255	177	177
query54	573	563	497	497
query55	83	85	81	81
query56	296	285	282	282
query57	1155	1183	1109	1109
query58	262	263	265	263
query59	2628	2749	2642	2642
query60	328	327	300	300
query61	127	123	121	121
query62	796	726	646	646
query63	224	188	190	188
query64	4328	1092	689	689
query65	4264	4154	4212	4154
query66	1140	407	316	316
query67	16026	15420	15344	15344
query68	8278	891	525	525
query69	479	309	263	263
query70	1202	1092	1118	1092
query71	475	316	300	300
query72	5815	4823	4840	4823
query73	800	638	351	351
query74	9003	8988	8666	8666
query75	3889	3152	2720	2720
query76	3754	1146	742	742
query77	787	378	287	287
query78	10094	10101	9275	9275
query79	2749	782	572	572
query80	619	495	439	439
query81	485	252	222	222
query82	477	120	93	93
query83	284	252	232	232
query84	301	107	81	81
query85	805	417	314	314
query86	387	319	285	285
query87	4401	4337	4273	4273
query88	3363	2284	2298	2284
query89	391	313	286	286
query90	1819	205	204	204
query91	138	137	106	106
query92	67	58	56	56
query93	1873	931	571	571
query94	664	396	292	292
query95	371	278	296	278
query96	495	579	278	278
query97	2736	2735	2653	2653
query98	230	206	219	206
query99	1475	1388	1292	1292
Total cold run time: 274676 ms
Total hot run time: 184964 ms

@doris-robot

ClickBench: Total hot run time: 29.91 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit 7defef8185e3e7f14ea5de03c7d56ff7826e799f, data reload: false

query1	0.05	0.04	0.04
query2	0.07	0.04	0.04
query3	0.24	0.08	0.07
query4	1.62	0.11	0.11
query5	0.44	0.43	0.43
query6	1.16	0.65	0.66
query7	0.02	0.01	0.02
query8	0.05	0.03	0.04
query9	0.60	0.51	0.52
query10	0.56	0.56	0.57
query11	0.15	0.10	0.11
query12	0.16	0.12	0.12
query13	0.64	0.61	0.61
query14	0.81	0.82	0.80
query15	0.91	0.87	0.87
query16	0.38	0.39	0.41
query17	1.07	1.10	1.05
query18	0.23	0.22	0.22
query19	2.00	1.81	1.84
query20	0.01	0.02	0.01
query21	15.45	0.91	0.55
query22	0.77	1.37	0.82
query23	14.72	1.39	0.67
query24	6.50	1.70	1.07
query25	0.52	0.22	0.11
query26	0.55	0.17	0.14
query27	0.06	0.06	0.05
query28	9.61	0.84	0.45
query29	12.54	3.95	3.25
query30	0.25	0.09	0.06
query31	2.85	0.61	0.40
query32	3.24	0.56	0.47
query33	3.05	3.10	3.10
query34	16.14	5.40	4.70
query35	4.78	4.84	4.84
query36	0.70	0.50	0.48
query37	0.09	0.07	0.06
query38	0.06	0.04	0.03
query39	0.03	0.02	0.02
query40	0.17	0.15	0.14
query41	0.07	0.02	0.02
query42	0.03	0.02	0.02
query43	0.04	0.03	0.03
Total cold run time: 103.39 s
Total hot run time: 29.91 s

@hello-stephen
Contributor

BE UT Coverage Report

Increment line coverage 3.45% (20/580) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 57.08% (15370/26929)
Line Coverage 46.11% (139600/302780)
Region Coverage 45.47% (70779/155647)
Branch Coverage 40.18% (37338/92918)

morningman
morningman previously approved these changes Jul 4, 2025

@morningman morningman left a comment

LGTM

@github-actions github-actions bot added the "approved" label Jul 4, 2025
@github-actions
Contributor

github-actions bot commented Jul 4, 2025

PR approved by at least one committer and no changes requested.

}

{
stringstream ss;

Don't use stringstream here; use fmt for better performance.

RETURN_IF_ERROR(vfile_scanner_ptr->read_one_line_from_range(
scan_range_desc, request_block_desc.row_id(j), &result_block, external_info,
init_reader_ms, get_block_ms));
dst_col->insert_range_from(src_col, block_idx, 1);

The length is 1, so insert_from can do the work.
To speed up the call further, you can also use insert_from_multi_column.
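
The equivalence the reviewer relies on can be checked with a toy column. The sketch below is an assumption-laden illustration: `ToyColumn` is a hypothetical, simplified stand-in and not Doris's column interface; only the method names `insert_range_from` and `insert_from` mirror the ones mentioned in this thread.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Toy column of ints; a simplified stand-in, not the real column class.
class ToyColumn {
public:
    explicit ToyColumn(std::vector<int> values = {}) : _data(std::move(values)) {}

    // Append `length` values from `src`, starting at position `start`.
    void insert_range_from(const ToyColumn& src, std::size_t start, std::size_t length) {
        assert(start + length <= src._data.size());
        auto first = src._data.begin() + static_cast<std::ptrdiff_t>(start);
        _data.insert(_data.end(), first, first + static_cast<std::ptrdiff_t>(length));
    }

    // Append the single value at position `n`; no range handling needed.
    void insert_from(const ToyColumn& src, std::size_t n) {
        assert(n < src._data.size());
        _data.push_back(src._data[n]);
    }

    const std::vector<int>& data() const { return _data; }

private:
    std::vector<int> _data;
};
```

In this toy, `dst.insert_range_from(src, i, 1)` and `dst.insert_from(src, i)` leave `dst` in the same state, which is why the single-row call can be swapped in when the length is always 1.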

.request = multi_get_request,
.callback = nullptr,
.rpc_timer = MonotonicStopWatch()});
// if (profile != nullptr) {

If this commented-out code is useless, it should be deleted.

return Status::OK();
}

Status MaterializationSharedState::_update_profile_info(int64_t backend_id,

This function only ever returns OK, so it should return void.

}
auto& info_map = backend_profile_info_string[backend_id];

auto func = [&](const std::string& info_key) -> Status {

Choose a meaningful name

child.to_proto(proto_counters, child_counter_map);
}

// Recurse into children

If this is useless, it should be deleted.

}

PProfileCounter to_proto_peak(const std::string& name) const {
PProfileCounter counter;

This is the same code as to_proto; just call to_proto.
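
One way to remove that duplication is to route both methods through shared code. The sketch below assumes the two methods differ only in which value they serialize; `ToyCounterProto`, `ToyPeakCounter`, and `build_proto` are hypothetical names standing in for the generated protobuf type and the real counter class, which are not reproduced here.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Toy message; a stand-in for the generated protobuf counter type.
struct ToyCounterProto {
    std::string name;
    std::int64_t value = 0;
};

// Toy counter that tracks a current value and the highest value seen.
class ToyPeakCounter {
public:
    void set(std::int64_t v) {
        _current = v;
        if (v > _peak) _peak = v;
    }

    // Both serializers delegate to one helper instead of duplicating code.
    ToyCounterProto to_proto(const std::string& name) const {
        return build_proto(name, _current);
    }
    ToyCounterProto to_proto_peak(const std::string& name) const {
        return build_proto(name, _peak);
    }

private:
    // Single place that knows how to fill the message.
    static ToyCounterProto build_proto(const std::string& name, std::int64_t value) {
        ToyCounterProto proto;
        proto.name = name;
        proto.value = value;
        return proto;
    }

    std::int64_t _current = 0;
    std::int64_t _peak = 0;
};
```

Keeping the message-filling logic in one helper means a new field in the message only has to be handled once.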

@hubgeter
Contributor Author

hubgeter commented Jul 7, 2025

run buildall

@github-actions github-actions bot removed the "approved" label Jul 7, 2025
@doris-robot

Cloud UT Coverage Report

Increment line coverage 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 83.02% (1222/1472)
Line Coverage 67.54% (21178/31355)
Region Coverage 67.24% (10538/15673)
Branch Coverage 56.60% (5546/9798)

@hubgeter
Contributor Author

hubgeter commented Jul 8, 2025

run buildall

@doris-robot

Cloud UT Coverage Report

Increment line coverage 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 83.03% (1223/1473)
Line Coverage 67.66% (21211/31351)
Region Coverage 67.41% (10571/15682)
Branch Coverage 56.72% (5562/9806)

@doris-robot

TPC-H: Total hot run time: 32929 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit aa125c386d62750d83a4bd0b34f62c8207900132, data reload: false

------ Round 1 ----------------------------------
q1	17582	5185	5000	5000
q2	1942	282	192	192
q3	10348	1314	717	717
q4	10251	1052	520	520
q5	7771	2388	2375	2375
q6	178	164	129	129
q7	891	742	599	599
q8	9299	1310	1039	1039
q9	7232	5137	5037	5037
q10	6897	2385	1963	1963
q11	492	300	277	277
q12	348	348	221	221
q13	17762	3651	3033	3033
q14	229	232	225	225
q15	559	489	486	486
q16	430	420	374	374
q17	566	883	349	349
q18	7855	7234	7030	7030
q19	1220	967	539	539
q20	326	341	214	214
q21	3768	3176	2317	2317
q22	369	327	293	293
Total cold run time: 106315 ms
Total hot run time: 32929 ms

----- Round 2, with runtime_filter_mode=off -----
q1	5064	5080	5087	5080
q2	234	323	220	220
q3	2181	2694	2293	2293
q4	1384	1747	1348	1348
q5	4262	4360	4684	4360
q6	212	167	123	123
q7	2051	1963	1886	1886
q8	2629	2642	2623	2623
q9	7301	7237	7212	7212
q10	3117	3289	2882	2882
q11	581	519	496	496
q12	736	803	629	629
q13	3652	4015	3474	3474
q14	300	296	282	282
q15	522	477	475	475
q16	440	474	440	440
q17	1183	1617	1457	1457
q18	7972	7597	7549	7549
q19	818	930	1096	930
q20	2025	2066	1995	1995
q21	5055	4666	4707	4666
q22	634	596	558	558
Total cold run time: 52353 ms
Total hot run time: 50978 ms

@doris-robot

TPC-DS: Total hot run time: 186311 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit aa125c386d62750d83a4bd0b34f62c8207900132, data reload: false

query1	1012	401	381	381
query2	6539	1765	1761	1761
query3	6747	211	218	211
query4	26307	23886	23420	23420
query5	4369	604	436	436
query6	298	208	202	202
query7	4640	513	287	287
query8	278	222	209	209
query9	8614	2610	2628	2610
query10	445	335	267	267
query11	15371	15044	14778	14778
query12	147	103	108	103
query13	1643	516	388	388
query14	8398	5902	5846	5846
query15	207	182	168	168
query16	7117	427	249	249
query17	934	751	585	585
query18	1984	406	305	305
query19	191	213	158	158
query20	127	120	114	114
query21	210	122	107	107
query22	4114	4071	4338	4071
query23	35201	33894	33807	33807
query24	8490	2312	2349	2312
query25	526	457	374	374
query26	1228	261	144	144
query27	2782	492	338	338
query28	4342	2116	2087	2087
query29	746	530	420	420
query30	288	220	188	188
query31	901	839	830	830
query32	71	63	64	63
query33	541	335	282	282
query34	792	840	514	514
query35	578	654	562	562
query36	993	981	846	846
query37	103	121	70	70
query38	4142	4192	4012	4012
query39	1444	1423	1450	1423
query40	207	113	100	100
query41	53	53	49	49
query42	120	105	106	105
query43	491	527	479	479
query44	1356	812	837	812
query45	178	166	160	160
query46	834	1027	613	613
query47	1737	1768	1692	1692
query48	377	425	305	305
query49	733	466	383	383
query50	630	699	424	424
query51	4165	4200	4096	4096
query52	103	107	92	92
query53	214	251	182	182
query54	576	557	489	489
query55	81	88	81	81
query56	290	301	290	290
query57	1178	1192	1103	1103
query58	259	247	248	247
query59	2593	2672	2624	2624
query60	338	313	307	307
query61	124	116	119	116
query62	797	737	647	647
query63	222	181	176	176
query64	4405	1153	929	929
query65	4308	4230	4191	4191
query66	1132	396	301	301
query67	15944	15494	15393	15393
query68	8118	907	536	536
query69	491	321	260	260
query70	1186	1099	1112	1099
query71	475	320	308	308
query72	5610	4797	4937	4797
query73	717	682	417	417
query74	9124	9019	8856	8856
query75	3893	3187	2643	2643
query76	3617	1151	775	775
query77	768	387	291	291
query78	10882	11191	10197	10197
query79	1883	827	561	561
query80	593	489	431	431
query81	482	259	222	222
query82	186	123	97	97
query83	251	246	232	232
query84	279	104	80	80
query85	739	356	307	307
query86	376	275	277	275
query87	4396	4361	4243	4243
query88	3517	2242	2288	2242
query89	369	311	288	288
query90	1993	206	199	199
query91	137	139	109	109
query92	82	57	53	53
query93	1812	938	590	590
query94	629	311	195	195
query95	368	283	279	279
query96	500	572	272	272
query97	2664	2789	2667	2667
query98	224	207	202	202
query99	1293	1419	1274	1274
Total cold run time: 273959 ms
Total hot run time: 186311 ms

@doris-robot

ClickBench: Total hot run time: 29.31 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit aa125c386d62750d83a4bd0b34f62c8207900132, data reload: false

query1	0.04	0.04	0.04
query2	0.07	0.04	0.04
query3	0.24	0.08	0.07
query4	1.63	0.11	0.12
query5	0.42	0.44	0.41
query6	1.15	0.65	0.67
query7	0.03	0.02	0.02
query8	0.05	0.04	0.03
query9	0.61	0.50	0.51
query10	0.57	0.56	0.56
query11	0.16	0.10	0.11
query12	0.15	0.11	0.11
query13	0.63	0.62	0.62
query14	0.81	0.81	0.82
query15	0.90	0.88	0.89
query16	0.38	0.39	0.40
query17	1.05	1.07	1.07
query18	0.22	0.22	0.22
query19	1.92	1.91	1.82
query20	0.02	0.01	0.01
query21	15.42	0.93	0.55
query22	0.75	1.16	0.72
query23	14.97	1.36	0.62
query24	6.75	1.80	0.35
query25	0.33	0.21	0.14
query26	0.67	0.17	0.15
query27	0.06	0.06	0.05
query28	9.54	0.95	0.45
query29	12.57	3.97	3.31
query30	0.25	0.08	0.07
query31	2.84	0.59	0.40
query32	3.23	0.55	0.47
query33	3.12	3.10	3.23
query34	16.06	5.39	4.80
query35	4.84	4.87	4.87
query36	0.72	0.51	0.48
query37	0.10	0.07	0.06
query38	0.05	0.05	0.04
query39	0.03	0.03	0.02
query40	0.17	0.14	0.13
query41	0.08	0.02	0.03
query42	0.03	0.02	0.02
query43	0.04	0.03	0.03
Total cold run time: 103.67 s
Total hot run time: 29.31 s

@doris-robot

BE UT Coverage Report

Increment line coverage 31.66% (177/559) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 57.38% (15528/27064)
Line Coverage 46.34% (141253/304825)
Region Coverage 45.64% (71422/156505)
Branch Coverage 40.33% (37638/93330)

@github-actions github-actions bot added the approved Indicates a PR has been approved by one committer. label Jul 12, 2025
@github-actions
Contributor

PR approved by at least one committer and no changes requested.

Contributor

@HappenLee HappenLee left a comment

LGTM

@morningman morningman merged commit 6b552f7 into apache:master Jul 14, 2025
25 of 27 checks passed
morningman pushed a commit that referenced this pull request Sep 12, 2025
…external tables (#55865)

### What problem does this PR solve?

Related PR: #52114

Problem Summary:
PR #52114 overlooked a case in lazy materialization of external tables: a line from a
file may be needed multiple times in a batch.
This PR fixes that issue.
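
The situation described above can be sketched as follows. This is a generic illustration, not the fix from #55865 (its implementation is not shown here): `fetch_rows` and the `ReadRowsFn` callback are hypothetical names, and rows are modeled as strings. The idea is to read each distinct row once and then expand the result back to the requested order, so a row id that appears several times in a batch is still returned for every occurrence.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical reader callback: given sorted, distinct row ids, it returns
// one value per id, in the same order.
using ReadRowsFn = std::function<std::vector<std::string>(const std::vector<int64_t>&)>;

// Returns one value per requested id, preserving order and duplicates,
// while asking the reader for each distinct row only once.
std::vector<std::string> fetch_rows(const std::vector<int64_t>& requested,
                                    const ReadRowsFn& read_rows) {
    // Deduplicate the requested ids.
    std::vector<int64_t> distinct(requested);
    std::sort(distinct.begin(), distinct.end());
    distinct.erase(std::unique(distinct.begin(), distinct.end()), distinct.end());

    std::vector<std::string> values = read_rows(distinct);
    assert(values.size() == distinct.size());

    // Map each row id to its position in the deduplicated read result.
    std::unordered_map<int64_t, std::size_t> position;
    for (std::size_t i = 0; i < distinct.size(); ++i) {
        position[distinct[i]] = i;
    }

    // Expand back to the original request, repeating rows where needed.
    std::vector<std::string> result;
    result.reserve(requested.size());
    for (int64_t id : requested) {
        result.push_back(values[position.at(id)]);
    }
    return result;
}
```

For a request such as `{5, 2, 5, 9, 2}`, the reader is asked for rows 2, 5, and 9 only, and the output still contains five entries in the requested order.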

Labels

approved (indicates a PR has been approved by one committer), dev/4.0.0-merged, reviewed

5 participants