@morningman
Contributor

@morningman morningman commented Jul 12, 2025

…pache#50882)

Issue Number: apache#50238

Problem Summary:

An earlier change (apache#50225) refactored the handling of the fileFormat
attribute, but it only added the new code and left the business code
unchanged. This pull request updates the file-format-related code in the
BrokerLoad feature.
@morningman morningman requested a review from morrySnow as a code owner July 12, 2025 04:32
@hello-stephen
Contributor

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

@morningman
Contributor Author

run buildall

@doris-robot

TPC-H: Total hot run time: 40123 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit e4ab5e09ca8aba2881df11162f7a1f65d63cd904, data reload: false

------ Round 1 ----------------------------------
q1	17635	7005	6687	6687
q2	2070	185	157	157
q3	10685	1159	1173	1159
q4	10522	755	756	755
q5	7739	2891	2831	2831
q6	218	141	138	138
q7	994	636	622	622
q8	9365	2034	2041	2034
q9	6646	6370	6435	6370
q10	7014	2262	2320	2262
q11	451	268	270	268
q12	403	213	210	210
q13	17783	3003	3000	3000
q14	253	209	205	205
q15	504	461	472	461
q16	489	390	378	378
q17	976	603	528	528
q18	7289	6597	6627	6597
q19	1324	989	1013	989
q20	495	212	216	212
q21	3889	3291	3357	3291
q22	1083	1019	969	969
Total cold run time: 107827 ms
Total hot run time: 40123 ms
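
The totals in this round appear to be plain column sums. As a minimal sketch, assuming the four numeric columns are the cold run, two hot runs, and the faster of the two hot runs (the bot output does not label them, so these meanings are inferred from the numbers), the following Python reproduces both totals from the Round 1 rows. The name `summarize` is hypothetical and not part of the Doris tooling.

```python
# Round 1 rows copied from the report above: query name, then four timings in ms.
# Column meanings (cold, hot 1, hot 2, best hot) are an assumption inferred
# from the data; the bot output itself has no header row.
ROUND_1 = """
q1 17635 7005 6687 6687
q2 2070 185 157 157
q3 10685 1159 1173 1159
q4 10522 755 756 755
q5 7739 2891 2831 2831
q6 218 141 138 138
q7 994 636 622 622
q8 9365 2034 2041 2034
q9 6646 6370 6435 6370
q10 7014 2262 2320 2262
q11 451 268 270 268
q12 403 213 210 210
q13 17783 3003 3000 3000
q14 253 209 205 205
q15 504 461 472 461
q16 489 390 378 378
q17 976 603 528 528
q18 7289 6597 6627 6597
q19 1324 989 1013 989
q20 495 212 216 212
q21 3889 3291 3357 3291
q22 1083 1019 969 969
"""


def summarize(report):
    """Return (cold_total, hot_total) in ms for a block of report rows."""
    cold_total = 0
    hot_total = 0
    for line in report.strip().splitlines():
        name, cold, hot1, hot2, best = line.split()
        # Check the assumption that the last column is the faster hot run.
        assert int(best) == min(int(hot1), int(hot2)), name
        cold_total += int(cold)
        hot_total += int(best)
    return cold_total, hot_total


cold_total, hot_total = summarize(ROUND_1)
print(f"Total cold run time: {cold_total} ms")
print(f"Total hot run time: {hot_total} ms")
```

Under that reading, the hot total counts only the better of the two hot runs per query, which would explain why it is much lower than the cold total.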

----- Round 2, with runtime_filter_mode=off -----
q1	6642	6600	6675	6600
q2	327	236	233	233
q3	2962	3025	2965	2965
q4	2053	1872	1934	1872
q5	5715	5733	5745	5733
q6	213	132	134	132
q7	2233	1825	1809	1809
q8	3357	3515	3580	3515
q9	8776	8962	8865	8865
q10	3578	3524	3543	3524
q11	603	514	507	507
q12	774	599	624	599
q13	6218	3152	3206	3152
q14	300	272	280	272
q15	518	461	471	461
q16	502	438	449	438
q17	1865	1617	1621	1617
q18	8168	7796	7756	7756
q19	1695	1710	1523	1523
q20	2137	1841	1840	1840
q21	5202	5129	5007	5007
q22	1125	1029	1009	1009
Total cold run time: 64963 ms
Total hot run time: 59429 ms

@doris-robot

TPC-DS: Total hot run time: 197754 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit e4ab5e09ca8aba2881df11162f7a1f65d63cd904, data reload: false

query1	1282	919	896	896
query2	6214	1952	1924	1924
query3	10850	4493	4311	4311
query4	33274	23479	24055	23479
query5	4373	471	444	444
query6	268	172	177	172
query7	3997	319	327	319
query8	293	254	232	232
query9	9582	2604	2605	2604
query10	485	276	261	261
query11	17920	15179	15072	15072
query12	158	104	105	104
query13	1559	428	440	428
query14	10302	7747	7384	7384
query15	256	186	192	186
query16	8123	465	499	465
query17	1715	617	632	617
query18	2219	329	318	318
query19	370	169	170	169
query20	128	129	113	113
query21	203	110	111	110
query22	4733	4557	4362	4362
query23	34862	34489	34173	34173
query24	12078	2934	2986	2934
query25	700	428	430	428
query26	1706	177	175	175
query27	2954	362	375	362
query28	7884	2188	2201	2188
query29	1056	487	487	487
query30	261	167	164	164
query31	1023	834	902	834
query32	104	52	56	52
query33	777	304	297	297
query34	1040	516	538	516
query35	839	753	741	741
query36	1147	945	961	945
query37	268	65	69	65
query38	4086	3983	4035	3983
query39	1571	1456	1473	1456
query40	261	101	102	101
query41	48	48	47	47
query42	125	100	106	100
query43	519	502	480	480
query44	1311	833	822	822
query45	185	171	177	171
query46	1158	774	740	740
query47	2028	1900	1916	1900
query48	452	345	353	345
query49	1003	389	400	389
query50	881	438	443	438
query51	7468	7279	7364	7279
query52	107	94	95	94
query53	274	183	186	183
query54	1283	477	481	477
query55	80	79	81	79
query56	276	259	242	242
query57	1330	1187	1215	1187
query58	239	236	209	209
query59	3269	3046	3099	3046
query60	294	261	260	260
query61	132	109	108	108
query62	854	723	715	715
query63	219	198	205	198
query64	4998	680	660	660
query65	3392	3258	3296	3258
query66	1344	326	308	308
query67	16211	15683	15533	15533
query68	5026	588	591	588
query69	449	267	291	267
query70	1193	1124	1159	1124
query71	334	267	251	251
query72	6181	4033	4005	4005
query73	764	346	353	346
query74	10688	8994	9252	8994
query75	3413	2653	2699	2653
query76	2814	1127	1054	1054
query77	392	288	281	281
query78	10573	9719	9556	9556
query79	2390	625	619	619
query80	1113	435	419	419
query81	558	224	225	224
query82	893	89	96	89
query83	232	147	144	144
query84	245	85	84	84
query85	1368	301	299	299
query86	431	284	291	284
query87	4430	4203	4199	4199
query88	4444	2400	2386	2386
query89	425	301	299	299
query90	2065	194	193	193
query91	141	109	109	109
query92	72	52	56	52
query93	2307	574	575	574
query94	911	299	285	285
query95	367	266	261	261
query96	658	292	281	281
query97	3298	3236	3194	3194
query98	226	202	200	200
query99	1515	1301	1312	1301
Total cold run time: 310317 ms
Total hot run time: 197754 ms

@doris-robot

ClickBench: Total hot run time: 29.45 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit e4ab5e09ca8aba2881df11162f7a1f65d63cd904, data reload: false

query1	0.04	0.03	0.02
query2	0.06	0.03	0.03
query3	0.24	0.07	0.07
query4	1.63	0.10	0.11
query5	0.50	0.51	0.51
query6	1.14	0.73	0.73
query7	0.02	0.02	0.01
query8	0.03	0.03	0.03
query9	0.58	0.49	0.52
query10	0.55	0.55	0.55
query11	0.14	0.10	0.10
query12	0.14	0.11	0.12
query13	0.62	0.60	0.59
query14	0.78	0.78	0.81
query15	0.85	0.83	0.84
query16	0.40	0.38	0.40
query17	1.05	1.07	1.01
query18	0.22	0.21	0.22
query19	1.98	1.87	1.83
query20	0.02	0.01	0.00
query21	15.39	0.60	0.58
query22	2.61	1.86	1.44
query23	16.99	0.90	0.72
query24	3.35	1.04	0.81
query25	0.43	0.14	0.04
query26	0.30	0.13	0.14
query27	0.05	0.04	0.05
query28	10.54	0.48	0.49
query29	12.56	3.23	3.18
query30	0.24	0.06	0.06
query31	2.86	0.40	0.39
query32	3.23	0.45	0.46
query33	2.97	3.00	3.05
query34	16.94	4.48	4.48
query35	4.58	4.54	4.50
query36	0.64	0.47	0.48
query37	0.09	0.05	0.07
query38	0.04	0.03	0.03
query39	0.03	0.02	0.02
query40	0.17	0.13	0.13
query41	0.08	0.03	0.03
query42	0.03	0.02	0.02
query43	0.04	0.03	0.02
Total cold run time: 105.15 s
Total hot run time: 29.45 s

@morningman
Contributor Author

run buildall

@morningman
Contributor Author

run buildall

@doris-robot

TPC-H: Total hot run time: 39378 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit c4724abb0a738288e041224dc080e33f70cf6fec, data reload: false

------ Round 1 ----------------------------------
q1	17579	6817	6606	6606
q2	2058	195	182	182
q3	10641	1102	1120	1102
q4	10480	728	690	690
q5	7765	2797	2825	2797
q6	216	139	136	136
q7	961	616	609	609
q8	9362	1932	2017	1932
q9	6655	6400	6370	6370
q10	7043	2233	2267	2233
q11	460	251	255	251
q12	387	208	209	208
q13	17787	2949	2990	2949
q14	238	206	207	206
q15	509	474	483	474
q16	493	375	368	368
q17	963	545	561	545
q18	7093	6599	6703	6599
q19	1309	991	1050	991
q20	480	199	202	199
q21	3870	3062	2963	2963
q22	1084	990	968	968
Total cold run time: 107433 ms
Total hot run time: 39378 ms

----- Round 2, with runtime_filter_mode=off -----
q1	6534	6538	6531	6531
q2	319	226	232	226
q3	2892	2802	2859	2802
q4	1997	1766	1795	1766
q5	5711	5691	5655	5655
q6	211	131	129	129
q7	2190	1824	1760	1760
q8	3371	3497	3458	3458
q9	8872	8779	8928	8779
q10	3524	3531	3522	3522
q11	603	502	488	488
q12	805	580	602	580
q13	6739	3212	3304	3212
q14	317	279	300	279
q15	514	459	473	459
q16	490	461	435	435
q17	1841	1643	1612	1612
q18	8568	7887	7798	7798
q19	1655	1460	1425	1425
q20	2033	1774	1863	1774
q21	5152	4984	4987	4984
q22	1096	1026	1029	1026
Total cold run time: 65434 ms
Total hot run time: 58700 ms

@doris-robot

TPC-DS: Total hot run time: 197771 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit c4724abb0a738288e041224dc080e33f70cf6fec, data reload: false

query1	1269	900	898	898
query2	6226	1919	1929	1919
query3	10819	4479	4502	4479
query4	32844	23644	23409	23409
query5	3535	480	451	451
query6	264	174	173	173
query7	3981	308	330	308
query8	289	241	238	238
query9	9477	2609	2594	2594
query10	453	271	266	266
query11	18181	15320	15154	15154
query12	162	111	107	107
query13	1563	430	416	416
query14	9831	7129	7045	7045
query15	262	184	197	184
query16	8035	492	504	492
query17	1603	604	609	604
query18	2113	318	322	318
query19	224	173	163	163
query20	136	116	115	115
query21	205	108	111	108
query22	4693	4563	4414	4414
query23	35378	34260	34928	34260
query24	11322	2943	2950	2943
query25	576	457	428	428
query26	960	184	176	176
query27	2687	367	354	354
query28	7829	2188	2164	2164
query29	689	455	458	455
query30	268	157	167	157
query31	1065	812	839	812
query32	94	63	56	56
query33	777	306	331	306
query34	937	540	549	540
query35	899	757	730	730
query36	1139	958	968	958
query37	118	72	69	69
query38	4048	3973	3976	3973
query39	1504	1502	1461	1461
query40	268	99	101	99
query41	47	47	48	47
query42	117	103	105	103
query43	541	480	488	480
query44	1281	832	844	832
query45	188	166	169	166
query46	1164	725	739	725
query47	2003	1878	1895	1878
query48	440	341	355	341
query49	981	427	398	398
query50	821	428	450	428
query51	7354	7292	7342	7292
query52	101	94	95	94
query53	273	196	189	189
query54	1211	471	488	471
query55	81	80	81	80
query56	276	264	253	253
query57	1294	1204	1188	1188
query58	244	213	220	213
query59	3253	3032	3048	3032
query60	302	278	278	278
query61	114	110	115	110
query62	873	692	681	681
query63	228	210	194	194
query64	5002	677	650	650
query65	3348	3293	3283	3283
query66	1036	304	315	304
query67	16146	15529	15392	15392
query68	4899	594	586	586
query69	435	267	273	267
query70	1183	1155	1095	1095
query71	343	265	268	265
query72	5961	4004	4001	4001
query73	745	360	362	360
query74	10698	9326	9302	9302
query75	3361	2671	2715	2671
query76	2637	1257	1032	1032
query77	377	289	263	263
query78	10489	9553	9614	9553
query79	1739	601	598	598
query80	1140	439	433	433
query81	547	223	221	221
query82	940	94	91	91
query83	245	165	150	150
query84	249	86	85	85
query85	1419	378	375	375
query86	398	308	289	289
query87	4366	4217	4250	4217
query88	3658	2435	2414	2414
query89	417	287	290	287
query90	1898	191	193	191
query91	147	110	117	110
query92	62	55	58	55
query93	1906	564	552	552
query94	815	283	292	283
query95	364	264	264	264
query96	615	284	281	281
query97	3307	3096	3163	3096
query98	224	206	195	195
query99	1512	1291	1307	1291
Total cold run time: 302538 ms
Total hot run time: 197771 ms

@doris-robot

ClickBench: Total hot run time: 30.56 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit c4724abb0a738288e041224dc080e33f70cf6fec, data reload: false

query1	0.03	0.03	0.03
query2	0.06	0.04	0.03
query3	0.23	0.07	0.06
query4	1.62	0.11	0.11
query5	0.53	0.52	0.53
query6	1.13	0.72	0.72
query7	0.02	0.01	0.01
query8	0.04	0.04	0.03
query9	0.55	0.51	0.50
query10	0.54	0.54	0.54
query11	0.14	0.10	0.10
query12	0.13	0.11	0.12
query13	0.61	0.59	0.59
query14	0.78	0.78	0.79
query15	0.84	0.84	0.81
query16	0.37	0.37	0.39
query17	1.08	1.07	1.07
query18	0.24	0.22	0.21
query19	1.98	1.72	1.89
query20	0.02	0.01	0.01
query21	15.49	0.59	0.57
query22	2.46	2.02	2.15
query23	16.96	1.01	0.85
query24	3.00	1.46	1.74
query25	0.22	0.07	0.07
query26	0.59	0.14	0.14
query27	0.04	0.04	0.06
query28	9.43	0.52	0.44
query29	12.57	3.12	3.14
query30	0.24	0.05	0.05
query31	2.86	0.39	0.38
query32	3.25	0.46	0.46
query33	2.97	2.95	3.01
query34	16.80	4.46	4.49
query35	4.54	4.52	4.47
query36	0.67	0.46	0.47
query37	0.09	0.06	0.06
query38	0.04	0.04	0.03
query39	0.03	0.02	0.02
query40	0.17	0.12	0.13
query41	0.08	0.03	0.02
query42	0.03	0.02	0.02
query43	0.04	0.03	0.03
Total cold run time: 103.51 s
Total hot run time: 30.56 s

@morningman
Contributor Author

run buildall

@doris-robot

TPC-H: Total hot run time: 39915 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 9a421d295e85a940b2645ffacfb9337dbc52615f, data reload: false

------ Round 1 ----------------------------------
q1	17591	7007	6610	6610
q2	2088	182	165	165
q3	10705	1155	1134	1134
q4	10486	757	806	757
q5	8278	2900	2947	2900
q6	216	141	139	139
q7	990	637	644	637
q8	10132	1945	2015	1945
q9	8136	6401	6451	6401
q10	7156	2270	2248	2248
q11	450	264	263	263
q12	507	224	219	219
q13	17788	2996	3019	2996
q14	246	213	213	213
q15	513	470	461	461
q16	481	374	374	374
q17	984	527	610	527
q18	7099	6633	6662	6633
q19	1319	990	970	970
q20	466	200	203	200
q21	3946	3131	3200	3131
q22	1108	998	992	992
Total cold run time: 110685 ms
Total hot run time: 39915 ms

----- Round 2, with runtime_filter_mode=off -----
q1	6663	6642	6629	6629
q2	327	234	225	225
q3	3111	2977	2917	2917
q4	2019	1787	1786	1786
q5	5679	5705	5690	5690
q6	215	128	130	128
q7	2167	1819	1731	1731
q8	3362	3496	3491	3491
q9	8785	8835	8837	8835
q10	3566	3510	3482	3482
q11	608	486	478	478
q12	847	600	592	592
q13	3812	3194	3196	3194
q14	292	257	277	257
q15	522	457	468	457
q16	478	444	433	433
q17	1821	1611	1599	1599
q18	8038	7782	7614	7614
q19	1670	1578	1468	1468
q20	2105	1846	1901	1846
q21	5089	4861	4901	4861
q22	1092	1014	1003	1003
Total cold run time: 62268 ms
Total hot run time: 58716 ms

@doris-robot

TPC-DS: Total hot run time: 189778 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 9a421d295e85a940b2645ffacfb9337dbc52615f, data reload: false

query1	971	376	362	362
query2	6515	1992	1879	1879
query3	6713	219	226	219
query4	34205	23297	23281	23281
query5	4358	468	461	461
query6	303	183	188	183
query7	4625	314	309	309
query8	294	232	231	231
query9	9692	2592	2603	2592
query10	480	275	264	264
query11	18342	15450	15143	15143
query12	157	106	103	103
query13	1642	446	418	418
query14	9817	6725	7316	6725
query15	252	174	187	174
query16	8131	444	472	444
query17	1631	579	578	578
query18	2138	309	314	309
query19	308	158	158	158
query20	117	106	108	106
query21	207	104	104	104
query22	4528	4230	4245	4230
query23	33934	33127	33443	33127
query24	11454	2892	2932	2892
query25	672	421	414	414
query26	1677	174	169	169
query27	2916	344	344	344
query28	7865	2113	2122	2113
query29	1001	447	445	445
query30	328	162	159	159
query31	1042	796	812	796
query32	95	61	63	61
query33	807	304	312	304
query34	936	493	529	493
query35	846	715	728	715
query36	1096	967	958	958
query37	151	74	71	71
query38	3977	3852	3873	3852
query39	1577	1421	1410	1410
query40	294	106	103	103
query41	54	52	53	52
query42	120	108	103	103
query43	532	497	479	479
query44	1300	802	806	802
query45	189	176	170	170
query46	1141	742	746	742
query47	1952	1798	1816	1798
query48	435	348	357	348
query49	1269	411	406	406
query50	821	418	414	414
query51	7333	7148	6996	6996
query52	105	96	97	96
query53	274	192	194	192
query54	1300	504	471	471
query55	78	79	83	79
query56	290	274	250	250
query57	1269	1156	1155	1155
query58	247	213	234	213
query59	3213	2856	2923	2856
query60	290	264	263	263
query61	114	109	109	109
query62	862	685	686	685
query63	217	199	200	199
query64	5253	640	633	633
query65	3266	3200	3223	3200
query66	1435	327	319	319
query67	15858	15395	15429	15395
query68	4417	576	600	576
query69	423	274	271	271
query70	1209	1149	1118	1118
query71	342	254	279	254
query72	6384	4021	3982	3982
query73	758	343	362	343
query74	10441	9218	9170	9170
query75	3362	2645	2647	2645
query76	2769	1134	1074	1074
query77	391	300	273	273
query78	10515	9498	9568	9498
query79	2103	613	609	609
query80	1135	452	437	437
query81	543	226	228	226
query82	924	93	88	88
query83	225	148	152	148
query84	231	81	82	81
query85	1334	314	294	294
query86	414	313	301	301
query87	4365	4197	4305	4197
query88	3915	2409	2407	2407
query89	415	296	290	290
query90	2083	190	192	190
query91	144	111	107	107
query92	68	53	54	53
query93	1489	565	554	554
query94	1002	297	285	285
query95	358	269	274	269
query96	605	292	282	282
query97	3247	3130	3161	3130
query98	213	204	200	200
query99	1521	1302	1303	1302
Total cold run time: 302656 ms
Total hot run time: 189778 ms

@doris-robot

ClickBench: Total hot run time: 29.48 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit 9a421d295e85a940b2645ffacfb9337dbc52615f, data reload: false

query1	0.03	0.03	0.04
query2	0.06	0.04	0.03
query3	0.23	0.07	0.07
query4	1.64	0.10	0.10
query5	0.52	0.53	0.49
query6	1.14	0.73	0.72
query7	0.02	0.01	0.01
query8	0.04	0.04	0.03
query9	0.56	0.50	0.51
query10	0.54	0.53	0.55
query11	0.14	0.10	0.10
query12	0.14	0.11	0.11
query13	0.61	0.60	0.60
query14	0.79	0.79	0.80
query15	0.84	0.85	0.82
query16	0.41	0.37	0.37
query17	1.09	1.00	0.99
query18	0.24	0.21	0.23
query19	1.83	1.76	1.73
query20	0.02	0.01	0.01
query21	15.40	0.58	0.58
query22	2.88	1.60	1.76
query23	17.10	0.89	0.76
query24	3.19	1.57	0.80
query25	0.33	0.11	0.05
query26	0.38	0.13	0.15
query27	0.05	0.04	0.03
query28	10.07	0.53	0.48
query29	12.66	3.21	3.16
query30	0.25	0.06	0.05
query31	2.88	0.39	0.39
query32	3.24	0.46	0.46
query33	2.97	2.96	3.01
query34	17.18	4.50	4.46
query35	4.61	4.56	4.57
query36	0.64	0.47	0.50
query37	0.08	0.06	0.06
query38	0.04	0.04	0.03
query39	0.03	0.02	0.02
query40	0.16	0.13	0.12
query41	0.07	0.02	0.02
query42	0.04	0.02	0.02
query43	0.03	0.03	0.03
Total cold run time: 105.17 s
Total hot run time: 29.48 s

@morningman morningman changed the title from "[feat](refactor-param) refactor brokerLoad's code about fileformat (#50882)" to "[feat](refactor-param) refactor brokerLoad's code about fileformat (#50882)(#53159)" on Jul 13, 2025
@morrySnow morrySnow changed the title from "[feat](refactor-param) refactor brokerLoad's code about fileformat (#50882)(#53159)" to "branch-3.1: [feat](refactor-param) refactor brokerLoad's code about fileformat #50882 #53159" on Jul 14, 2025
@morrySnow morrySnow merged commit b0f0573 into apache:branch-3.1 Jul 14, 2025
23 of 24 checks passed