@ByteYue
Contributor

@ByteYue ByteYue commented Dec 25, 2023

Proposed changes

Issue Number: close #xxx

The previous implementation of the S3 buffer pool is not elastic. This matters most in cloud mode, where every load operation accesses S3: the static memory allocation strategy can block load operations, or even cause a deadlock in situations such as vertical compaction.
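
To illustrate the difference in general terms, here is a minimal sketch. It is not the Doris implementation; the class names `StaticBufferPool` and `DynamicBufferAllocator` are hypothetical and exist only for this example. The static pool preallocates a fixed number of buffers and runs dry once they are all in use (a blocking pool would make the caller wait at that point), while the dynamic allocator creates a buffer on each request. A real allocator would also need thread safety and some form of memory limit, which this sketch omits.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <utility>
#include <vector>

// Hypothetical fixed-capacity pool: every buffer is preallocated up front.
class StaticBufferPool {
public:
    StaticBufferPool(std::size_t count, std::size_t buf_size) {
        for (std::size_t i = 0; i < count; ++i) {
            free_.emplace_back(buf_size);
        }
    }

    // Returns std::nullopt when the pool is exhausted.
    // A blocking pool would wait here instead, which is where stalls can arise.
    std::optional<std::vector<char>> try_acquire() {
        if (free_.empty()) {
            return std::nullopt;
        }
        std::vector<char> buf = std::move(free_.back());
        free_.pop_back();
        return buf;
    }

    void release(std::vector<char> buf) { free_.push_back(std::move(buf)); }

private:
    std::vector<std::vector<char>> free_;
};

// Hypothetical dynamic allocator: a buffer is allocated on each request,
// so the number of buffers in use follows demand.
class DynamicBufferAllocator {
public:
    explicit DynamicBufferAllocator(std::size_t buf_size) : buf_size_(buf_size) {}

    std::vector<char> acquire() {
        ++outstanding_;
        return std::vector<char>(buf_size_);
    }

    // The buffer's memory is freed when the by-value parameter goes out of scope.
    void release(std::vector<char> /*buf*/) {
        assert(outstanding_ > 0);
        --outstanding_;
    }

    std::size_t outstanding() const { return outstanding_; }

private:
    std::size_t buf_size_;
    std::size_t outstanding_ = 0;
};
```

With a pool of two buffers, a third `try_acquire()` call returns an empty optional until one buffer is released, whereas `DynamicBufferAllocator::acquire()` always returns a fresh buffer.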

TODO:

  1. Add a unit test that uses a mock S3 client.

Further comments

If this is a relatively large or complex change, kick off the discussion at dev@doris.apache.org by explaining why you chose the solution you did and what alternatives you considered, etc...

@ByteYue
Contributor Author

ByteYue commented Dec 25, 2023

run buildall

Contributor

@github-actions github-actions bot left a comment

clang-tidy made some suggestions

@ByteYue
Contributor Author

ByteYue commented Dec 25, 2023

run buildall

@doris-robot

TPC-H test result on machine: 'aliyun_ecs.c7a.8xlarge_32C64G'

Tpch sf100 test result on commit e3658a1ead55dc53921aafcddfcdf54b8e91111a, data reload: false

run tpch-sf100 query with default conf and session variables
q1	4718	4413	4419	4413
q2	369	148	156	148
q3	1463	1272	1210	1210
q4	1099	868	913	868
q5	3150	3196	3169	3169
q6	254	129	128	128
q7	1001	488	496	488
q8	2205	2223	2195	2195
q9	6730	6640	6675	6640
q10	3215	3288	3275	3275
q11	306	197	201	197
q12	353	205	205	205
q13	4546	3756	3829	3756
q14	243	216	220	216
q15	566	524	526	524
q16	446	392	380	380
q17	1011	648	551	551
q18	7102	6820	6972	6820
q19	1520	1409	1448	1409
q20	507	317	314	314
q21	3097	2695	2695	2695
q22	352	284	286	284
Total cold run time: 44253 ms
Total hot run time: 39885 ms

run tpch-sf100 query with default conf and set session variable runtime_filter_mode=off
q1	4344	4348	4334	4334
q2	269	166	180	166
q3	3515	3508	3490	3490
q4	2409	2364	2379	2364
q5	5695	5706	5706	5706
q6	243	125	122	122
q7	2378	1886	1861	1861
q8	3534	3538	3526	3526
q9	9014	8998	8973	8973
q10	3911	3987	3995	3987
q11	488	375	371	371
q12	766	608	612	608
q13	4289	3569	3546	3546
q14	290	261	251	251
q15	575	526	517	517
q16	493	442	475	442
q17	1883	1886	1853	1853
q18	8549	8147	8191	8147
q19	1739	1723	1770	1723
q20	2253	1942	1940	1940
q21	6489	6156	6153	6153
q22	519	427	436	427
Total cold run time: 63645 ms
Total hot run time: 60507 ms

@doris-robot

(From new machine)TeamCity pipeline, clickbench performance test result:
the sum of best hot time: 44.01 seconds
stream load tsv: 562 seconds loaded 74807831229 Bytes, about 126 MB/s
stream load json: 19 seconds loaded 2358488459 Bytes, about 118 MB/s
stream load orc: 66 seconds loaded 1101869774 Bytes, about 15 MB/s
stream load parquet: 32 seconds loaded 861443392 Bytes, about 25 MB/s
insert into select: 33.5 seconds inserted 10000000 Rows, about 298K ops/s
storage size: 17183640323 Bytes

@ByteYue
Contributor Author

ByteYue commented Dec 27, 2023

run buildall

@doris-robot

TPC-H test result on machine: 'aliyun_ecs.c7a.8xlarge_32C64G'

Tpch sf100 test result on commit 4ca0f38b9527e4bafe9cf8a507fd067efd60b1e7, data reload: false

run tpch-sf100 query with default conf and session variables
q1	4756	4460	4444	4444
q2	382	173	158	158
q3	1483	1352	1296	1296
q4	1150	971	896	896
q5	3159	3170	3169	3169
q6	263	135	135	135
q7	1037	499	485	485
q8	2258	2265	2243	2243
q9	6737	6687	6682	6682
q10	3196	3292	3279	3279
q11	336	205	206	205
q12	354	210	208	208
q13	4530	3786	3805	3786
q14	242	214	216	214
q15	561	520	518	518
q16	446	384	384	384
q17	1043	801	722	722
q18	7138	6822	6832	6822
q19	1623	1636	1793	1636
q20	575	324	301	301
q21	3244	2737	2778	2737
q22	372	300	309	300
Total cold run time: 44885 ms
Total hot run time: 40620 ms

run tpch-sf100 query with default conf and set session variable runtime_filter_mode=off
q1	4373	4412	4408	4408
q2	273	170	170	170
q3	3513	3503	3489	3489
q4	2431	2411	2413	2411
q5	5710	5728	5712	5712
q6	249	127	127	127
q7	2371	1859	1867	1859
q8	3613	3610	3597	3597
q9	9032	8996	8958	8958
q10	3911	4009	3997	3997
q11	485	383	360	360
q12	776	595	592	592
q13	4291	3548	3592	3548
q14	291	253	257	253
q15	578	518	521	518
q16	525	459	460	459
q17	1997	1971	1959	1959
q18	8766	8065	8100	8065
q19	1862	1836	1838	1836
q20	2243	1971	1938	1938
q21	6644	6238	6204	6204
q22	539	470	461	461
Total cold run time: 64473 ms
Total hot run time: 60921 ms

@doris-robot

(From new machine)TeamCity pipeline, clickbench performance test result:
the sum of best hot time: 44.36 seconds
stream load tsv: 563 seconds loaded 74807831229 Bytes, about 126 MB/s
stream load json: 18 seconds loaded 2358488459 Bytes, about 124 MB/s
stream load orc: 66 seconds loaded 1101869774 Bytes, about 15 MB/s
stream load parquet: 32 seconds loaded 861443392 Bytes, about 25 MB/s
insert into select: 28.6 seconds inserted 10000000 Rows, about 349K ops/s
storage size: 17183671416 Bytes

@ByteYue
Contributor Author

ByteYue commented Dec 27, 2023

run buildall

@doris-robot

TeamCity be ut coverage result:
Function Coverage: 36.55% (8582/23478)
Line Coverage: 28.65% (69740/243450)
Region Coverage: 27.62% (36043/130507)
Branch Coverage: 24.35% (18412/75626)
Coverage Report: http://coverage.selectdb-in.cc/coverage/936e80284b827e8019351bd00bebab57cda0b8cb_936e80284b827e8019351bd00bebab57cda0b8cb/report/index.html

@doris-robot

(From new machine)TeamCity pipeline, clickbench performance test result:
the sum of best hot time: 47.02 seconds
stream load tsv: 567 seconds loaded 74807831229 Bytes, about 125 MB/s
stream load json: 19 seconds loaded 2358488459 Bytes, about 118 MB/s
stream load orc: 66 seconds loaded 1101869774 Bytes, about 15 MB/s
stream load parquet: 32 seconds loaded 861443392 Bytes, about 25 MB/s
insert into select: 28.9 seconds inserted 10000000 Rows, about 346K ops/s
storage size: 17183383093 Bytes

@doris-robot

TPC-H test result on machine: 'aliyun_ecs.c7a.8xlarge_32C64G'

Tpch sf100 test result on commit 936e80284b827e8019351bd00bebab57cda0b8cb, data reload: false

run tpch-sf100 query with default conf and session variables
q1	5021	4643	4650	4643
q2	381	172	158	158
q3	1464	1279	1183	1183
q4	1132	975	895	895
q5	3150	3164	3161	3161
q6	249	127	125	125
q7	1023	496	499	496
q8	2272	2257	2228	2228
q9	6696	6675	6702	6675
q10	3210	3267	3273	3267
q11	335	216	207	207
q12	358	202	209	202
q13	4157	3406	3434	3406
q14	241	214	220	214
q15	569	518	514	514
q16	438	387	381	381
q17	1051	794	615	615
q18	7060	6860	6820	6820
q19	1604	1641	1655	1641
q20	538	312	300	300
q21	3210	2710	2715	2710
q22	368	296	307	296
Total cold run time: 44527 ms
Total hot run time: 40137 ms

run tpch-sf100 query with default conf and set session variable runtime_filter_mode=off
q1	4634	4590	4544	4544
q2	267	163	167	163
q3	3375	3374	3348	3348
q4	2222	2186	2190	2186
q5	5720	5718	5703	5703
q6	242	117	117	117
q7	2370	1863	1856	1856
q8	3606	3634	3613	3613
q9	8986	8961	8966	8961
q10	3812	3885	3886	3885
q11	476	374	363	363
q12	766	607	591	591
q13	3916	3210	3200	3200
q14	288	253	264	253
q15	575	524	515	515
q16	497	446	477	446
q17	1971	1954	1954	1954
q18	8775	8236	8292	8236
q19	1756	1736	1735	1735
q20	2238	1932	1937	1932
q21	6107	5775	5766	5766
q22	536	453	447	447
Total cold run time: 63135 ms
Total hot run time: 59814 ms

Contributor

@gavinchou gavinchou left a comment

LGTM

@github-actions
Contributor

github-actions bot commented Jan 2, 2024

PR approved by anyone and no changes requested.

dataroaring
dataroaring previously approved these changes Jan 6, 2024
Contributor

@dataroaring dataroaring left a comment

LGTM

@ByteYue
Contributor Author

ByteYue commented Jan 6, 2024

run buildall

@doris-robot

TPC-H: Total hot run time: 38772 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 78e364e3557d7d32c8359a0802f62291c19de15c, data reload: false

------ Round 1 ----------------------------------
q1	17714	4963	4984	4963
q2	2017	152	143	143
q3	10587	1122	1163	1122
q4	10198	797	856	797
q5	7781	2987	2927	2927
q6	210	120	123	120
q7	929	511	501	501
q8	9329	2009	2026	2009
q9	6480	6393	6365	6365
q10	8270	3099	3080	3080
q11	427	219	215	215
q12	397	232	236	232
q13	18009	3389	3370	3370
q14	243	206	211	206
q15	547	514	506	506
q16	457	391	403	391
q17	954	732	599	599
q18	7383	6730	6689	6689
q19	1564	1515	1520	1515
q20	728	338	331	331
q21	2785	2460	2373	2373
q22	357	318	322	318
Total cold run time: 107366 ms
Total hot run time: 38772 ms

----- Round 2, with runtime_filter_mode=off -----
q1	5039	4975	4947	4947
q2	338	232	232	232
q3	3302	3294	3294	3294
q4	2133	2029	1999	1999
q5	5798	5803	5786	5786
q6	207	118	115	115
q7	2317	1919	1891	1891
q8	3371	3450	3462	3450
q9	8831	8779	8751	8751
q10	3769	3848	3805	3805
q11	542	469	448	448
q12	799	642	632	632
q13	7283	3179	3169	3169
q14	315	283	248	248
q15	537	498	497	497
q16	553	491	469	469
q17	1888	1859	1864	1859
q18	8623	8460	8349	8349
q19	1605	1682	1625	1625
q20	2176	1965	1972	1965
q21	5587	5359	5181	5181
q22	505	458	447	447
Total cold run time: 65518 ms
Total hot run time: 59159 ms

@doris-robot

TeamCity be ut coverage result:
Function Coverage: 36.68% (8636/23543)
Line Coverage: 28.69% (70177/244581)
Region Coverage: 27.65% (36297/131260)
Branch Coverage: 24.33% (18539/76188)
Coverage Report: http://coverage.selectdb-in.cc/coverage/78e364e3557d7d32c8359a0802f62291c19de15c_78e364e3557d7d32c8359a0802f62291c19de15c/report/index.html

@doris-robot

TPC-DS: Total hot run time: 184171 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 78e364e3557d7d32c8359a0802f62291c19de15c, data reload: false

run tpcds-sf100 query with default conf and session variables
query1	942	344	332	332
query2	6586	1946	1881	1881
query3	6720	223	224	223
query4	25879	22545	22602	22545
query5	5907	630	591	591
query6	335	210	203	203
query7	4631	306	309	306
query8	252	220	220	220
query9	8659	2915	2924	2915
query10	570	330	338	330
query11	16258	15771	15583	15583
query12	176	98	96	96
query13	1659	343	330	330
query14	12388	7651	7706	7651
query15	295	245	216	216
query16	6333	341	332	332
query17	1715	516	522	516
query18	1963	305	295	295
query19	231	159	161	159
query20	115	107	103	103
query21	185	100	96	96
query22	4748	4788	4874	4788
query23	32354	31111	31124	31111
query24	12245	2970	2924	2924
query25	604	360	379	360
query26	1791	170	165	165
query27	2774	312	323	312
query28	6577	2139	2130	2130
query29	2031	419	418	418
query30	292	145	147	145
query31	1032	825	800	800
query32	129	83	85	83
query33	833	330	349	330
query34	846	469	478	469
query35	1000	855	877	855
query36	1436	1290	1203	1203
query37	202	83	90	83
query38	3525	3352	3331	3331
query39	1372	1308	1329	1308
query40	308	91	94	91
query41	39	36	35	35
query42	115	100	108	100
query43	589	525	519	519
query44	1264	834	830	830
query45	226	199	206	199
query46	1091	679	716	679
query47	1798	1627	1683	1627
query48	359	302	284	284
query49	1229	312	304	304
query50	802	332	339	332
query51	5389	5451	5378	5378
query52	109	96	101	96
query53	241	167	166	166
query54	1428	690	654	654
query55	109	102	100	100
query56	273	258	252	252
query57	1078	955	958	955
query58	285	288	286	286
query59	3079	2880	2913	2880
query60	373	296	305	296
query61	154	148	125	125
query62	577	474	484	474
query63	191	176	168	168
query64	5888	1756	1746	1746
query65	3415	3364	3351	3351
query66	1358	396	399	396
query67	15416	15757	15441	15441
query68	12049	542	555	542
query69	587	315	332	315
query70	1698	1771	1579	1579
query71	595	298	284	284
query72	5297	3453	3434	3434
query73	2470	351	342	342
query74	6974	6514	6568	6514
query75	5069	2339	2295	2295
query76	6312	1068	1187	1068
query77	744	369	341	341
query78	9081	8631	8640	8631
query79	1058	540	521	521
query80	526	350	338	338
query81	460	216	218	216
query82	233	114	101	101
query83	170	140	139	139
query84	253	56	57	56
query85	917	278	268	268
query86	428	425	385	385
query87	3613	3454	3415	3415
query88	2925	2482	2491	2482
query89	359	309	290	290
query90	1867	227	227	227
query91	123	96	95	95
query92	97	80	67	67
query93	1026	449	427	427
query94	843	243	235	235
query95	562	507	495	495
query96	636	340	343	340
query97	4376	4176	4228	4176
query98	243	224	212	212
query99	1185	839	863	839
Total cold run time: 295549 ms
Total hot run time: 184171 ms

@doris-robot

(From new machine)TeamCity pipeline, clickbench performance test result:
the sum of best hot time: 47.32 seconds
stream load tsv: 572 seconds loaded 74807831229 Bytes, about 124 MB/s
stream load json: 20 seconds loaded 2358488459 Bytes, about 112 MB/s
stream load orc: 66 seconds loaded 1101869774 Bytes, about 15 MB/s
stream load parquet: 32 seconds loaded 861443392 Bytes, about 25 MB/s
insert into select: 26.8 seconds inserted 10000000 Rows, about 373K ops/s
storage size: 17183980523 Bytes

@dataroaring
Contributor

run buildall

@doris-robot

TPC-H: Total hot run time: 38676 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 995b21e3d010c748c2550bcb2d3cfb1225ccd099, data reload: false

------ Round 1 ----------------------------------
q1	17749	4926	4942	4926
q2	2027	153	142	142
q3	10587	1120	1143	1120
q4	10205	780	860	780
q5	7800	3026	2985	2985
q6	201	124	121	121
q7	924	518	497	497
q8	9263	2017	2014	2014
q9	6504	6427	6429	6427
q10	8275	3118	3061	3061
q11	436	211	213	211
q12	396	227	225	225
q13	17999	3385	3337	3337
q14	239	203	214	203
q15	547	516	499	499
q16	452	399	388	388
q17	968	760	607	607
q18	7322	6679	6697	6679
q19	1570	1522	1509	1509
q20	741	313	341	313
q21	2748	2453	2326	2326
q22	359	312	306	306
Total cold run time: 107312 ms
Total hot run time: 38676 ms

----- Round 2, with runtime_filter_mode=off -----
q1	4947	4940	4951	4940
q2	347	235	235	235
q3	3298	3299	3271	3271
q4	2125	2093	2093	2093
q5	6062	6122	5884	5884
q6	205	114	115	114
q7	2345	1924	1936	1924
q8	3405	3486	3482	3482
q9	8833	8833	8870	8833
q10	3795	3848	3841	3841
q11	544	444	447	444
q12	799	649	649	649
q13	4496	3169	3174	3169
q14	278	263	260	260
q15	548	500	509	500
q16	545	490	472	472
q17	1886	1870	1877	1870
q18	8686	8371	8227	8227
q19	1593	1632	1629	1629
q20	2211	1943	1943	1943
q21	5568	5240	5301	5240
q22	532	453	439	439
Total cold run time: 63048 ms
Total hot run time: 59459 ms

@doris-robot

TPC-DS: Total hot run time: 184357 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 995b21e3d010c748c2550bcb2d3cfb1225ccd099, data reload: false

run tpcds-sf100 query with default conf and session variables
query1	941	344	339	339
query2	6574	2012	1954	1954
query3	6720	230	220	220
query4	26040	22516	22522	22516
query5	6217	607	636	607
query6	349	210	208	208
query7	4635	306	307	306
query8	253	235	226	226
query9	8709	2888	2890	2888
query10	553	295	327	295
query11	15844	15658	15676	15658
query12	170	96	96	96
query13	1682	355	334	334
query14	12906	7515	7472	7472
query15	302	212	204	204
query16	6386	331	321	321
query17	1700	529	498	498
query18	1968	321	291	291
query19	252	158	158	158
query20	103	102	106	102
query21	187	98	98	98
query22	5118	5069	5125	5069
query23	32280	31256	31239	31239
query24	12022	2934	2910	2910
query25	591	354	354	354
query26	1759	156	167	156
query27	2987	312	325	312
query28	7000	2119	2128	2119
query29	2042	410	422	410
query30	289	143	145	143
query31	1041	829	808	808
query32	138	87	81	81
query33	848	334	340	334
query34	926	467	470	467
query35	1009	851	839	839
query36	1375	1307	1327	1307
query37	210	92	95	92
query38	3475	3378	3338	3338
query39	1352	1308	1294	1294
query40	303	85	87	85
query41	40	36	35	35
query42	110	104	101	101
query43	571	552	545	545
query44	1200	828	805	805
query45	213	207	205	205
query46	1082	715	719	715
query47	1800	1668	1680	1668
query48	357	287	273	273
query49	1220	317	305	305
query50	795	360	354	354
query51	5400	5251	5245	5245
query52	104	99	99	99
query53	235	167	163	163
query54	1478	643	692	643
query55	109	99	107	99
query56	288	267	262	262
query57	1088	993	966	966
query58	300	283	277	277
query59	3135	2857	2857	2857
query60	342	284	296	284
query61	150	127	136	127
query62	578	481	474	474
query63	190	180	178	178
query64	5909	1683	1696	1683
query65	3421	3319	3347	3319
query66	1308	388	367	367
query67	15725	15343	15374	15343
query68	12234	548	565	548
query69	601	335	299	299
query70	1848	1667	1592	1592
query71	605	295	294	294
query72	5254	3399	3402	3399
query73	2183	345	333	333
query74	7039	6525	6539	6525
query75	4966	2324	2251	2251
query76	6336	1152	1181	1152
query77	757	348	325	325
query78	9156	8862	8620	8620
query79	1033	532	528	528
query80	525	337	350	337
query81	463	210	211	210
query82	231	101	112	101
query83	163	144	142	142
query84	251	57	57	57
query85	911	267	259	259
query86	435	428	399	399
query87	3650	3432	3405	3405
query88	3126	2525	2538	2525
query89	365	275	292	275
query90	1846	262	251	251
query91	124	95	102	95
query92	88	78	79	78
query93	1024	510	434	434
query94	833	238	222	222
query95	575	510	488	488
query96	647	344	343	343
query97	4340	4316	4257	4257
query98	237	224	217	217
query99	1143	856	887	856
Total cold run time: 297393 ms
Total hot run time: 184357 ms

@doris-robot

(From new machine)TeamCity pipeline, clickbench performance test result:
the sum of best hot time: 45.97 seconds
stream load tsv: 570 seconds loaded 74807831229 Bytes, about 125 MB/s
stream load json: 19 seconds loaded 2358488459 Bytes, about 118 MB/s
stream load orc: 66 seconds loaded 1101869774 Bytes, about 15 MB/s
stream load parquet: 32 seconds loaded 861443392 Bytes, about 25 MB/s
insert into select: 27.1 seconds inserted 10000000 Rows, about 369K ops/s
storage size: 17183980541 Bytes

@doris-robot

TeamCity be ut coverage result:
Function Coverage: 36.66% (8635/23554)
Line Coverage: 28.68% (70164/244653)
Region Coverage: 27.63% (36282/131313)
Branch Coverage: 24.32% (18538/76228)
Coverage Report: http://coverage.selectdb-in.cc/coverage/995b21e3d010c748c2550bcb2d3cfb1225ccd099_995b21e3d010c748c2550bcb2d3cfb1225ccd099/report/index.html

Contributor

@dataroaring dataroaring left a comment

LGTM

@dataroaring dataroaring merged commit 4ebbb36 into apache:master Jan 8, 2024
HappenLee pushed a commit to HappenLee/incubator-doris that referenced this pull request Jan 12, 2024
morningman pushed a commit to morningman/doris that referenced this pull request Feb 21, 2024
yiguolei pushed a commit that referenced this pull request Feb 21, 2024
…#28983 #30703 #31169 (#31213)

* (feature)(cloud) Use dynamic allocator instead of static buffer pool for better elasticity. (#28983)

* [fix](outfile) Fix unable to export empty data (#30703)

Issue Number: close #30600
Fix the inability to export empty data to HDFS/S3. This behavior is inconsistent with version 1.2.7,
which can export empty data to HDFS/S3 and leaves exported files on S3/HDFS.

* [fix](file-writer) avoid empty file for segment writer (#31169)

---------

Co-authored-by: AlexYue <yj976240184@gmail.com>
Co-authored-by: zxealous <zhouchangyue@baidu.com>