@morningman
Contributor

The inputFormat.isSplitable() method creates a FileSystem.
Each FileSystem registers many Hadoop metrics, which consume a lot of memory.
This PR simplifies the splittability check so that inputFormat.isSplitable() is no longer called.

Only for branch-2.0
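
For readers who want a picture of the idea, here is a minimal sketch of deciding splittability from the input format's class name alone, so that no FileSystem is ever created. It is not the code from this PR: the class `SplittableCheck`, the method `isSplittable`, and the contents of the allow-list are assumptions made for illustration only.

```java
import java.util.Set;

// Hypothetical helper, not taken from the PR.
class SplittableCheck {
    // Illustrative allow-list (an assumption). The names are real Hadoop/Hive
    // classes, but the set actually used by the PR is not shown in this thread.
    private static final Set<String> SPLITTABLE_INPUT_FORMATS = Set.of(
            "org.apache.hadoop.mapred.TextInputFormat",
            "org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat",
            "org.apache.hadoop.hive.ql.io.orc.OrcInputFormat");

    // Pure string lookup: nothing is instantiated, so no FileSystem
    // (and none of its metrics) is created as a side effect.
    static boolean isSplittable(String inputFormatClassName) {
        return inputFormatClassName != null
                && SPLITTABLE_INPUT_FORMATS.contains(inputFormatClassName);
    }

    public static void main(String[] args) {
        System.out.println(isSplittable("org.apache.hadoop.mapred.TextInputFormat"));
        System.out.println(isSplittable("com.example.CustomInputFormat"));
    }
}
```

The trade-off in a sketch like this is that unknown formats are treated as not splittable, which is conservative but may reduce scan parallelism for formats missing from the list.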

@doris-robot

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR

Since 2024-03-18, the Document has been moved to doris-website.
See Doris Document.

@morningman
Contributor Author

run buildall

@doris-robot

TPC-H: Total hot run time: 49759 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 3ac0d9b0e8442e0656e0d77c8751b9d97439b874, data reload: false

------ Round 1 ----------------------------------
query	cold (ms)	hot run 1 (ms)	hot run 2 (ms)	best hot (ms)
q1	17871	4388	4296	4296
q2	2036	145	145	145
q3	10466	1895	1934	1895
q4	10323	1227	1300	1227
q5	8359	3941	3970	3941
q6	232	120	121	120
q7	2045	1574	1580	1574
q8	9272	2688	2701	2688
q9	10682	10509	10481	10481
q10	8641	3462	3466	3462
q11	423	244	238	238
q12	457	294	300	294
q13	18306	3947	4028	3947
q14	360	318	326	318
q15	505	467	453	453
q16	720	596	587	587
q17	1114	954	960	954
q18	7268	6833	6842	6833
q19	1692	1562	1552	1552
q20	511	297	321	297
q21	4437	4118	4054	4054
q22	486	407	403	403
Total cold run time: 116206 ms
Total hot run time: 49759 ms

----- Round 2, with runtime_filter_mode=off -----
query	cold (ms)	hot run 1 (ms)	hot run 2 (ms)	best hot (ms)
q1	4316	4310	4282	4282
q2	315	221	221	221
q3	4172	4201	4137	4137
q4	2751	2726	2750	2726
q5	7277	7197	7206	7197
q6	231	117	119	117
q7	3243	2864	2818	2818
q8	4325	4427	4451	4427
q9	17076	16999	16962	16962
q10	4231	4240	4255	4240
q11	778	676	665	665
q12	1024	858	826	826
q13	7434	3727	3747	3727
q14	453	447	430	430
q15	513	460	457	457
q16	764	714	703	703
q17	3817	3809	3815	3809
q18	8926	8796	8929	8796
q19	1690	1702	1632	1632
q20	2409	2126	2135	2126
q21	8487	8508	8531	8508
q22	1046	1030	986	986
Total cold run time: 85278 ms
Total hot run time: 79792 ms

@doris-robot

TPC-DS: Total hot run time: 201735 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 3ac0d9b0e8442e0656e0d77c8751b9d97439b874, data reload: false

query	cold (ms)	hot run 1 (ms)	hot run 2 (ms)	best hot (ms)
query1	946	391	377	377
query2	6517	2199	1948	1948
query3	7028	213	208	208
query4	22548	18599	18684	18599
query5	21763	6600	6578	6578
query6	412	221	238	221
query7	5150	289	300	289
query8	263	231	224	224
query9	3142	2673	2631	2631
query10	431	285	299	285
query11	11372	10616	10800	10616
query12	122	71	71	71
query13	5574	643	638	638
query14	19603	13279	13492	13279
query15	365	230	235	230
query16	6455	271	255	255
query17	1320	1575	885	885
query18	2259	408	403	403
query19	200	142	140	140
query20	79	74	80	74
query21	192	90	92	90
query22	5120	5099	5111	5099
query23	32485	31800	31798	31798
query24	6826	6519	6474	6474
query25	507	410	423	410
query26	516	165	153	153
query27	1746	295	288	288
query28	6128	2249	2221	2221
query29	2906	2735	2695	2695
query30	237	163	160	160
query31	921	740	727	727
query32	69	59	60	59
query33	385	262	248	248
query34	855	455	463	455
query35	1114	920	948	920
query36	1213	1358	1343	1343
query37	88	59	62	59
query38	3085	2886	2944	2886
query39	1361	1318	1312	1312
query40	197	90	95	90
query41	35	32	33	32
query42	91	96	86	86
query43	564	654	655	654
query44	1129	714	720	714
query45	240	229	228	228
query46	1237	970	966	966
query47	1815	1683	1624	1624
query48	990	662	658	658
query49	623	360	371	360
query50	881	584	581	581
query51	4785	4655	4627	4627
query52	75	86	74	74
query53	449	314	311	311
query54	2658	2464	2430	2430
query55	92	75	83	75
query56	215	205	206	205
query57	1215	1166	1141	1141
query58	209	192	207	192
query59	3515	3129	3214	3129
query60	217	184	202	184
query61	89	86	90	86
query62	821	462	494	462
query63	466	329	331	329
query64	2482	1502	1362	1362
query65	3613	3605	3526	3526
query66	748	366	373	366
query67	17941	15311	15026	15026
query68	8493	647	677	647
query69	579	345	352	345
query70	1612	1448	1532	1448
query71	415	311	314	311
query72	6583	3461	3438	3438
query73	745	319	330	319
query74	6408	5852	5923	5852
query75	4678	3657	3684	3657
query76	4768	1189	1214	1189
query77	666	248	251	248
query78	12463	11765	11897	11765
query79	7551	634	644	634
query80	884	401	386	386
query81	494	233	233	233
query82	1545	98	93	93
query83	170	136	135	135
query84	246	70	68	68
query85	847	282	285	282
query86	313	314	285	285
query87	3238	3023	3055	3023
query88	4468	2385	2380	2380
query89	390	284	278	278
query90	1924	195	210	195
query91	157	115	117	115
query92	58	49	54	49
query93	6297	541	576	541
query94	752	203	201	201
query95	1120	1058	1067	1058
query96	658	335	330	330
query97	6360	6415	6383	6383
query98	188	172	174	172
query99	2829	938	809	809
Total cold run time: 316594 ms
Total hot run time: 201735 ms

@doris-robot

ClickBench: Total hot run time: 30.85 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit 3ac0d9b0e8442e0656e0d77c8751b9d97439b874, data reload: false

query	cold (s)	hot run 1 (s)	hot run 2 (s)
query1	0.02	0.02	0.02
query2	0.06	0.02	0.02
query3	0.25	0.05	0.04
query4	1.80	0.07	0.06
query5	0.53	0.53	0.52
query6	1.23	0.61	0.61
query7	0.01	0.01	0.01
query8	0.03	0.02	0.02
query9	0.52	0.50	0.48
query10	0.54	0.53	0.54
query11	0.12	0.09	0.08
query12	0.13	0.10	0.10
query13	0.63	0.62	0.61
query14	0.79	0.77	0.78
query15	0.79	0.76	0.75
query16	0.37	0.36	0.36
query17	1.00	1.00	1.00
query18	0.24	0.24	0.25
query19	1.93	1.84	1.84
query20	0.02	0.01	0.01
query21	15.47	0.59	0.56
query22	2.23	2.14	1.62
query23	17.28	0.93	0.98
query24	4.86	1.23	1.28
query25	0.34	0.11	0.05
query26	0.62	0.15	0.16
query27	0.04	0.03	0.05
query28	7.76	0.71	0.71
query29	12.60	2.24	2.24
query30	0.63	0.53	0.49
query31	2.81	0.39	0.38
query32	3.39	0.50	0.49
query33	3.11	3.05	3.07
query34	15.25	4.79	4.78
query35	4.85	4.82	4.83
query36	1.06	1.01	1.01
query37	0.06	0.05	0.04
query38	0.03	0.02	0.02
query39	0.03	0.02	0.01
query40	0.16	0.14	0.15
query41	0.06	0.01	0.01
query42	0.02	0.01	0.01
query43	0.02	0.01	0.02
Total cold run time: 103.69 s
Total hot run time: 30.85 s

@doris-robot

Load test result on machine: 'aliyun_ecs.c7a.8xlarge_32C64G'

Load test result on commit 3ac0d9b0e8442e0656e0d77c8751b9d97439b874 with default session variables
Stream load json:         20 seconds loaded 2358488459 Bytes, about 112 MB/s
Stream load orc:          59 seconds loaded 1101869774 Bytes, about 17 MB/s
Stream load parquet:      31 seconds loaded 861443392 Bytes, about 26 MB/s
Insert into select:       20.5 seconds inserted 10000000 Rows, about 487K ops/s

@morningman merged commit 0968395 into apache:branch-2.0 on Apr 4, 2024
morningman added a commit that referenced this pull request May 18, 2024
…ble (#35029)

Introduced by #33242
When checking for a supported input format in a Set<String>, the lookup must use a string, not an object
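
The kind of bug described here is easy to miss because `Set.contains` accepts any `Object`, so passing the wrong type still compiles. The following is a minimal sketch of that pitfall, not the code from #35029: `FakeTextInputFormat`, `lookupByObject`, and `lookupByName` are hypothetical names, and using the class name as the string key is an assumption.

```java
import java.util.Set;

// Hypothetical demonstration, not taken from #35029.
class SetLookupPitfall {
    // Stand-in for a Hadoop InputFormat implementation (illustration only).
    static class FakeTextInputFormat {}

    // Supported formats keyed by class name (assumed to be the string in question).
    static final Set<String> SUPPORTED = Set.of(FakeTextInputFormat.class.getName());

    // Buggy lookup: compiles because contains() takes Object, but an
    // object that is not a String never equals any String element.
    static boolean lookupByObject(Object inputFormat) {
        return SUPPORTED.contains(inputFormat);
    }

    // Corrected lookup: compare the class-name string against the set.
    static boolean lookupByName(Object inputFormat) {
        return SUPPORTED.contains(inputFormat.getClass().getName());
    }

    public static void main(String[] args) {
        Object format = new FakeTextInputFormat();
        System.out.println("by object: " + lookupByObject(format));
        System.out.println("by name:   " + lookupByName(format));
    }
}
```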
mongo360 pushed a commit to mongo360/doris that referenced this pull request Aug 16, 2024
…metrics (apache#33242)
mongo360 pushed a commit to mongo360/doris that referenced this pull request Aug 16, 2024
…ble (apache#35029)