@wuwenchi
Contributor

@wuwenchi wuwenchi commented Apr 16, 2025

What problem does this PR solve?

Follow-up to #44038

Problem Summary:

We need to set the target size for all splits so that we can calculate the proportion of each split later.
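The change itself is not shown in this conversation, so the following is only an illustrative sketch of the idea in the summary: record one target size on every split, then derive each split's proportion from it afterwards. The names `Split`, `set_target_size`, and `proportion` are hypothetical and are not Doris APIs, and defining the proportion as split length divided by target size is an assumption made for this example.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Split:
    """Hypothetical stand-in for a file split; not a Doris class."""
    path: str
    length: int                        # bytes covered by this split
    target_size: Optional[int] = None  # unset until planning assigns it


def set_target_size(splits: List[Split], target_size: int) -> None:
    """Record the same target size on every split, not just some of them."""
    if target_size <= 0:
        raise ValueError("target_size must be positive")
    for split in splits:
        split.target_size = target_size


def proportion(split: Split) -> float:
    """Assumed definition: the split's length relative to its target size."""
    if split.target_size is None:
        raise ValueError(f"target size was never set for {split.path}")
    return split.length / split.target_size


splits = [Split("part-0", 64), Split("part-1", 32)]
set_target_size(splits, 128)
print([proportion(s) for s in splits])
```

In this sketch, a split whose target size was never set raises an error instead of silently producing a wrong proportion, which is why the value has to be set on all splits.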

Release note

None

Check List (For Author)

  • Test

    • Regression test
    • Unit Test
    • Manual test (add detailed scripts or steps below)
    • No need to test or manual test. Explain why:
      • This is a refactor/code format and no logic has been changed.
      • Previous test can cover this change.
      • No code files have been changed.
      • Other reason
  • Behavior changed:

    • No.
    • Yes.
  • Does this need documentation?

    • No.
    • Yes.

Check List (For Reviewer who merge this PR)

  • Confirm the release note
  • Confirm test cases
  • Confirm document
  • Add branch pick label

@hello-stephen
Contributor

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

@wuwenchi wuwenchi marked this pull request as ready for review April 17, 2025 12:35
@wuwenchi
Contributor Author

run buildall

@wuwenchi
Contributor Author

run buildall

@doris-robot

TPC-H: Total hot run time: 34038 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 5b3af459be84d915a646649c2a7d637ee9cac8ee, data reload: false

------ Round 1 ----------------------------------
q1	26218	5067	5058	5058
q2	2074	272	194	194
q3	10484	1234	714	714
q4	10242	1023	547	547
q5	7669	2421	2326	2326
q6	182	160	131	131
q7	923	730	601	601
q8	9310	1292	1130	1130
q9	6797	5068	5146	5068
q10	6828	2313	1885	1885
q11	482	275	262	262
q12	359	357	214	214
q13	17795	3692	3118	3118
q14	230	232	215	215
q15	536	482	479	479
q16	441	443	408	408
q17	623	885	363	363
q18	7602	7125	7195	7125
q19	1659	982	550	550
q20	327	323	212	212
q21	4192	2613	2470	2470
q22	1102	1061	968	968
Total cold run time: 116075 ms
Total hot run time: 34038 ms

----- Round 2, with runtime_filter_mode=off -----
q1	5213	5082	5086	5082
q2	243	329	238	238
q3	2219	2682	2310	2310
q4	1468	1836	1455	1455
q5	4498	4506	4434	4434
q6	218	171	128	128
q7	2023	1938	1751	1751
q8	2685	2604	2558	2558
q9	7283	7224	6994	6994
q10	3034	3164	2743	2743
q11	586	509	508	508
q12	685	759	613	613
q13	3588	3888	3350	3350
q14	285	316	285	285
q15	526	480	474	474
q16	448	500	456	456
q17	1195	1618	1359	1359
q18	7757	7603	7671	7603
q19	829	857	916	857
q20	2004	1953	1825	1825
q21	5303	4982	4856	4856
q22	1154	1089	1021	1021
Total cold run time: 53244 ms
Total hot run time: 50900 ms

@doris-robot

TPC-DS: Total hot run time: 192001 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 5b3af459be84d915a646649c2a7d637ee9cac8ee, data reload: false

query1	1409	1107	1059	1059
query2	6305	1804	1785	1785
query3	11157	4700	4531	4531
query4	25691	23411	23094	23094
query5	3959	620	459	459
query6	322	210	205	205
query7	3991	495	274	274
query8	300	249	268	249
query9	8519	2501	2507	2501
query10	475	315	264	264
query11	15229	14925	14746	14746
query12	147	108	107	107
query13	1558	521	389	389
query14	8875	6164	6448	6164
query15	203	186	166	166
query16	7322	675	477	477
query17	1159	709	564	564
query18	1963	412	300	300
query19	192	183	161	161
query20	165	117	117	117
query21	210	123	115	115
query22	4499	4625	4327	4327
query23	34265	33566	33483	33483
query24	8533	2438	2411	2411
query25	516	481	405	405
query26	1316	271	151	151
query27	2758	515	325	325
query28	4810	2127	2116	2116
query29	733	583	438	438
query30	277	215	196	196
query31	900	891	809	809
query32	76	65	64	64
query33	539	366	350	350
query34	828	892	520	520
query35	827	844	748	748
query36	985	1043	931	931
query37	115	107	79	79
query38	4257	4200	4167	4167
query39	1493	1441	1459	1441
query40	211	118	102	102
query41	57	54	50	50
query42	123	109	114	109
query43	502	509	483	483
query44	1325	830	823	823
query45	185	177	174	174
query46	849	1030	647	647
query47	1775	1869	1802	1802
query48	400	418	301	301
query49	758	500	440	440
query50	697	701	402	402
query51	4202	4250	4319	4250
query52	116	109	98	98
query53	230	259	192	192
query54	584	584	506	506
query55	82	84	86	84
query56	310	292	284	284
query57	1187	1177	1099	1099
query58	268	266	256	256
query59	2723	2757	2616	2616
query60	350	351	324	324
query61	161	154	162	154
query62	808	786	712	712
query63	236	194	192	192
query64	4181	1160	782	782
query65	4407	4339	4360	4339
query66	1101	407	302	302
query67	16094	15623	15556	15556
query68	9582	833	514	514
query69	463	294	264	264
query70	1227	1165	1093	1093
query71	466	316	297	297
query72	5410	4638	4636	4636
query73	707	577	348	348
query74	9227	8899	8600	8600
query75	4347	3344	2685	2685
query76	3745	1179	753	753
query77	952	372	287	287
query78	10009	10179	9397	9397
query79	1843	806	554	554
query80	698	507	436	436
query81	470	256	218	218
query82	422	120	98	98
query83	277	246	223	223
query84	284	110	77	77
query85	779	415	310	310
query86	325	311	281	281
query87	4358	4383	4327	4327
query88	3199	2214	2174	2174
query89	405	311	275	275
query90	1852	213	208	208
query91	139	151	123	123
query92	70	60	55	55
query93	1296	966	570	570
query94	648	407	303	303
query95	378	292	289	289
query96	489	567	268	268
query97	3134	3191	3115	3115
query98	219	205	202	202
query99	1465	1415	1254	1254
Total cold run time: 279685 ms
Total hot run time: 192001 ms

@doris-robot

ClickBench: Total hot run time: 29.58 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit 5b3af459be84d915a646649c2a7d637ee9cac8ee, data reload: false

query1	0.03	0.04	0.03
query2	0.11	0.10	0.11
query3	0.25	0.19	0.20
query4	1.59	0.19	0.20
query5	0.59	0.57	0.57
query6	1.23	0.72	0.72
query7	0.02	0.02	0.02
query8	0.04	0.04	0.04
query9	0.59	0.51	0.52
query10	0.57	0.56	0.56
query11	0.15	0.10	0.11
query12	0.15	0.11	0.11
query13	0.62	0.59	0.60
query14	1.16	1.17	1.15
query15	0.89	0.85	0.84
query16	0.39	0.37	0.36
query17	1.06	1.06	1.04
query18	0.22	0.20	0.20
query19	1.87	1.76	1.72
query20	0.02	0.02	0.01
query21	15.39	0.90	0.55
query22	0.75	1.20	0.67
query23	14.95	1.43	0.63
query24	7.19	0.98	0.85
query25	0.48	0.18	0.15
query26	0.61	0.16	0.14
query27	0.05	0.04	0.05
query28	9.33	0.87	0.44
query29	12.54	3.98	3.29
query30	0.24	0.09	0.06
query31	2.81	0.60	0.38
query32	3.22	0.55	0.46
query33	3.01	3.02	3.06
query34	15.76	5.10	4.51
query35	4.55	4.56	4.54
query36	0.68	0.49	0.48
query37	0.08	0.06	0.06
query38	0.05	0.04	0.04
query39	0.03	0.02	0.03
query40	0.16	0.14	0.13
query41	0.07	0.02	0.02
query42	0.03	0.02	0.02
query43	0.03	0.03	0.03
Total cold run time: 103.56 s
Total hot run time: 29.58 s

Contributor

@morningman morningman left a comment

LGTM

@github-actions github-actions bot added the approved Indicates a PR has been approved by one committer. label Apr 21, 2025
@github-actions
Contributor

PR approved by at least one committer and no changes requested.

@github-actions
Contributor

PR approved by anyone and no changes requested.

@morningman morningman merged commit a36a572 into apache:master Apr 23, 2025
30 checks passed
dataroaring pushed a commit that referenced this pull request Apr 29, 2025
wuwenchi added a commit to wuwenchi/doris that referenced this pull request May 6, 2025
yiguolei pushed a commit that referenced this pull request May 7, 2025
@yiguolei yiguolei mentioned this pull request May 13, 2025
koarz pushed a commit to koarz/doris that referenced this pull request Jun 4, 2025

Labels

approved (Indicates a PR has been approved by one committer.), dev/2.1.10-merged, dev/3.0.6-merged, p0_b, reviewed

7 participants