@github-actions github-actions bot commented Jul 8, 2025

Cherry-picked from #52887

…52887)

### What problem does this PR solve?

A routine load job cannot transition from RUNNING to NEED_SCHEDULE.
When the partition count increases and the job is rescheduled, the
scheduler throws an exception, so the new partitions cannot be consumed:
```
2025-07-07 14:35:39,847 WARN (Routine load scheduler|41) [RoutineLoadScheduler.runAfterCatalogReady():59] Failed to process one round of RoutineLoadScheduler
org.apache.doris.common.DdlException: errCode = 2, detailMessage = Could not transform RUNNING to NEED_SCHEDULE
        at org.apache.doris.load.routineload.RoutineLoadJob.checkStateTransform(RoutineLoadJob.java:788) ~[doris-fe.jar:1.2-SNAPSHOT]
        at org.apache.doris.load.routineload.RoutineLoadJob.unprotectUpdateState(RoutineLoadJob.java:1366) ~[doris-fe.jar:1.2-SNAPSHOT]
        at org.apache.doris.load.routineload.RoutineLoadJob.update(RoutineLoadJob.java:1483) ~[doris-fe.jar:1.2-SNAPSHOT]
        at org.apache.doris.load.routineload.RoutineLoadManager.updateRoutineLoadJob(RoutineLoadManager.java:839) ~[doris-fe.jar:1.2-SNAPSHOT]
        at org.apache.doris.load.routineload.RoutineLoadScheduler.process(RoutineLoadScheduler.java:65) ~[doris-fe.jar:1.2-SNAPSHOT]
        at org.apache.doris.load.routineload.RoutineLoadScheduler.runAfterCatalogReady(RoutineLoadScheduler.java:57) ~[doris-fe.jar:1.2-SNAPSHOT]
        at org.apache.doris.common.util.MasterDaemon.runOneCycle(MasterDaemon.java:58) ~[doris-fe.jar:1.2-SNAPSHOT]
        at org.apache.doris.common.util.Daemon.run(Daemon.java:116) ~[doris-fe.jar:1.2-SNAPSHOT]
```

This restriction was introduced by #40728 and should be removed.
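
To illustrate the kind of check that produces the error above, here is a minimal sketch of a transition table in which RUNNING -> NEED_SCHEDULE is permitted. It is not the Doris implementation: the class `RoutineLoadStateSketch`, the `ALLOWED` table, the reduced set of states, and the use of `IllegalStateException` in place of `DdlException` are all assumptions made for illustration. Only the state names RUNNING and NEED_SCHEDULE and the error message text come from the log.

```java
import java.util.EnumMap;
import java.util.EnumSet;
import java.util.Map;
import java.util.Set;

/** Illustrative only: a simplified job state machine, not the Doris code. */
final class RoutineLoadStateSketch {

    /** RUNNING and NEED_SCHEDULE appear in the log; the other states are assumed. */
    enum JobState { NEED_SCHEDULE, RUNNING, PAUSED, STOPPED, CANCELLED }

    // Hypothetical transition table: each state maps to the states it may move to.
    private static final Map<JobState, Set<JobState>> ALLOWED = new EnumMap<>(JobState.class);

    static {
        ALLOWED.put(JobState.NEED_SCHEDULE,
                EnumSet.of(JobState.RUNNING, JobState.PAUSED, JobState.STOPPED, JobState.CANCELLED));
        // RUNNING -> NEED_SCHEDULE is allowed so that a running job can be
        // rescheduled when the partition count grows.
        ALLOWED.put(JobState.RUNNING,
                EnumSet.of(JobState.NEED_SCHEDULE, JobState.PAUSED, JobState.STOPPED, JobState.CANCELLED));
        ALLOWED.put(JobState.PAUSED,
                EnumSet.of(JobState.NEED_SCHEDULE, JobState.STOPPED, JobState.CANCELLED));
        // Terminal states (assumed) allow no further transitions.
        ALLOWED.put(JobState.STOPPED, EnumSet.noneOf(JobState.class));
        ALLOWED.put(JobState.CANCELLED, EnumSet.noneOf(JobState.class));
    }

    /** Returns true if the table permits moving from {@code from} to {@code to}. */
    static boolean canTransform(JobState from, JobState to) {
        return ALLOWED.get(from).contains(to);
    }

    /** Throws with a message shaped like the one in the log if the move is not permitted. */
    static void checkStateTransform(JobState from, JobState to) {
        if (!canTransform(from, to)) {
            throw new IllegalStateException("Could not transform " + from + " to " + to);
        }
    }
}
```

With a table like this, `checkStateTransform(JobState.RUNNING, JobState.NEED_SCHEDULE)` returns normally, while a move out of an assumed terminal state such as STOPPED still fails.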
@github-actions github-actions bot requested a review from morrySnow as a code owner July 8, 2025 01:56

Thearas commented Jul 8, 2025

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

@dataroaring dataroaring closed this Jul 8, 2025
@dataroaring dataroaring reopened this Jul 8, 2025

Thearas commented Jul 8, 2025

run buildall

@doris-robot

TPC-H: Total hot run time: 39388 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit ad7cd6ac25c2bda21e64342dabdcde773a75982c, data reload: false

------ Round 1 ----------------------------------
q1	17590	6776	6574	6574
q2	2058	194	168	168
q3	10517	1113	1072	1072
q4	10222	753	719	719
q5	7711	2818	2778	2778
q6	212	133	134	133
q7	977	622	608	608
q8	9363	1904	2023	1904
q9	6625	6367	6377	6367
q10	7017	2257	2232	2232
q11	474	265	261	261
q12	413	218	210	210
q13	17780	2985	2987	2985
q14	252	213	213	213
q15	512	466	470	466
q16	473	374	376	374
q17	974	533	639	533
q18	7165	6649	6607	6607
q19	1318	1028	957	957
q20	466	198	197	197
q21	3871	3144	3021	3021
q22	1087	1012	1009	1009
Total cold run time: 107077 ms
Total hot run time: 39388 ms

----- Round 2, with runtime_filter_mode=off -----
q1	6623	6566	6584	6566
q2	329	235	233	233
q3	2942	2876	2971	2876
q4	2106	1835	1898	1835
q5	5825	5816	5781	5781
q6	211	133	127	127
q7	2218	1852	1826	1826
q8	3404	3560	3503	3503
q9	9051	8898	8840	8840
q10	3554	3504	3565	3504
q11	596	501	492	492
q12	798	613	635	613
q13	9324	3168	3197	3168
q14	292	269	258	258
q15	503	465	473	465
q16	490	436	455	436
q17	1838	1617	1578	1578
q18	8132	7765	7819	7765
q19	1678	1626	1648	1626
q20	2139	1817	1822	1817
q21	5036	4971	5003	4971
q22	1137	1066	1045	1045
Total cold run time: 68226 ms
Total hot run time: 59325 ms

@doris-robot

TPC-DS: Total hot run time: 197417 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit ad7cd6ac25c2bda21e64342dabdcde773a75982c, data reload: false

query1	1307	922	886	886
query2	6306	1893	1887	1887
query3	10853	4275	4465	4275
query4	33424	23322	23592	23322
query5	3659	456	451	451
query6	310	177	178	177
query7	3997	320	322	320
query8	278	220	237	220
query9	9524	2575	2581	2575
query10	480	263	266	263
query11	17864	15221	15125	15125
query12	166	102	103	102
query13	1574	437	415	415
query14	8662	7544	7603	7544
query15	281	194	187	187
query16	8126	493	496	493
query17	1737	627	616	616
query18	2176	330	323	323
query19	376	161	164	161
query20	126	125	120	120
query21	218	110	113	110
query22	4928	4560	4672	4560
query23	34933	33992	34160	33992
query24	11131	3000	3002	3000
query25	626	403	396	396
query26	1219	166	164	164
query27	2655	365	361	361
query28	7185	2161	2182	2161
query29	856	466	445	445
query30	265	165	165	165
query31	1032	834	839	834
query32	87	53	56	53
query33	768	292	295	292
query34	961	530	510	510
query35	858	721	716	716
query36	1086	928	926	926
query37	138	63	68	63
query38	4113	3948	4009	3948
query39	1519	1477	1469	1469
query40	214	107	109	107
query41	47	48	48	48
query42	122	104	103	103
query43	522	485	486	485
query44	1324	818	814	814
query45	189	179	179	179
query46	1154	733	721	721
query47	2048	1955	2021	1955
query48	459	357	343	343
query49	979	386	405	386
query50	837	429	429	429
query51	7440	7352	7359	7352
query52	104	92	91	91
query53	256	193	184	184
query54	1195	483	485	483
query55	82	77	78	77
query56	258	247	250	247
query57	1315	1233	1223	1223
query58	225	225	211	211
query59	3227	3059	2932	2932
query60	283	264	284	264
query61	116	106	105	105
query62	865	684	688	684
query63	228	191	187	187
query64	4391	671	665	665
query65	3327	3269	3258	3258
query66	793	304	308	304
query67	15787	15600	15449	15449
query68	4496	590	575	575
query69	443	277	303	277
query70	1125	1102	1134	1102
query71	341	258	250	250
query72	6349	3977	4124	3977
query73	744	350	353	350
query74	10357	9263	8989	8989
query75	3395	2655	2655	2655
query76	2611	1105	1056	1056
query77	383	267	264	264
query78	10573	9757	9581	9581
query79	1380	611	607	607
query80	1083	428	425	425
query81	538	219	218	218
query82	954	88	88	88
query83	232	140	143	140
query84	236	73	83	73
query85	1296	310	292	292
query86	390	281	299	281
query87	4401	4229	4259	4229
query88	3504	2405	2363	2363
query89	414	297	295	295
query90	1859	186	202	186
query91	158	121	120	120
query92	64	56	51	51
query93	1389	547	551	547
query94	855	307	274	274
query95	376	274	271	271
query96	615	288	280	280
query97	3373	3186	3257	3186
query98	220	212	196	196
query99	1534	1306	1310	1306
Total cold run time: 299279 ms
Total hot run time: 197417 ms

@doris-robot

ClickBench: Total hot run time: 30.18 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit ad7cd6ac25c2bda21e64342dabdcde773a75982c, data reload: false

query1	0.04	0.03	0.02
query2	0.07	0.03	0.03
query3	0.23	0.07	0.07
query4	1.62	0.11	0.10
query5	0.52	0.50	0.52
query6	1.13	0.73	0.72
query7	0.04	0.01	0.02
query8	0.04	0.03	0.02
query9	0.58	0.52	0.50
query10	0.57	0.54	0.56
query11	0.14	0.10	0.11
query12	0.14	0.12	0.11
query13	0.61	0.60	0.60
query14	0.77	0.79	0.79
query15	0.84	0.82	0.82
query16	0.39	0.40	0.38
query17	1.00	1.02	1.05
query18	0.23	0.22	0.21
query19	1.96	1.87	1.85
query20	0.01	0.01	0.01
query21	15.39	0.58	0.57
query22	2.40	2.80	1.46
query23	16.90	1.00	0.85
query24	3.13	1.78	1.46
query25	0.20	0.26	0.06
query26	0.58	0.13	0.13
query27	0.06	0.03	0.05
query28	9.34	0.51	0.44
query29	12.56	3.18	3.16
query30	0.25	0.05	0.05
query31	2.86	0.39	0.37
query32	3.25	0.46	0.45
query33	2.91	3.02	3.01
query34	16.82	4.52	4.46
query35	4.53	4.51	4.50
query36	0.68	0.47	0.48
query37	0.09	0.06	0.06
query38	0.04	0.03	0.04
query39	0.03	0.03	0.02
query40	0.17	0.13	0.13
query41	0.08	0.02	0.02
query42	0.04	0.02	0.02
query43	0.04	0.03	0.03
Total cold run time: 103.28 s
Total hot run time: 30.18 s

@morrySnow morrySnow merged commit 04b4ad3 into branch-3.1 Jul 8, 2025
21 of 22 checks passed
@github-actions github-actions bot deleted the auto-pick-52887-branch-3.1 branch July 8, 2025 04:10