@github-actions
Contributor

Cherry-picked from #57092

Currently, when a transaction fails, the routine load task retries as soon
as possible. When it hits a temporarily unrecoverable error such as
`too_many_version`, it retries many times within a short period and puts
heavy pressure on upstream systems such as Kafka.

To solve this problem, we delay scheduling of the load task when a
transaction fails, which reduces the number of retries while the error
persists, and restore the normal schedule once transactions resume normal
execution.
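
The delay-then-restore behavior described above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: the class `TaskSchedule`, its fields, and the doubling-with-cap policy are all assumptions made for the example, since the description does not say which delay policy is used.

```python
from dataclasses import dataclass


@dataclass
class TaskSchedule:
    """Hypothetical per-task schedule state; not a Doris class."""

    base_interval_s: float = 1.0  # assumed normal scheduling interval
    max_delay_s: float = 60.0     # assumed upper bound on the delay
    consecutive_failures: int = 0

    def on_txn_result(self, success: bool) -> None:
        # A successful transaction restores the normal schedule;
        # a failed one pushes the next attempt further out.
        if success:
            self.consecutive_failures = 0
        else:
            self.consecutive_failures += 1

    def next_delay_s(self) -> float:
        # Doubling per consecutive failure is an illustrative policy only.
        delay = self.base_interval_s * (2 ** self.consecutive_failures)
        return min(delay, self.max_delay_s)
```

With a policy like this, a task that keeps hitting `too_many_version` is rescheduled less and less often up to the cap, and a single successful transaction returns it to the base interval.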
@github-actions github-actions bot requested a review from morrySnow as a code owner October 19, 2025 08:53
@hello-stephen
Contributor

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

@dataroaring dataroaring reopened this Oct 19, 2025
@hello-stephen
Contributor

run buildall

@doris-robot

TPC-H: Total hot run time: 35083 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 35986939e0acea8e3a756a7fb21465bed3a41ac7, data reload: false

------ Round 1 ----------------------------------
q1	17610	5543	5500	5500
q2	2062	418	341	341
q3	11947	1315	804	804
q4	10280	913	534	534
q5	8906	2494	2219	2219
q6	200	177	140	140
q7	936	797	652	652
q8	9380	1538	1321	1321
q9	5359	5090	4977	4977
q10	6820	2328	1865	1865
q11	518	321	298	298
q12	383	394	242	242
q13	17809	3744	3095	3095
q14	254	247	241	241
q15	552	510	499	499
q16	499	493	431	431
q17	736	930	472	472
q18	7321	6979	6914	6914
q19	1298	1033	622	622
q20	403	394	246	246
q21	3683	2865	2593	2593
q22	1181	1142	1077	1077
Total cold run time: 108137 ms
Total hot run time: 35083 ms

----- Round 2, with runtime_filter_mode=off -----
q1	5588	5537	5576	5537
q2	267	351	257	257
q3	2458	2820	2400	2400
q4	1445	1873	1488	1488
q5	4784	5211	5116	5116
q6	179	173	140	140
q7	2173	2015	1888	1888
q8	2731	2912	2818	2818
q9	7459	7419	7308	7308
q10	3090	3370	2906	2906
q11	608	545	533	533
q12	715	801	661	661
q13	3558	3855	3319	3319
q14	321	315	290	290
q15	532	483	486	483
q16	487	521	465	465
q17	1320	1782	1342	1342
q18	7788	7748	7510	7510
q19	936	1281	1132	1132
q20	2098	2145	1997	1997
q21	5720	5192	4875	4875
q22	1177	1191	1139	1139
Total cold run time: 55434 ms
Total hot run time: 53604 ms

@doris-robot

ClickBench: Total hot run time: 30.2 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit 35986939e0acea8e3a756a7fb21465bed3a41ac7, data reload: false

query1	0.04	0.04	0.04
query2	0.07	0.04	0.04
query3	0.24	0.07	0.07
query4	1.60	0.11	0.11
query5	0.53	0.52	0.55
query6	1.13	0.76	0.75
query7	0.02	0.02	0.02
query8	0.06	0.04	0.04
query9	0.62	0.56	0.55
query10	0.57	0.60	0.59
query11	0.16	0.12	0.11
query12	0.16	0.13	0.12
query13	0.64	0.62	0.60
query14	0.82	0.85	0.81
query15	0.90	0.86	0.87
query16	0.39	0.42	0.45
query17	1.09	1.08	1.06
query18	0.26	0.25	0.26
query19	2.00	1.90	2.16
query20	0.01	0.02	0.01
query21	15.36	1.16	0.68
query22	0.76	0.85	0.73
query23	14.91	1.62	0.59
query24	3.05	1.30	0.43
query25	0.23	0.15	0.07
query26	0.23	0.17	0.16
query27	0.05	0.05	0.05
query28	12.66	1.18	0.49
query29	12.60	4.79	3.92
query30	0.26	0.10	0.08
query31	2.82	0.66	0.43
query32	3.23	0.57	0.47
query33	3.12	3.12	3.11
query34	16.62	5.27	4.65
query35	4.69	4.58	4.64
query36	0.69	0.55	0.53
query37	0.09	0.06	0.06
query38	0.05	0.05	0.04
query39	0.04	0.03	0.03
query40	0.17	0.14	0.13
query41	0.09	0.03	0.03
query42	0.04	0.03	0.03
query43	0.05	0.04	0.04
Total cold run time: 103.12 s
Total hot run time: 30.2 s

@hello-stephen
Contributor

FE UT Coverage Report

Increment line coverage 44.44% (4/9) 🎉
Increment coverage report
Complete coverage report

@morrySnow morrySnow merged commit 60e1959 into branch-3.1 Oct 28, 2025
23 checks passed
@morrySnow morrySnow deleted the auto-pick-57092-branch-3.1 branch October 28, 2025 06:55

6 participants