Conversation

@XLPE
Contributor

@XLPE XLPE commented Jun 4, 2025

What problem does this PR solve?

Issue Number: close #51491

Problem Summary:
When the queue of the FragmentMgrAsync thread pool is full, newly submitted tasks are rejected and the submitting function returns early. However, tasks that were submitted earlier may still be scheduled for execution later. Returning early can therefore destroy objects that those tasks still reference, such as PipelineFragmentContext and TPipelineFragmentParams, which results in null-pointer dereferences during task execution and ultimately a core dump.

The fix in this PR is to wait until all previously submitted tasks have completed before returning.

```
*** SIGSEGV address not mapped to object (@0x1c8) received by PID 3941201 (TID 2115617 OR 0xfe1685bb97f0) from PID 456; stack trace: ***
 0# doris::signal::(anonymous namespace)::FailureSignalHandler(int, siginfo_t*, void*) at /home/jenkins_agent/workspace/BigDataComponent_doris-unified-arm-release/be/src/common/signal_handler.h:421
 1# os::Linux::chained_handler(int, siginfo_t*, void*) in /usr/jdk64/current/jre/lib/aarch64/server/libjvm.so
 2# JVM_handle_linux_signal in /usr/jdk64/current/jre/lib/aarch64/server/libjvm.so
 3# signalHandler(int, siginfo_t*, void*) in /usr/jdk64/current/jre/lib/aarch64/server/libjvm.so
 4# 0x0000FFFF6B2A07C0 in linux-vdso.so.1
 5# doris::TUniqueId::TUniqueId(doris::TUniqueId const&) at /home/jenkins_agent/workspace/BigDataComponent_doris-unified-arm-release/gensrc/build/gen_cpp/Types_types.cpp:2354
 6# doris::AttachTask::AttachTask(doris::QueryContext*) at /home/jenkins_agent/workspace/BigDataComponent_doris-unified-arm-release/be/src/runtime/thread_context.cpp:60
 7# std::_Function_handler<void (), doris::pipeline::PipelineXFragmentContext::_build_pipeline_x_tasks(doris::TPipelineFragmentParams const&, doris::ThreadPool*)::$_0>::_M_invoke(std::_Any_data const&) at /usr/lib/gcc/aarch64-linux-gnu/13/../../../../include/c++/13/bits/std_function.h:290
 8# doris::ThreadPool::dispatch_thread() at /home/jenkins_agent/workspace/BigDataComponent_doris-unified-arm-release/be/src/util/threadpool.cpp:552
 9# doris::Thread::supervise_thread(void*) at /home/jenkins_agent/workspace/BigDataComponent_doris-unified-arm-release/be/src/util/thread.cpp:499
10# 0x0000FFFF6AF187AC in /lib64/libpthread.so.0
11# 0x0000FFFF6B16548C in /lib64/libc.so.6
```
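
For readers unfamiliar with the failure mode, here is a minimal, self-contained sketch of the "wait for already-submitted tasks before returning" policy described in the problem summary. It is not the code from this PR: `TinyPool`, `FragmentState`, and `prepare_all` are hypothetical names invented for illustration, and the pool is a simplified stand-in (one thread per accepted task, rejecting submissions past a fixed capacity) rather than Doris's `ThreadPool`.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <functional>
#include <mutex>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical pool: runs each accepted task on its own thread and rejects
// further submissions once `capacity` tasks have been accepted.
class TinyPool {
public:
    explicit TinyPool(std::size_t capacity) : _capacity(capacity) {}
    ~TinyPool() {
        for (auto& t : _threads) {
            if (t.joinable()) t.join();
        }
    }

    bool try_submit(std::function<void()> task) {
        if (_threads.size() >= _capacity) return false; // simulated "queue full"
        _threads.emplace_back(std::move(task));
        return true;
    }

private:
    std::size_t _capacity;
    std::vector<std::thread> _threads;
};

// Hypothetical per-fragment state that the submitted tasks reference.
struct FragmentState {
    std::mutex mu;
    std::condition_variable cv;
    std::size_t finished = 0;  // tasks that have completed
    std::size_t work_done = 0; // stands in for the real prepare work
};

// Submits up to `n` tasks and returns false if a submission was rejected.
// In both cases it blocks until every accepted task has finished, so `state`
// is never destroyed while a task may still touch it.
bool prepare_all(TinyPool& pool, std::size_t n, std::size_t* accepted_out) {
    FragmentState state; // lives only as long as this call
    std::size_t submitted = 0;
    bool ok = true;
    for (std::size_t i = 0; i < n; ++i) {
        bool accepted = pool.try_submit([&state] {
            std::lock_guard<std::mutex> lock(state.mu);
            ++state.work_done;
            ++state.finished;
            state.cv.notify_all(); // notify while holding the lock
        });
        if (!accepted) {
            ok = false; // stop submitting, but do NOT return yet
            break;
        }
        ++submitted;
    }
    // The key step: wait for the tasks that were accepted before the failure.
    std::unique_lock<std::mutex> lock(state.mu);
    state.cv.wait(lock, [&] { return state.finished == submitted; });
    assert(state.work_done == submitted);
    *accepted_out = submitted;
    return ok;
}
```

Calling `prepare_all` with a pool of capacity 2 and `n = 5` reports failure, yet only after the two accepted tasks have run. Returning at the point of rejection instead would leave those two tasks holding a reference to a destroyed `state`, which is the shape of the crash in the stack trace above.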

Release note

None

Check List (For Author)

  • Test

    • Regression test
    • Unit Test
    • Manual test (add detailed scripts or steps below)
    • No need to test or manual test. Explain why:
      • This is a refactor/code format and no logic has been changed.
      • Previous test can cover this change.
      • No code files have been changed.
      • Other reason
  • Behavior changed:

    • No.
    • Yes.
  • Does this need documentation?

    • No.
    • Yes.

Check List (For Reviewer who merge this PR)

  • Confirm the release note
  • Confirm test cases
  • Confirm document
  • Add branch pick label

@Thearas
Contributor

Thearas commented Jun 4, 2025

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

@XLPE
Contributor Author

XLPE commented Jun 4, 2025

run buildall

@github-actions github-actions bot added the approved Indicates a PR has been approved by one committer. label Jun 4, 2025
@github-actions
Contributor

github-actions bot commented Jun 4, 2025

PR approved by at least one committer and no changes requested.

@github-actions
Contributor

github-actions bot commented Jun 4, 2025

PR approved by anyone and no changes requested.

@yiguolei yiguolei added usercase Important user case type label p0_c dev/2.1.x dev/3.0.x labels Jun 4, 2025
@doris-robot

TPC-H: Total hot run time: 34934 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit b2cf5dc887d941764b73b86e05b886c7c931d421, data reload: false

------ Round 1 ----------------------------------
q1	25965	5147	5450	5147
q2	1945	300	180	180
q3	10400	1259	726	726
q4	10225	1003	541	541
q5	7700	2406	2396	2396
q6	190	169	135	135
q7	950	727	628	628
q8	9317	1363	1163	1163
q9	6745	5080	5215	5080
q10	6821	2308	1900	1900
q11	507	307	273	273
q12	364	352	219	219
q13	17786	3728	3058	3058
q14	242	227	223	223
q15	556	499	509	499
q16	439	434	375	375
q17	634	900	383	383
q18	7784	7329	7177	7177
q19	1469	967	586	586
q20	361	342	231	231
q21	4073	3242	3042	3042
q22	1096	1014	972	972
Total cold run time: 115569 ms
Total hot run time: 34934 ms

----- Round 2, with runtime_filter_mode=off -----
q1	5205	5095	5144	5095
q2	238	316	226	226
q3	2178	2666	2335	2335
q4	1439	1926	1435	1435
q5	4531	4413	4357	4357
q6	213	164	126	126
q7	1954	1918	1779	1779
q8	2622	2553	2544	2544
q9	7160	7126	7049	7049
q10	3039	3183	2743	2743
q11	574	516	499	499
q12	704	772	605	605
q13	3591	3948	3195	3195
q14	275	303	293	293
q15	554	504	484	484
q16	455	494	446	446
q17	1185	1545	1393	1393
q18	7634	7604	7406	7406
q19	842	909	1006	909
q20	1985	2060	1868	1868
q21	4872	4374	4382	4374
q22	1084	1064	1003	1003
Total cold run time: 52334 ms
Total hot run time: 50164 ms

@doris-robot

TPC-DS: Total hot run time: 185468 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit b2cf5dc887d941764b73b86e05b886c7c931d421, data reload: false

query1	1019	502	497	497
query2	6575	1847	1801	1801
query3	6770	227	214	214
query4	25891	23721	23377	23377
query5	4323	604	446	446
query6	302	211	195	195
query7	4626	487	302	302
query8	263	226	216	216
query9	8620	2647	2657	2647
query10	483	335	276	276
query11	15299	15154	14806	14806
query12	162	113	111	111
query13	1661	522	413	413
query14	8873	6255	6300	6255
query15	199	197	179	179
query16	7285	643	429	429
query17	1213	747	592	592
query18	1975	412	307	307
query19	192	193	182	182
query20	125	124	116	116
query21	214	129	108	108
query22	4205	4150	3946	3946
query23	34289	32857	33101	32857
query24	8484	2384	2386	2384
query25	521	445	422	422
query26	1224	265	152	152
query27	2756	511	337	337
query28	4317	2168	2127	2127
query29	774	565	431	431
query30	292	211	181	181
query31	898	826	772	772
query32	70	66	63	63
query33	570	382	359	359
query34	799	857	529	529
query35	781	802	715	715
query36	923	991	890	890
query37	114	103	85	85
query38	4092	4096	4006	4006
query39	1473	1406	1445	1406
query40	214	121	109	109
query41	64	60	59	59
query42	128	107	108	107
query43	501	495	466	466
query44	1321	846	857	846
query45	178	171	169	169
query46	839	1020	637	637
query47	1758	1769	1688	1688
query48	386	416	309	309
query49	756	474	401	401
query50	652	687	404	404
query51	4146	4200	4061	4061
query52	104	106	99	99
query53	217	249	187	187
query54	588	568	495	495
query55	88	84	87	84
query56	318	301	273	273
query57	1106	1153	1071	1071
query58	266	292	250	250
query59	2571	2622	2524	2524
query60	330	315	337	315
query61	128	129	127	127
query62	801	776	679	679
query63	224	184	181	181
query64	4342	1000	666	666
query65	4242	4149	4095	4095
query66	1174	428	315	315
query67	15617	15403	15373	15373
query68	8827	879	533	533
query69	489	338	272	272
query70	1200	1133	1074	1074
query71	455	325	300	300
query72	5468	4672	4707	4672
query73	709	588	355	355
query74	9233	9140	8748	8748
query75	4281	3200	2677	2677
query76	3710	1216	752	752
query77	807	389	279	279
query78	10250	10398	9414	9414
query79	1753	813	589	589
query80	722	533	451	451
query81	479	260	261	260
query82	449	122	96	96
query83	291	258	235	235
query84	298	105	93	93
query85	778	360	325	325
query86	347	298	298	298
query87	4360	4467	4377	4377
query88	2904	2274	2326	2274
query89	376	321	279	279
query90	1913	210	211	210
query91	143	141	111	111
query92	75	58	61	58
query93	1131	946	574	574
query94	673	406	311	311
query95	378	290	290	290
query96	491	552	279	279
query97	2703	2766	2669	2669
query98	228	210	204	204
query99	1447	1393	1288	1288
Total cold run time: 273079 ms
Total hot run time: 185468 ms

@doris-robot

ClickBench: Total hot run time: 28.83 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit b2cf5dc887d941764b73b86e05b886c7c931d421, data reload: false

query1	0.04	0.03	0.03
query2	0.12	0.10	0.10
query3	0.28	0.19	0.19
query4	1.59	0.19	0.19
query5	0.46	0.44	0.43
query6	1.45	0.65	0.66
query7	0.02	0.02	0.02
query8	0.05	0.04	0.03
query9	0.59	0.52	0.52
query10	0.58	0.58	0.57
query11	0.15	0.11	0.12
query12	0.14	0.11	0.12
query13	0.62	0.60	0.60
query14	0.79	0.79	0.80
query15	0.88	0.84	0.88
query16	0.38	0.38	0.39
query17	1.06	1.04	1.05
query18	0.23	0.21	0.21
query19	1.94	1.89	1.84
query20	0.02	0.01	0.01
query21	15.41	0.94	0.57
query22	0.77	1.17	0.59
query23	15.04	1.42	0.65
query24	6.82	1.77	0.36
query25	0.35	0.26	0.14
query26	0.65	0.15	0.14
query27	0.05	0.05	0.06
query28	9.66	0.91	0.45
query29	12.55	4.06	3.34
query30	0.25	0.09	0.07
query31	2.82	0.59	0.39
query32	3.23	0.55	0.46
query33	3.03	3.10	3.18
query34	15.91	5.17	4.50
query35	4.51	4.58	4.57
query36	0.68	0.48	0.49
query37	0.09	0.07	0.06
query38	0.05	0.04	0.03
query39	0.04	0.02	0.02
query40	0.17	0.14	0.13
query41	0.07	0.02	0.02
query42	0.03	0.02	0.02
query43	0.04	0.03	0.03
Total cold run time: 103.61 s
Total hot run time: 28.83 s

@XLPE
Contributor Author

XLPE commented Jun 5, 2025

run buildall

@doris-robot

TPC-H: Total hot run time: 33984 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit e11e09697241c0152afcf5ad070f8932797d7e40, data reload: false

------ Round 1 ----------------------------------
q1	26215	5054	5022	5022
q2	1951	265	174	174
q3	10325	1255	699	699
q4	10232	1010	526	526
q5	7543	2330	2363	2330
q6	175	159	132	132
q7	884	733	606	606
q8	9290	1214	1094	1094
q9	6811	5226	5225	5225
q10	6887	2327	1892	1892
q11	486	293	273	273
q12	340	348	212	212
q13	17903	3659	3096	3096
q14	221	225	206	206
q15	552	483	484	483
q16	418	424	372	372
q17	610	846	368	368
q18	7521	7150	7137	7137
q19	1584	944	551	551
q20	348	348	229	229
q21	3758	3165	2383	2383
q22	1008	992	974	974
Total cold run time: 115062 ms
Total hot run time: 33984 ms

----- Round 2, with runtime_filter_mode=off -----
q1	5135	5053	5045	5045
q2	234	318	222	222
q3	2199	2624	2352	2352
q4	1382	1927	1329	1329
q5	4471	4391	4391	4391
q6	211	163	131	131
q7	2004	1928	1770	1770
q8	2594	2525	2495	2495
q9	7231	7207	6968	6968
q10	3035	3232	2764	2764
q11	576	511	495	495
q12	665	772	624	624
q13	3517	3904	3257	3257
q14	299	298	275	275
q15	529	481	483	481
q16	452	494	427	427
q17	1131	1582	1345	1345
q18	7853	7601	7488	7488
q19	818	807	924	807
q20	2000	2031	1802	1802
q21	4820	4490	4540	4490
q22	1073	1070	1019	1019
Total cold run time: 52229 ms
Total hot run time: 49977 ms

@doris-robot

TPC-DS: Total hot run time: 192891 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit e11e09697241c0152afcf5ad070f8932797d7e40, data reload: false

query1	1401	1097	1086	1086
query2	6248	1799	1814	1799
query3	11007	4630	4448	4448
query4	56276	25455	23628	23628
query5	5286	499	472	472
query6	367	202	192	192
query7	4986	559	289	289
query8	279	222	204	204
query9	6206	2641	2654	2641
query10	448	344	294	294
query11	15088	15068	14852	14852
query12	173	117	108	108
query13	1083	540	410	410
query14	10377	6591	6611	6591
query15	209	209	185	185
query16	7152	627	495	495
query17	1124	777	626	626
query18	1536	410	324	324
query19	203	216	181	181
query20	127	121	133	121
query21	206	123	114	114
query22	4262	4339	4183	4183
query23	34551	33668	33348	33348
query24	6563	2402	2504	2402
query25	476	485	409	409
query26	670	299	161	161
query27	2205	524	351	351
query28	2942	2171	2139	2139
query29	592	580	445	445
query30	275	228	195	195
query31	866	855	786	786
query32	70	70	85	70
query33	449	397	335	335
query34	787	878	548	548
query35	815	855	751	751
query36	953	996	895	895
query37	129	102	81	81
query38	4389	4210	4415	4210
query39	1527	1445	1438	1438
query40	224	131	110	110
query41	60	57	56	56
query42	123	106	108	106
query43	504	500	481	481
query44	1400	874	858	858
query45	181	173	184	173
query46	891	1051	670	670
query47	1850	1833	1773	1773
query48	426	450	327	327
query49	681	470	404	404
query50	704	738	428	428
query51	4215	4314	4253	4253
query52	111	121	111	111
query53	245	265	194	194
query54	610	590	525	525
query55	89	87	80	80
query56	334	325	305	305
query57	1160	1162	1104	1104
query58	274	287	260	260
query59	2626	2771	2578	2578
query60	357	319	320	319
query61	128	126	131	126
query62	746	778	657	657
query63	235	196	192	192
query64	1557	1072	696	696
query65	4240	4224	4155	4155
query66	713	397	303	303
query67	15763	15636	15202	15202
query68	7658	915	534	534
query69	541	362	277	277
query70	1177	1094	1112	1094
query71	505	331	304	304
query72	6040	4733	4778	4733
query73	1330	637	352	352
query74	8993	9044	9042	9042
query75	3752	3242	2713	2713
query76	4257	1237	813	813
query77	618	383	302	302
query78	10167	10161	9306	9306
query79	4773	817	590	590
query80	660	564	457	457
query81	550	264	225	225
query82	523	133	102	102
query83	383	255	233	233
query84	293	112	99	99
query85	800	363	315	315
query86	416	292	281	281
query87	4441	4456	4293	4293
query88	3426	2306	2283	2283
query89	441	404	295	295
query90	1900	220	214	214
query91	159	145	113	113
query92	73	65	56	56
query93	2746	954	578	578
query94	674	414	300	300
query95	376	295	294	294
query96	499	595	282	282
query97	2748	2711	2659	2659
query98	236	217	203	203
query99	1454	1417	1296	1296
Total cold run time: 305359 ms
Total hot run time: 192891 ms

@doris-robot

ClickBench: Total hot run time: 29.09 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit e11e09697241c0152afcf5ad070f8932797d7e40, data reload: false

query1	0.03	0.04	0.03
query2	0.12	0.11	0.11
query3	0.25	0.19	0.19
query4	1.59	0.20	0.11
query5	0.43	0.41	0.44
query6	1.16	0.66	0.66
query7	0.02	0.02	0.02
query8	0.04	0.04	0.04
query9	0.59	0.52	0.52
query10	0.57	0.58	0.57
query11	0.16	0.11	0.10
query12	0.15	0.11	0.11
query13	0.62	0.60	0.60
query14	0.84	0.79	0.82
query15	0.87	0.87	0.88
query16	0.38	0.38	0.40
query17	1.04	1.05	1.04
query18	0.24	0.21	0.21
query19	1.93	1.81	1.81
query20	0.02	0.01	0.01
query21	15.39	0.89	0.58
query22	0.75	1.23	0.63
query23	14.96	1.39	0.60
query24	6.89	0.84	0.91
query25	0.52	0.11	0.07
query26	0.64	0.16	0.13
query27	0.05	0.05	0.06
query28	9.78	0.93	0.46
query29	12.54	4.04	3.32
query30	0.25	0.09	0.07
query31	2.81	0.60	0.39
query32	3.23	0.56	0.47
query33	3.05	3.14	3.14
query34	15.64	5.17	4.48
query35	4.50	4.53	4.50
query36	0.66	0.48	0.47
query37	0.08	0.06	0.07
query38	0.06	0.04	0.04
query39	0.03	0.02	0.02
query40	0.18	0.13	0.12
query41	0.08	0.02	0.02
query42	0.03	0.02	0.02
query43	0.04	0.03	0.04
Total cold run time: 103.21 s
Total hot run time: 29.09 s

@XLPE
Contributor Author

XLPE commented Jun 5, 2025

run beut

@XLPE
Contributor Author

XLPE commented Jun 6, 2025

@yiguolei There are many failing nereids_function_p0 tests in BE UT, but these tests are unrelated to my current changes. Are these test failures expected?

@XLPE
Contributor Author

XLPE commented Jun 16, 2025

run buildall

@doris-robot

TPC-H: Total hot run time: 33758 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 551c46780b926acf8b3d86f0d9d36944da315206, data reload: false

------ Round 1 ----------------------------------
q1	17603	5200	5043	5043
q2	1942	291	183	183
q3	10373	1300	692	692
q4	10229	989	525	525
q5	7563	2414	2286	2286
q6	186	165	134	134
q7	912	727	614	614
q8	9313	1293	1047	1047
q9	6820	5117	5044	5044
q10	6901	2307	1892	1892
q11	482	289	271	271
q12	346	353	216	216
q13	17766	3654	3044	3044
q14	219	227	220	220
q15	558	478	469	469
q16	427	448	383	383
q17	616	874	396	396
q18	7468	7223	7165	7165
q19	1840	978	542	542
q20	331	341	240	240
q21	3847	3227	2383	2383
q22	1047	1033	969	969
Total cold run time: 106789 ms
Total hot run time: 33758 ms

----- Round 2, with runtime_filter_mode=off -----
q1	5238	5038	5130	5038
q2	236	325	221	221
q3	2123	2692	2298	2298
q4	1391	1811	1327	1327
q5	4228	4135	4465	4135
q6	209	171	129	129
q7	2042	1974	1779	1779
q8	2592	2571	2594	2571
q9	7139	7085	7067	7067
q10	2998	3182	2766	2766
q11	588	513	494	494
q12	671	788	642	642
q13	3549	3882	3207	3207
q14	283	290	268	268
q15	520	500	474	474
q16	441	508	441	441
q17	1179	1588	1337	1337
q18	7655	7424	7429	7424
q19	808	809	940	809
q20	2024	2054	1874	1874
q21	4915	4501	4500	4500
q22	1063	1047	971	971
Total cold run time: 51892 ms
Total hot run time: 49772 ms

@doris-robot

TPC-DS: Total hot run time: 185527 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 551c46780b926acf8b3d86f0d9d36944da315206, data reload: false

query1	996	386	400	386
query2	6520	1853	1849	1849
query3	6745	234	222	222
query4	26750	23181	23470	23181
query5	4359	626	460	460
query6	302	228	200	200
query7	4632	506	294	294
query8	284	226	232	226
query9	8672	2623	2662	2623
query10	480	354	293	293
query11	15426	14993	14784	14784
query12	172	109	109	109
query13	1669	537	409	409
query14	8766	6246	6228	6228
query15	206	200	182	182
query16	7427	652	460	460
query17	1222	725	588	588
query18	2001	428	306	306
query19	197	203	168	168
query20	131	123	131	123
query21	215	127	116	116
query22	4018	4311	3925	3925
query23	34164	33001	33061	33001
query24	8404	2344	2380	2344
query25	523	451	388	388
query26	1266	271	155	155
query27	2729	497	332	332
query28	4359	2125	2102	2102
query29	761	585	429	429
query30	278	214	194	194
query31	936	847	739	739
query32	71	69	60	60
query33	574	357	316	316
query34	781	856	536	536
query35	785	815	743	743
query36	947	1005	896	896
query37	114	105	80	80
query38	4073	4116	4102	4102
query39	1506	1438	1560	1438
query40	204	117	110	110
query41	66	78	58	58
query42	127	119	115	115
query43	508	508	471	471
query44	1348	841	831	831
query45	184	174	166	166
query46	859	1021	623	623
query47	1743	1777	1706	1706
query48	399	435	307	307
query49	752	476	386	386
query50	669	668	410	410
query51	4153	4148	4106	4106
query52	108	104	98	98
query53	231	260	192	192
query54	569	569	516	516
query55	89	84	85	84
query56	314	309	293	293
query57	1159	1194	1109	1109
query58	263	257	256	256
query59	2606	2735	2499	2499
query60	332	321	313	313
query61	130	125	126	125
query62	820	713	662	662
query63	227	199	190	190
query64	4313	1047	678	678
query65	4264	4179	4150	4150
query66	1128	409	324	324
query67	15761	15567	15424	15424
query68	7992	893	522	522
query69	477	317	271	271
query70	1219	1158	1123	1123
query71	474	329	295	295
query72	5435	4681	4614	4614
query73	728	589	365	365
query74	9155	8778	8725	8725
query75	3898	3192	2715	2715
query76	3716	1196	764	764
query77	786	374	293	293
query78	10018	10115	9419	9419
query79	3110	782	579	579
query80	638	517	522	517
query81	490	249	217	217
query82	486	130	98	98
query83	288	253	233	233
query84	296	108	89	89
query85	779	350	315	315
query86	386	307	275	275
query87	4430	4552	4288	4288
query88	3382	2336	2281	2281
query89	399	318	288	288
query90	1856	218	216	216
query91	137	158	115	115
query92	77	64	64	64
query93	2545	934	570	570
query94	675	378	313	313
query95	380	306	299	299
query96	494	570	283	283
query97	2753	2711	2669	2669
query98	233	210	205	205
query99	1472	1426	1318	1318
Total cold run time: 275991 ms
Total hot run time: 185527 ms

@XLPE
Contributor Author

XLPE commented Jun 17, 2025

run beut

@hello-stephen
Contributor

BE UT Coverage Report

Increment line coverage 0.00% (0/24) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 56.38% (15048/26692)
Line Coverage 45.14% (134560/298074)
Region Coverage 44.26% (67677/152898)
Branch Coverage 38.83% (34717/89398)

@hello-stephen
Contributor

BE Regression && UT Coverage Report

Increment line coverage 83.33% (20/24) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 79.76% (20967/26289)
Line Coverage 72.69% (216689/298081)
Region Coverage 70.96% (127703/179959)
Branch Coverage 64.60% (66058/102250)

@XLPE
Contributor Author

XLPE commented Jun 17, 2025

run arm

@XLPE
Contributor Author

XLPE commented Jun 17, 2025

run performance

@yiguolei yiguolei closed this Jun 20, 2025
@yiguolei yiguolei reopened this Jun 20, 2025
@yiguolei yiguolei merged commit 7722c74 into apache:master Jun 20, 2025
35 of 38 checks passed
Gabriel39 pushed a commit to Gabriel39/incubator-doris that referenced this pull request Jun 26, 2025
…epare execution (apache#51492)

Co-authored-by: XLPE <weiwh1@chinatelecom.cn>
Gabriel39 pushed a commit to Gabriel39/incubator-doris that referenced this pull request Jun 26, 2025
…epare execution (apache#51492)

Co-authored-by: XLPE <weiwh1@chinatelecom.cn>
dataroaring pushed a commit that referenced this pull request Jun 27, 2025
#52365)

…epare execution (#51492)

Co-authored-by: XLPE <crykix@gmail.com>
Co-authored-by: XLPE <weiwh1@chinatelecom.cn>
morrySnow pushed a commit that referenced this pull request Jun 27, 2025
…oncurrent prepare execution #51492 (#52364)

Cherry-pick from #51492


Co-authored-by: XLPE <crykix@gmail.com>
Co-authored-by: XLPE <weiwh1@chinatelecom.cn>
koarz pushed a commit to koarz/doris that referenced this pull request Jul 3, 2025
apache#52365)

…epare execution (apache#51492)

Co-authored-by: XLPE <crykix@gmail.com>
Co-authored-by: XLPE <weiwh1@chinatelecom.cn>
Gabriel39 pushed a commit to Gabriel39/incubator-doris that referenced this pull request Jul 7, 2025
…epare execution (apache#51492)

Issue Number: close apache#51491

Problem Summary:
When the queue of the FragmentMgrAsync thread pool is full, a newly
submitted task is rejected and the submitting code returns early.
However, tasks submitted earlier may still be scheduled for execution
later. The early return can prematurely destroy objects such as
PipelineFragmentContext and TPipelineFragmentParams that those tasks
still reference, resulting in null pointer dereferences during task
execution and ultimately a core dump.

This PR's fix is to wait until all previously submitted tasks have
completed before returning.

```
*** SIGSEGV address not mapped to object (@0x1c8) received by PID 3941201 (TID 2115617 OR 0xfe1685bb97f0) from PID 456; stack trace: ***
 0# doris::signal::(anonymous namespace)::FailureSignalHandler(int, siginfo_t*, void*) at /home/jenkins_agent/workspace/BigDataComponent_doris-unified-arm-release/be/src/common/signal_handler.h:421
 1# os::Linux::chained_handler(int, siginfo_t*, void*) in /usr/jdk64/current/jre/lib/aarch64/server/libjvm.so
 2# JVM_handle_linux_signal in /usr/jdk64/current/jre/lib/aarch64/server/libjvm.so
 3# signalHandler(int, siginfo_t*, void*) in /usr/jdk64/current/jre/lib/aarch64/server/libjvm.so
 4# 0x0000FFFF6B2A07C0 in linux-vdso.so.1
 5# doris::TUniqueId::TUniqueId(doris::TUniqueId const&) at /home/jenkins_agent/workspace/BigDataComponent_doris-unified-arm-release/gensrc/build/gen_cpp/Types_types.cpp:2354
 6# doris::AttachTask::AttachTask(doris::QueryContext*) at /home/jenkins_agent/workspace/BigDataComponent_doris-unified-arm-release/be/src/runtime/thread_context.cpp:60
 7# std::_Function_handler<void (), doris::pipeline::PipelineXFragmentContext::_build_pipeline_x_tasks(doris::TPipelineFragmentParams const&, doris::ThreadPool*)::$_0>::_M_invoke(std::_Any_data const&) at /usr/lib/gcc/aarch64-linux-gnu/13/../../../../include/c++/13/bits/std_function.h:290
 8# doris::ThreadPool::dispatch_thread() at /home/jenkins_agent/workspace/BigDataComponent_doris-unified-arm-release/be/src/util/threadpool.cpp:552
 9# doris::Thread::supervise_thread(void*) at /home/jenkins_agent/workspace/BigDataComponent_doris-unified-arm-release/be/src/util/thread.cpp:499
10# 0x0000FFFF6AF187AC in /lib64/libpthread.so.0
11# 0x0000FFFF6B16548C in /lib64/libc.so.6
```
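
The wait-before-return policy from the problem summary can be illustrated with a minimal sketch. This is not the actual Doris patch: `BoundedPool`, `FragmentState`, `PrepareResult`, and `prepare_all` are hypothetical stand-ins for the FragmentMgrAsync pool, the fragment context, and the prepare routine. The assumption is only that tasks capture caller-owned state by reference and that `submit()` fails when the queue is full.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <cstddef>
#include <functional>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical bounded pool: submit() returns false when the queue is full.
class BoundedPool {
public:
    BoundedPool(std::size_t workers, std::size_t max_queue) : _max_queue(max_queue) {
        for (std::size_t i = 0; i < workers; ++i) {
            _threads.emplace_back([this] { run(); });
        }
    }
    ~BoundedPool() {
        {
            std::lock_guard<std::mutex> l(_mu);
            _stop = true;
        }
        _cv.notify_all();
        for (auto& t : _threads) t.join();
    }
    bool submit(std::function<void()> fn) {
        std::lock_guard<std::mutex> l(_mu);
        if (_queue.size() >= _max_queue) return false;  // queue full: reject
        _queue.push(std::move(fn));
        _cv.notify_one();
        return true;
    }

private:
    void run() {
        for (;;) {
            std::function<void()> fn;
            {
                std::unique_lock<std::mutex> l(_mu);
                _cv.wait(l, [this] { return _stop || !_queue.empty(); });
                if (_queue.empty()) return;  // stop requested and queue drained
                fn = std::move(_queue.front());
                _queue.pop();
            }
            fn();
        }
    }
    std::mutex _mu;
    std::condition_variable _cv;
    std::queue<std::function<void()>> _queue;
    std::size_t _max_queue;
    bool _stop = false;
    std::vector<std::thread> _threads;
};

// Hypothetical caller-owned state that the tasks reference.
struct FragmentState {
    std::atomic<int> prepared{0};
};

struct PrepareResult {
    bool all_accepted;  // false if any submission was rejected
    int accepted;       // number of tasks the pool accepted
};

// Submits up to num_tasks tasks. On rejection it stops submitting, but it
// still waits for every accepted task before returning, so the references
// captured by those tasks never dangle.
PrepareResult prepare_all(BoundedPool& pool, FragmentState& state, int num_tasks) {
    std::mutex mu;
    std::condition_variable cv;
    int pending = 0;
    int accepted = 0;
    bool all_accepted = true;
    for (int i = 0; i < num_tasks; ++i) {
        {
            std::lock_guard<std::mutex> l(mu);
            ++pending;  // count first, so a fast task cannot see a stale value
        }
        bool ok = pool.submit([&state, &mu, &cv, &pending] {
            std::this_thread::sleep_for(std::chrono::milliseconds(1));  // simulated work
            state.prepared.fetch_add(1);
            std::lock_guard<std::mutex> l(mu);
            --pending;
            cv.notify_all();  // notify while holding mu; cv is not touched afterwards
        });
        if (!ok) {
            std::lock_guard<std::mutex> l(mu);
            --pending;  // undo the count for the rejected task
            all_accepted = false;
            break;
        }
        ++accepted;
    }
    std::unique_lock<std::mutex> l(mu);
    cv.wait(l, [&pending] { return pending == 0; });  // the wait this sketch is about
    assert(pending == 0);
    return {all_accepted, accepted};
}
```

Returning directly at the `!ok` branch would reproduce the bug pattern described above: `mu`, `cv`, `pending`, and possibly `state` would be destroyed while queued tasks still reference them. Calling `prepare_all(pool, state, 50)` on a `BoundedPool pool(1, 1)` is one way to exercise the rejection path.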

Co-authored-by: XLPE <weiwh1@chinatelecom.cn>
yiguolei pushed a commit that referenced this pull request Jul 9, 2025
#52850)

…epare execution (#51492)

Co-authored-by: XLPE <crykix@gmail.com>
Co-authored-by: XLPE <weiwh1@chinatelecom.cn>
Labels: approved, dev/2.1.11-merged, dev/3.0.7-merged, dev/3.1.0-merged, p0_c, reviewed, usercase

Linked issue: [Bug] premature exit causing core dump during concurrent prepare execution