[fix](pipeline) premature exit causing core dump during concurrent pr… #52850

Gabriel39 · 2025-07-07T03:06:12Z

…epare execution (#51492)

Issue Number: close #51491

Problem Summary:
When the queue of the FragmentMgrAsync thread pool is full, a newly submitted task is rejected and the submitting function returns early. However, tasks submitted before the rejection may still be scheduled for execution later. The early return can destroy objects that those tasks still reference, such as PipelineFragmentContext and TPipelineFragmentParams, so a task that runs afterwards dereferences an invalid pointer and the process crashes with a core dump.

The fix in this PR is to wait until all previously submitted tasks have completed before returning.

*** SIGSEGV address not mapped to object (@0x1c8) received by PID 3941201 (TID 2115617 OR 0xfe1685bb97f0) from PID 456; stack trace: ***
 0# doris::signal::(anonymous namespace)::FailureSignalHandler(int, siginfo_t*, void*) at /home/jenkins_agent/workspace/BigDataComponent_doris-unified-arm-release/be/src/common/signal_handler.h:421
 1# os::Linux::chained_handler(int, siginfo_t*, void*) in /usr/jdk64/current/jre/lib/aarch64/server/libjvm.so
 2# JVM_handle_linux_signal in /usr/jdk64/current/jre/lib/aarch64/server/libjvm.so
 3# signalHandler(int, siginfo_t*, void*) in /usr/jdk64/current/jre/lib/aarch64/server/libjvm.so
 4# 0x0000FFFF6B2A07C0 in linux-vdso.so.1
 5# doris::TUniqueId::TUniqueId(doris::TUniqueId const&) at /home/jenkins_agent/workspace/BigDataComponent_doris-unified-arm-release/gensrc/build/gen_cpp/Types_types.cpp:2354
 6# doris::AttachTask::AttachTask(doris::QueryContext*) at /home/jenkins_agent/workspace/BigDataComponent_doris-unified-arm-release/be/src/runtime/thread_context.cpp:60
 7# std::_Function_handler<void (), doris::pipeline::PipelineXFragmentContext::_build_pipeline_x_tasks(doris::TPipelineFragmentParams const&, doris::ThreadPool*)::$_0>::_M_invoke(std::_Any_data const&) at /usr/lib/gcc/aarch64-linux-gnu/13/../../../../include/c++/13/bits/std_function.h:290
 8# doris::ThreadPool::dispatch_thread() at /home/jenkins_agent/workspace/BigDataComponent_doris-unified-arm-release/be/src/util/threadpool.cpp:552
 9# doris::Thread::supervise_thread(void*) at /home/jenkins_agent/workspace/BigDataComponent_doris-unified-arm-release/be/src/util/thread.cpp:499
10# 0x0000FFFF6AF187AC in /lib64/libpthread.so.0
11# 0x0000FFFF6B16548C in /lib64/libc.so.6
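
The wait-before-return policy described in the problem summary can be sketched as follows. This is a minimal illustration, not the code changed by this PR: `BoundedPool`, `prepare_all`, `Context`, `PrepareResult`, and `run_demo` are hypothetical names, and the single-worker pool only stands in for a thread pool whose queue can fill up. The tasks capture objects on the caller's stack by reference, so the caller must not return until every accepted task has finished, even when a later submission is rejected.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <deque>
#include <functional>
#include <mutex>
#include <thread>
#include <utility>

// Hypothetical single-worker pool with a bounded queue (not Doris's ThreadPool).
class BoundedPool {
public:
    explicit BoundedPool(std::size_t capacity)
        : capacity_(capacity), worker_([this] { run(); }) {}
    ~BoundedPool() {
        { std::lock_guard<std::mutex> l(mu_); stop_ = true; }
        cv_.notify_all();
        worker_.join();
    }
    // Returns false (task rejected) when the queue is full.
    bool submit(std::function<void()> fn) {
        std::lock_guard<std::mutex> l(mu_);
        if (queue_.size() >= capacity_) return false;
        queue_.push_back(std::move(fn));
        cv_.notify_one();
        return true;
    }
private:
    void run() {
        for (;;) {
            std::function<void()> fn;
            {
                std::unique_lock<std::mutex> l(mu_);
                cv_.wait(l, [this] { return stop_ || !queue_.empty(); });
                if (queue_.empty()) return;  // stop requested, nothing left to run
                fn = std::move(queue_.front());
                queue_.pop_front();
            }
            fn();
        }
    }
    std::size_t capacity_;
    std::mutex mu_;
    std::condition_variable cv_;
    std::deque<std::function<void()>> queue_;
    bool stop_ = false;
    std::thread worker_;  // declared last so the other members exist before it starts
};

struct PrepareResult { bool ok; int accepted; int finished_at_return; int checksum; };

// Submits n_tasks tasks that reference locals of this function.
PrepareResult prepare_all(BoundedPool& pool, int n_tasks) {
    struct Context { int value = 42; };  // stand-in for objects the tasks reference
    Context ctx;
    std::mutex mu;
    std::condition_variable cv;
    int outstanding = 0, finished = 0, checksum = 0, accepted = 0;
    bool gate_open = false, rejected = false;

    auto task = [&] {
        std::unique_lock<std::mutex> l(mu);
        cv.wait(l, [&] { return gate_open; });  // simulates "scheduled later"
        checksum += ctx.value;                  // would be invalid if ctx were gone
        ++finished;
        --outstanding;
        cv.notify_all();                        // notify while holding the lock
    };

    for (int i = 0; i < n_tasks; ++i) {
        { std::lock_guard<std::mutex> l(mu); ++outstanding; }
        if (!pool.submit(task)) {
            { std::lock_guard<std::mutex> l(mu); --outstanding; }
            rejected = true;  // queue full: stop submitting, but do NOT return yet
            break;
        }
        ++accepted;
    }

    // The key step: wait for every accepted task before the locals are destroyed.
    std::unique_lock<std::mutex> l(mu);
    gate_open = true;
    cv.notify_all();
    cv.wait(l, [&] { return outstanding == 0; });
    return PrepareResult{!rejected, accepted, finished, checksum};
}

// Capacity 2 with 8 tasks guarantees at least one rejection in this sketch.
PrepareResult run_demo() {
    BoundedPool pool(2);
    return prepare_all(pool, 8);
}
```

Calling `run_demo()` from a `main` function exercises the rejection path. Returning immediately on rejection instead of waiting would let the already queued tasks run against destroyed locals, which is the failure mode the summary describes.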

Release note

None

Co-authored-by: XLPE <weiwh1@chinatelecom.cn>

Thearas · 2025-07-07T03:06:17Z

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

What problem was fixed (it's best to include specific error reporting information). How it was fixed.
Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
What features were added. Why was this function added?
Which code was refactored and why was this part of the code refactored?
Which functions were optimized and what is the difference before and after the optimization?

Gabriel39 · 2025-07-07T03:06:25Z

run buildall

hello-stephen · 2025-07-07T04:18:25Z

BE UT Coverage Report

Increment line coverage 0.00% (0/24) 🎉

Increment coverage report
Complete coverage report

Category	Coverage
Function Coverage	39.07% (10345/26475)
Line Coverage	30.04% (85871/285837)
Region Coverage	28.72% (44314/154298)
Branch Coverage	25.42% (22669/89162)

github-actions · 2025-07-09T02:20:07Z

PR approved by at least one committer and no changes requested.

github-actions · 2025-07-09T02:20:10Z

PR approved by anyone and no changes requested.

Gabriel39 requested a review from yiguolei as a code owner July 7, 2025 03:06

yiguolei approved these changes Jul 9, 2025

View reviewed changes

github-actions bot added the approved Indicates a PR has been approved by one committer. label Jul 9, 2025

github-actions bot added the reviewed label Jul 9, 2025

yiguolei merged commit 528eaad into apache:branch-2.1 Jul 9, 2025
20 of 22 checks passed
