Skip to content

Conversation

@Gabriel39
Copy link
Contributor

…epare execution (#51492)

Issue Number: close #51491

Problem Summary:
When the queue of the FragmentMgrAsync thread pool is full, newly submitted tasks are rejected and return early. However, previously submitted tasks may still be scheduled for execution later. This can lead to premature destruction of objects such as PipelineFragmentContext and TPipelineFragmentParams that are referenced by those tasks, resulting in null pointer exceptions during task execution and ultimately causing a coredump.

The pr policy is to wait until all previously submitted tasks are completed before returning.

*** SIGSEGV address not mapped to object (@0x1c8) received by PID 3941201 (TID 2115617 OR 0xfe1685bb97f0) from PID 456; stack trace: ***
 0# doris::signal::(anonymous namespace)::FailureSignalHandler(int, siginfo_t*, void*) at /home/jenkins_agent/workspace/BigDataComponent_doris-unified-arm-release/be/src/common/signal_handler.h:421
 1# os::Linux::chained_handler(int, siginfo_t*, void*) in /usr/jdk64/current/jre/lib/aarch64/server/libjvm.so
 2# JVM_handle_linux_signal in /usr/jdk64/current/jre/lib/aarch64/server/libjvm.so
 3# signalHandler(int, siginfo_t*, void*) in /usr/jdk64/current/jre/lib/aarch64/server/libjvm.so
 4# 0x0000FFFF6B2A07C0 in linux-vdso.so.1
 5# doris::TUniqueId::TUniqueId(doris::TUniqueId const&) at /home/jenkins_agent/workspace/BigDataComponent_doris-unified-arm-release/gensrc/build/gen_cpp/Types_types.cpp:2354
 6# doris::AttachTask::AttachTask(doris::QueryContext*) at /home/jenkins_agent/workspace/BigDataComponent_doris-unified-arm-release/be/src/runtime/thread_context.cpp:60
 7# std::_Function_handler<void (), doris::pipeline::PipelineXFragmentContext::_build_pipeline_x_tasks(doris::TPipelineFragmentParams const&, doris::ThreadPool*)::$_0>::_M_invoke(std::_Any_data const&) at /usr/lib/gcc/aarch64-linux-gnu/13/../../../../include/c++/13/bits/std_function.h:290
 8# doris::ThreadPool::dispatch_thread() at /home/jenkins_agent/workspace/BigDataComponent_doris-unified-arm-release/be/src/util/threadpool.cpp:552
 9# doris::Thread::supervise_thread(void*) at /home/jenkins_agent/workspace/BigDataComponent_doris-unified-arm-release/be/src/util/thread.cpp:499
10# 0x0000FFFF6AF187AC in /lib64/libpthread.so.0
11# 0x0000FFFF6B16548C in /lib64/libc.so.6

What problem does this PR solve?

Issue Number: close #xxx

Related PR: #xxx

Problem Summary:

Release note

None

Check List (For Author)

  • Test

    • Regression test
    • Unit Test
    • Manual test (add detailed scripts or steps below)
    • No need to test or manual test. Explain why:
      • This is a refactor/code format and no logic has been changed.
      • Previous test can cover this change.
      • No code files have been changed.
      • Other reason
  • Behavior changed:

    • No.
    • Yes.
  • Does this need documentation?

    • No.
    • Yes.

Check List (For Reviewer who merge this PR)

  • Confirm the release note
  • Confirm test cases
  • Confirm document
  • Add branch pick label

…epare execution (apache#51492)

Issue Number: close apache#51491

Problem Summary:
When the queue of the FragmentMgrAsync thread pool is full, newly
submitted tasks are rejected and return early. However, previously
submitted tasks may still be scheduled for execution later. This can
lead to premature destruction of objects such as PipelineFragmentContext
and TPipelineFragmentParams that are referenced by those tasks,
resulting in null pointer exceptions during task execution and
ultimately causing a coredump.

The pr policy is to wait until all previously submitted tasks are
completed before returning.

```
*** SIGSEGV address not mapped to object (@0x1c8) received by PID 3941201 (TID 2115617 OR 0xfe1685bb97f0) from PID 456; stack trace: ***
 0# doris::signal::(anonymous namespace)::FailureSignalHandler(int, siginfo_t*, void*) at /home/jenkins_agent/workspace/BigDataComponent_doris-unified-arm-release/be/src/common/signal_handler.h:421
 1# os::Linux::chained_handler(int, siginfo_t*, void*) in /usr/jdk64/current/jre/lib/aarch64/server/libjvm.so
 2# JVM_handle_linux_signal in /usr/jdk64/current/jre/lib/aarch64/server/libjvm.so
 3# signalHandler(int, siginfo_t*, void*) in /usr/jdk64/current/jre/lib/aarch64/server/libjvm.so
 4# 0x0000FFFF6B2A07C0 in linux-vdso.so.1
 5# doris::TUniqueId::TUniqueId(doris::TUniqueId const&) at /home/jenkins_agent/workspace/BigDataComponent_doris-unified-arm-release/gensrc/build/gen_cpp/Types_types.cpp:2354
 6# doris::AttachTask::AttachTask(doris::QueryContext*) at /home/jenkins_agent/workspace/BigDataComponent_doris-unified-arm-release/be/src/runtime/thread_context.cpp:60
 7# std::_Function_handler<void (), doris::pipeline::PipelineXFragmentContext::_build_pipeline_x_tasks(doris::TPipelineFragmentParams const&, doris::ThreadPool*)::$_0>::_M_invoke(std::_Any_data const&) at /usr/lib/gcc/aarch64-linux-gnu/13/../../../../include/c++/13/bits/std_function.h:290
 8# doris::ThreadPool::dispatch_thread() at /home/jenkins_agent/workspace/BigDataComponent_doris-unified-arm-release/be/src/util/threadpool.cpp:552
 9# doris::Thread::supervise_thread(void*) at /home/jenkins_agent/workspace/BigDataComponent_doris-unified-arm-release/be/src/util/thread.cpp:499
10# 0x0000FFFF6AF187AC in /lib64/libpthread.so.0
11# 0x0000FFFF6B16548C in /lib64/libc.so.6
```

Co-authored-by: XLPE <weiwh1@chinatelecom.cn>
@Gabriel39 Gabriel39 requested a review from yiguolei as a code owner July 7, 2025 03:06
@Thearas
Copy link
Contributor

Thearas commented Jul 7, 2025

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

@Gabriel39
Copy link
Contributor Author

run buildall

@hello-stephen
Copy link
Contributor

BE UT Coverage Report

Increment line coverage 0.00% (0/24) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 39.07% (10345/26475)
Line Coverage 30.04% (85871/285837)
Region Coverage 28.72% (44314/154298)
Branch Coverage 25.42% (22669/89162)

@github-actions github-actions bot added the approved Indicates a PR has been approved by one committer. label Jul 9, 2025
@github-actions
Copy link
Contributor

github-actions bot commented Jul 9, 2025

PR approved by at least one committer and no changes requested.

@github-actions
Copy link
Contributor

github-actions bot commented Jul 9, 2025

PR approved by anyone and no changes requested.

@yiguolei yiguolei merged commit 528eaad into apache:branch-2.1 Jul 9, 2025
20 of 22 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

approved Indicates a PR has been approved by one committer. reviewed

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants