Add Linux support for ProcessStartInfo.KillOnParentExit using prctl(PR_SET_PDEATHSIG)#127112
Add Linux support for ProcessStartInfo.KillOnParentExit using prctl(PR_SET_PDEATHSIG)#127112
Conversation
…R_SET_PDEATHSIG)
Implements PR_SET_PDEATHSIG-based KillOnParentExit on Linux using a dedicated
long-lived thread to ensure the death signal fires on process exit, not thread exit.
Changes:
- Add HAVE_PR_SET_PDEATHSIG cmake check and config
- Add killOnParentExit parameter to SystemNative_ForkAndExecProcess
- Implement dedicated pdeathsig thread with mutex/condvar synchronization
- Call prctl(PR_SET_PDEATHSIG, SIGKILL) in child after fork
- Pass KillOnParentExit through managed interop layer
- Add [SupportedOSPlatform("linux")] to ref and impl
- Enable KillOnParentExitTests on Linux
Agent-Logs-Url: https://github.com/dotnet/runtime/sessions/85b53068-8457-4ee3-bdbf-4339fc5a007e
Co-authored-by: adamsitnik <6011991+adamsitnik@users.noreply.github.com>
…necessary prctl in thread, improve getppid comment Agent-Logs-Url: https://github.com/dotnet/runtime/sessions/85b53068-8457-4ee3-bdbf-4339fc5a007e Co-authored-by: adamsitnik <6011991+adamsitnik@users.noreply.github.com>
|
@copilot fix following build failure: |
Add return NULL after infinite while(1) loop to satisfy GCC's -Werror=return-type, with clang pragma to suppress -Wunreachable-code-return. Agent-Logs-Url: https://github.com/dotnet/runtime/sessions/2659e9ab-8d2c-48d8-a7af-d5356d056e46 Co-authored-by: adamsitnik <6011991+adamsitnik@users.noreply.github.com>
There was a problem hiding this comment.
@copilot fix the WASM build
FAILED: libs-native/System.Native/CMakeFiles/System.Native-Static.dir/pal_process.c.o
/__w/1/s/src/mono/browser/emsdk/emscripten/emcc -DCOMPILER_SUPPORTS_W_RESERVED_IDENTIFIER -DDISABLE_CONTRACTS -DGEN_PINVOKE=1 -DHOST_32BIT=1 -DHOST_BROWSER -DHOST_UNIX -DHOST_WASM -DNDEBUG -DTARGET_32BIT -DTARGET_BROWSER -DTARGET_UNIX -DTARGET_WASM -DTARGET_WASM32 -DURTBLDENV_FRIENDLY=Retail -D_FILE_OFFSET_BITS=64 -D_TIME_BITS=64 -I/__w/1/s/artifacts/obj/coreclr/browser.wasm.Release/libs-native/System.Native -I/__w/1/s/src/native/libs/System.Native -I/__w/1/s/src/native -I/__w/1/s/src/native/inc -std=gnu99 -Wno-declaration-after-statement -Wno-pre-c11-compat -O3 -DNDEBUG -std=gnu11 -fwasm-exceptions -mbulk-memory -msimd128 -O3 -Wall -Wno-null-conversion -glldb -Wno-unused-parameter -Wno-alloca -Wno-implicit-int-float-conversion -fno-omit-frame-pointer -fno-strict-overflow -fno-strict-aliasing -Werror -Wno-unused-variable -Wno-unused-value -Wno-unused-function -Wno-tautological-compare -Wno-unknown-pragmas -Wimplicit-fallthrough -Wno-unused-but-set-variable -ffp-contract=off -Wno-unknown-warning-option -ferror-limit=4096 -Wno-unused-private-field -Wno-constant-logical-operand -Wno-pragma-pack -Wno-incompatible-ms-struct -Wno-reserved-identifier -Wno-unsafe-buffer-usage -Wno-single-bit-bitfield-constant-conversion -Wno-cast-function-type-strict -Wno-switch-default -fsigned-char -fvisibility=hidden -ffunction-sections -I/__w/1/s/src/native/libs/Common -I/__w/1/s/artifacts/obj/coreclr/browser.wasm.Release/libs-native/Common -Oz -Weverything -Wno-format-nonliteral -Wno-disabled-macro-expansion -Wno-padded -Wno-empty-translation-unit -Wno-cast-align -Wno-typedef-redefinition -Wno-thread-safety-analysis -Wno-c11-extensions -Wno-pre-c11-compat -MD -MT libs-native/System.Native/CMakeFiles/System.Native-Static.dir/pal_process.c.o -MF libs-native/System.Native/CMakeFiles/System.Native-Static.dir/pal_process.c.o.d -o libs-native/System.Native/CMakeFiles/System.Native-Static.dir/pal_process.c.o -c /__w/1/s/src/native/libs/System.Native/pal_process.c
/__w/1/s/src/native/libs/System.Native/pal_process.c:537:12: error: call to undeclared function 'ForkAndExecProcessInternal'; ISO C99 and later do not support implicit function declarations [-Wimplicit-function-declaration]
537 | return ForkAndExecProcessInternal(
| ^
/__w/1/s/src/native/libs/System.Native/pal_process.c:544:16: error: static declaration of 'ForkAndExecProcessInternal' follows non-static declaration
544 | static int32_t ForkAndExecProcessInternal(
| ^
/__w/1/s/src/native/libs/System.Native/pal_process.c:537:12: note: previous implicit declaration is here
537 | return ForkAndExecProcessInternal(
| ^
2 errors generated.
Move ForkAndExecProcessInternal forward declaration outside #if HAVE_PR_SET_PDEATHSIG so it's visible on all platforms (fixes WASM build). Simplify return NULL in PDeathSigThreadFunc by removing pragma guards. Agent-Logs-Url: https://github.com/dotnet/runtime/sessions/0261d155-fd8a-43c3-80bc-9085dec73b97 Co-authored-by: adamsitnik <6011991+adamsitnik@users.noreply.github.com>
…flag Agent-Logs-Url: https://github.com/dotnet/runtime/sessions/9df92df5-eb9e-4e09-80ab-f2b4640737e4 Co-authored-by: adamsitnik <6011991+adamsitnik@users.noreply.github.com>
Fixed in 0f92552. The race was that concurrent callers could overwrite
🤖 Copilot Code Review — PR #127112Note This review was AI/Copilot-generated using the code-review skill, with sub-agent analysis from Gemini 2.5 Pro and GPT-5.3-Codex. Holistic AssessmentMotivation: The PR extends Approach: The dedicated long-lived thread design is the correct approach. Since Summary: ✅ LGTM. All previous review findings have been addressed, including the concurrent-submit race (fixed in 0f92552). The code is correct and well-structured. API shape matches the approved proposal. The advisory items below are theoretical edge cases for follow-up consideration. Detailed Findings✅ API shape matches approval — Verified correctThe approved API (comment by @bartonjs on #125838) specifies ✅ Thread safety of mutex/condvar pattern — Verified correct (flagged by both models)All shared state ( ✅ Double-checked locking — Verified correct (flagged by both models)
✅ vfork safety — Verified correct
✅ Orphan detection — Verified correctCapturing ✅ Stack-allocated request is safeGemini flagged sharing stack-allocated ✅ Platform guard — Verified correct
✅ Error handling follows established pattern
💡 Test coverage for concurrency (advisory, follow-up)Both models suggested adding a stress test for concurrent 💡 Worker thread resilience (advisory, follow-up)GPT noted that if the dedicated thread terminates unexpectedly, |
|
@copilot invoke the code-review skill and post your analysis/results as a comment on this PR |
🤖 Copilot Code Review — PR #127112Note This review was AI/Copilot-generated using the code-review skill, with sub-agent analysis from Gemini 2.5 Pro and GPT-5.3-Codex. Holistic AssessmentMotivation: The PR extends Approach: The dedicated long-lived thread design is the correct approach. Since Summary: ✅ LGTM. All previous review findings have been addressed, including the concurrent-submit race (fixed in 0f92552). The code is correct and well-structured. API shape matches the approved proposal. Both sub-agent models verified the core correctness. Detailed Findings✅ API approval — VerifiedThe ✅ Thread safety of double-checked locking — Verified correct (both models agree)
✅ Mutex/condvar pattern — Verified correct (both models agree)The producer-consumer pattern between
✅ prctl placement — Verified correctOne sub-agent flagged that ✅ vfork safety — Verified correct (both models agree)
✅ Orphan check (
|
|
@jkotas the PR is ready for the first round of review. Please let me know what you think about this approach. |
The approach looks fine to me. |
Agent-Logs-Url: https://github.com/dotnet/runtime/sessions/4ca5bf12-28f7-4dc4-8f03-a60786fa9bfe Co-authored-by: adamsitnik <6011991+adamsitnik@users.noreply.github.com>
🤖 Copilot Code Review — PR #127112Note This review was AI/Copilot-generated using the code-review skill, with sub-agent analysis from Gemini 2.5 Pro and GPT-5.3-Codex. Holistic AssessmentMotivation: The PR extends Approach: The dedicated long-lived thread design is the correct approach. Since Summary: ✅ LGTM. All previous review findings have been addressed. The code is correct and well-structured. API shape matches the approved proposal. Both sub-agent models verified the core correctness. Detailed Findings✅ API approval — VerifiedThe ✅ Thread safety of double-checked locking — Verified correct (both models agree)
✅ Mutex/condvar pattern — Verified correct (both models agree)
✅ vfork safety — Verified correct
✅ Orphan detection — Verified correctCaptures ✅ Error handling in
|
Contributes to #101985
Approved API: #125838 (comment)
Description
Extends
ProcessStartInfo.KillOnParentExitwith Linux support usingprctl(PR_SET_PDEATHSIG, SIGKILL).On Linux,
PR_SET_PDEATHSIGsends the death signal when the thread that calledprctlexits, not when the process exits. To ensure the signal fires only on process exit, this implementation creates a dedicated long-lived thread that performs all fork+exec operations whenKillOnParentExitis set. The child process callsprctl(PR_SET_PDEATHSIG, SIGKILL)after fork, before exec. Afterprctl, the child verifies it hasn't been reparented (e.g., to a subreaper in a container) by comparinggetppid()against the expected parent PID captured before fork.Changes
Native layer (
pal_process.c,pal_process.h,pal_process_wasi.c)killOnParentExitparameter toSystemNative_ForkAndExecProcessprctl(PR_SET_PDEATHSIG, SIGKILL)after fork, before exec_Atomic intwithatomic_load_explicit/atomic_store_explicitfor double-checked lockinggetppid()matches after prctl, correctly detecting reparenting to subreapersForkAndExecProcessInternalforward declaration outside#if HAVE_PR_SET_PDEATHSIGso it is visible on all platforms (including WASM)pthread_cond_broadcast(notpthread_cond_signal) for done condition to prevent lost-wakeup deadlock when multiple callers wait concurrentlypthread_attr_initandpthread_attr_setdetachstatereturn values, with proper cleanup on failure (following the established pattern frompal_signal.c)expectedParentPiddeclaration in the variable declarations block before the firstgototo fix C++-Wjump-misses-initbuild errors_pdeathsig_request == NULL) before submitting, and each request has a per-requestdoneflag so callers wait on their own completion rather than a shared pointer becoming NULLassert(s_pdeathsig_request == NULL)before submitting request to the dedicated threadCMake / Config
HAVE_PR_SET_PDEATHSIGcheck inconfigure.cmake(checkssys/prctl.h)#cmakedefine01 HAVE_PR_SET_PDEATHSIGinpal_config.h.inManaged layer
killOnParentExitparameter toInterop.ForkAndExecProcess.csP/InvokestartInfo.KillOnParentExitthroughSafeProcessHandle.Unix.cs, gated withOperatingSystem.IsLinux()to passfalseon non-Linux Unix platforms[SupportedOSPlatform("linux")]to ref and implementationTests
KillOnParentExitTests.csto run onTestPlatforms.Windows | TestPlatforms.Linux