[CoreCLR][Signal] Bump shutdown notif and crashdump before prev handler#123735
Merged
mdh1418 merged 6 commits into dotnet:main on Feb 13, 2026
Pull request overview
Adjusts CoreCLR’s signal chaining so shutdown notification and crash dump creation happen before invoking a previously-registered signal handler, ensuring these steps still run when the prior handler doesn’t return (notably on Android).
Changes:
- Reorders `PROCNotifyProcessShutdown` and `PROCCreateCrashDumpIfEnabled` to run before chaining to the prior sigaction handler.
- Adds an assertion to document/enforce that the “custom handler” path isn’t reached for `SIG_DFL`/`SIG_IGN`.
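To make the effect of the reordering concrete, here is a minimal sketch rather than the PAL code itself. It assumes a "previous" handler that never returns (modelled with `_exit`), uses `SIGUSR1` instead of a real fault, and stands in for the crash reporter with a write to a pipe. Every name in it (`report_survives`, `runtime_handler`, `g_report_first`, and so on) is hypothetical.

```cpp
#include <cassert>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

namespace {
struct sigaction g_previous;   // action captured when our handler was installed
int g_report_fd = -1;          // pipe write end standing in for the crash reporter
bool g_report_first = false;   // hypothetical switch: report before chaining?

// Stand-in for a platform handler that terminates instead of returning.
void previous_handler_never_returns(int) { _exit(42); }

void write_report() {
    const char msg[] = "report";
    ssize_t ignored = write(g_report_fd, msg, sizeof(msg) - 1);  // async-signal-safe
    (void)ignored;
}

void runtime_handler(int sig, siginfo_t* info, void* ctx) {
    if (g_report_first) write_report();
    // Chain to whatever handler was registered before ours.
    if (g_previous.sa_flags & SA_SIGINFO) g_previous.sa_sigaction(sig, info, ctx);
    else g_previous.sa_handler(sig);
    // Only reached if the previous handler returned.
    if (!g_report_first) write_report();
}
}  // namespace

// Runs the scenario in a child process; returns true if the report was written.
bool report_survives(bool report_first) {
    int fds[2];
    if (pipe(fds) != 0) return false;
    pid_t child = fork();
    if (child < 0) return false;
    if (child == 0) {
        close(fds[0]);
        g_report_fd = fds[1];
        g_report_first = report_first;

        sigset_t unblock;                       // make sure the signal is deliverable
        sigemptyset(&unblock);
        sigaddset(&unblock, SIGUSR1);
        sigprocmask(SIG_UNBLOCK, &unblock, nullptr);

        struct sigaction prev{};                // the "platform" handler goes in first
        prev.sa_handler = previous_handler_never_returns;
        sigemptyset(&prev.sa_mask);
        sigaction(SIGUSR1, &prev, nullptr);

        struct sigaction ours{};                // ours replaces it and captures it
        ours.sa_sigaction = runtime_handler;
        ours.sa_flags = SA_SIGINFO;
        sigemptyset(&ours.sa_mask);
        sigaction(SIGUSR1, &ours, &g_previous);

        raise(SIGUSR1);
        _exit(0);                               // not reached in this scenario
    }
    close(fds[1]);
    char buf[16] = {};
    ssize_t n = read(fds[0], buf, sizeof(buf)); // returns 0 (EOF) if nothing was written
    close(fds[0]);
    int status = 0;
    waitpid(child, &status, 0);
    return n > 0;
}
```

Under these assumptions, `report_survives(true)` observes the report while `report_survives(false)` does not, because the chained handler exits before the trailing report call runs.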
jkotas
reviewed
Jan 28, 2026
This was referenced Jan 29, 2026
@mdh1418 it doesn't really matter what handlers Android installs, as long as you chain up to any you've captured.
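As an illustration of "chaining up to any you've captured", the sketch below classifies a captured `struct sigaction` and only calls it when it holds a real handler. It is a simplified illustration with hypothetical names (`classify_previous`, `chain_to_previous`, `make_plain_action`), not the runtime's `invoke_previous_action`. What to do for `SIG_DFL` and `SIG_IGN` is left to the caller.

```cpp
#include <cassert>
#include <signal.h>

// Hypothetical names throughout; a sketch, not the PAL implementation.
enum class PreviousKind { Ignore, Default, SigInfoHandler, PlainHandler };

using HandlerFn = void (*)(int);

PreviousKind classify_previous(const struct sigaction& prev) {
    // With SA_SIGINFO set the disposition lives in sa_sigaction; view it as a
    // plain handler pointer only to compare it against SIG_IGN / SIG_DFL.
    HandlerFn disposition = (prev.sa_flags & SA_SIGINFO)
        ? reinterpret_cast<HandlerFn>(prev.sa_sigaction)
        : prev.sa_handler;
    if (disposition == SIG_IGN) return PreviousKind::Ignore;
    if (disposition == SIG_DFL) return PreviousKind::Default;
    return (prev.sa_flags & SA_SIGINFO) ? PreviousKind::SigInfoHandler
                                        : PreviousKind::PlainHandler;
}

// Returns true when a user-installed handler was actually invoked.
bool chain_to_previous(const struct sigaction& prev, int sig, siginfo_t* info, void* ctx) {
    switch (classify_previous(prev)) {
        case PreviousKind::SigInfoHandler:
            prev.sa_sigaction(sig, info, ctx);
            return true;
        case PreviousKind::PlainHandler:
            prev.sa_handler(sig);
            return true;
        case PreviousKind::Ignore:
        case PreviousKind::Default:
            // SIG_IGN and SIG_DFL are sentinel values, not callable functions;
            // the caller decides what to do (e.g. restore the default and re-raise).
            return false;
    }
    return false;
}

namespace {
int g_last_signal = 0;
void record_signal(int sig) { g_last_signal = sig; }
}  // namespace

// Builds a non-SA_SIGINFO action for the given handler or sentinel.
struct sigaction make_plain_action(HandlerFn handler) {
    struct sigaction action{};
    action.sa_handler = handler;
    sigemptyset(&action.sa_mask);
    return action;
}

// Calls the chain directly (no signal is raised) with a recording handler.
bool demo_chain_reaches_custom_handler() {
    g_last_signal = 0;
    struct sigaction prev = make_plain_action(record_signal);
    bool invoked = chain_to_previous(prev, SIGUSR1, nullptr, nullptr);
    return invoked && g_last_signal == SIGUSR1;
}
```

In real use, the `struct sigaction` passed in would be the one returned through the third argument of `sigaction()` when the runtime installed its own handler.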
…ev handler" This reverts commit 5be9bbf.
jkotas
reviewed
Feb 11, 2026
jkotas
reviewed
Feb 11, 2026
…arsing
- Switch from parsing CRASH_CHAINING property before PAL init to using Configuration::GetKnobBooleanValue after InitializeConfigurationKnobs
- Add INTERNAL_CrashReportBeforeSignalChaining to clrconfigvalues.h
- Rename API to PAL_EnableCrashReportBeforeSignalChaining for clarity
- Remove HOST_PROPERTY_CRASH_CHAINING (no longer needed)
- Remove CLRConfigNoCache::Get from signal.cpp (use standard config system)
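The flow this commit describes (read a boolean knob once configuration is available, then flip a process-wide switch that the signal path consults) might look roughly like the sketch below. It does not use the real `Configuration::GetKnobBooleanValue` or `PAL_EnableCrashReportBeforeSignalChaining` signatures. The property map, the key `Example.CrashReportBeforeSignalChaining`, and all function names are assumptions made for illustration.

```cpp
#include <atomic>
#include <cassert>
#include <map>
#include <string>

// All names and the configuration key are illustrative assumptions.
namespace sketch {

// Process-wide switch read from the signal handler path. A lock-free atomic
// load is generally safe to perform inside a signal handler.
std::atomic<bool> g_crash_report_before_chaining{false};

// Hypothetical stand-in for a boolean knob lookup over host-supplied properties.
bool knob_boolean_value(const std::map<std::string, std::string>& props,
                        const std::string& key, bool default_value) {
    auto it = props.find(key);
    if (it == props.end()) return default_value;
    if (it->second == "true" || it->second == "1") return true;
    if (it->second == "false" || it->second == "0") return false;
    return default_value;  // unrecognized text falls back to the default
}

// Intended to be called once during startup, before any fault can occur.
void enable_crash_report_before_signal_chaining() {
    g_crash_report_before_chaining.store(true, std::memory_order_release);
}

// Cheap query for the signal handler.
bool crash_report_before_signal_chaining() {
    return g_crash_report_before_chaining.load(std::memory_order_acquire);
}

// Applies the knob and returns the resulting switch state. The switch is
// opt-in only: it is never turned back off once enabled.
bool apply_configuration(const std::map<std::string, std::string>& props) {
    if (knob_boolean_value(props, "Example.CrashReportBeforeSignalChaining", false)) {
        enable_crash_report_before_signal_chaining();
    }
    return crash_report_before_signal_chaining();
}

}  // namespace sketch
```

Keeping the default at `false` mirrors the opt-in direction settled on in the discussion: platforms that do not set the knob keep the existing ordering.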
jkotas
reviewed
Feb 12, 2026
- Rename configuration key for standardization
- Allow PROCNotifyProcessShutdown when crash reporting before signal chaining
- Conform to PAL formatting
jkotas
approved these changes
Feb 13, 2026
lateralusX
approved these changes
Feb 13, 2026
Member
lateralusX
left a comment
LGTM, we can do the NAOT work in follow up PR.
richlander pushed a commit to richlander/runtime that referenced this pull request Feb 14, 2026
…er (dotnet#123735)

From discussion, opting into enabling the crash chaining is more correct.

<s>The previously registered signal action/handler isn't guaranteed to return, so we lose out on notifying shutdown and creating a dump in those cases. Specifically, PROCCreateCrashDumpIfEnabled would be the last chance to provide the managed context for the thread that crashed.</s>

<s>For example, on Android CoreCLR, it seems that, by default, signal handlers are already registered by Android's runtime (/apex/com.android.runtime/bin/linker64 + /system/lib64/libandroid_runtime.so). Whenever an unhandled synchronous fault occurs, the previously registered handler will not return to invoke_previous_action and aborts the thread itself, so PROCCreateCrashDumpIfEnabled will not be hit.</s>

## SIGSEGV behavior: Android CoreCLR vs. other platforms

### Android CoreCLR

When intentionally writing to NULL (SIGSEGV) on Android CoreCLR, the previously registered signal handler goes down this path https://github.com/dotnet/runtime/blob/40e8c73b8f3b5f478a9bf03cf55c71d0608a8855/src/coreclr/pal/src/exception/signal.cpp#L454, and the thread aborts before hitting PROCNotifyProcessShutdown and PROCCreateCrashDumpIfEnabled.

### macOS/Linux/NativeAOT (Linux)

On macOS, Linux, and NativeAOT (only checked Linux at time of writing), the same intentional SIGSEGV will hit https://github.com/dotnet/runtime/blob/40e8c73b8f3b5f478a9bf03cf55c71d0608a8855/src/coreclr/pal/src/exception/signal.cpp#L431-L448 instead because there is no previously registered signal handler. In those cases, PROCCreateCrashDumpIfEnabled is hit and managed callstacks are captured in the dump.

## History investigation

From a GitHub history dive, I didn't spot anything in particular requiring the previous signal handler to be invoked before PROCNotifyProcessShutdown + PROCCreateCrashDumpIfEnabled.

PROCNotifyProcessShutdown was first introduced in dotnet@1433c3f. That commit doesn't seem to state a particular reason for invoking it after the previous signal handler.

PROCCreateCrashDumpIfEnabled was added to signal.cpp in dotnet@7f9bd2c because PROCNotifyProcessShutdown didn't create a crash dump. That commit doesn't state any particular reason for invoking it after the previously registered signal handler; it was probably just placed next to PROCNotifyProcessShutdown.

`invoke_previous_action` was introduced in dotnet@a740f65 as a refactoring that maintained the order.

## Android CoreCLR behavior after swapping order

Locally, I have POC changes to emit managed callstacks in Android's PROCCreateCrashDumpIfEnabled.

```
01-28 17:26:40.951 2416 2440 F DOTNET : Native crash detected; attempting managed stack trace.
01-28 17:26:40.951 2416 2440 F DOTNET : {"stack":[
01-28 17:26:40.951 2416 2440 F DOTNET : {"ip":"0x0","module":"0x0","offset":"0x0","name":"Program.MemSet(Void*, Int32, UIntPtr)"},
01-28 17:26:40.951 2416 2440 F DOTNET : {"ip":"0x78d981145973","module":"0x0","offset":"0x0","name":"Program.MemSet(Void*, Int32, UIntPtr)"},
01-28 17:26:40.951 2416 2440 F DOTNET : {"ip":"0x78d981145973","module":"0x0","offset":"0x73","name":"Program.ForceNativeSegv()"},
01-28 17:26:40.951 2416 2440 F DOTNET : {"ip":"0x78d981141b60","module":"0x0","offset":"0x70","name":"Program.Main(System.String[])"}
01-28 17:26:40.951 2416 2440 F DOTNET : ]}
01-28 17:26:40.952 2416 2440 F DOTNET : Crash dump hook completed.
--------- beginning of crash
01-28 17:26:40.952 2416 2440 F libc : Fatal signal 11 (SIGSEGV), code 1 (SEGV_MAPERR), fault addr 0x0 in tid 2440 (.dot.MonoRunner), pid 2416 (ulator.JIT.Test)
.....
01-28 17:26:46.882 2921 2921 F DEBUG : *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** ***
01-28 17:26:46.882 2921 2921 F DEBUG : Build fingerprint: 'google/sdk_gphone64_x86_64/emu64xa:16/BE2A.250530.026.D1/13818094:user/release-keys'
01-28 17:26:46.882 2921 2921 F DEBUG : Revision: '0'
01-28 17:26:46.882 2921 2921 F DEBUG : ABI: 'x86_64'
01-28 17:26:46.882 2921 2921 F DEBUG : Timestamp: 2026-01-28 17:26:41.492831700-0500
01-28 17:26:46.882 2921 2921 F DEBUG : Process uptime: 20s
01-28 17:26:46.883 2921 2921 F DEBUG : Cmdline: net.dot.Android.Device_Emulator.JIT.Test
01-28 17:26:46.883 2921 2921 F DEBUG : pid: 2416, tid: 2440, name: .dot.MonoRunner >>> net.dot.Android.Device_Emulator.JIT.Test <<<
01-28 17:26:46.883 2921 2921 F DEBUG : uid: 10219
01-28 17:26:46.883 2921 2921 F DEBUG : signal 11 (SIGSEGV), code 1 (SEGV_MAPERR), fault addr 0x0000000000000000
01-28 17:26:46.883 2921 2921 F DEBUG : Cause: null pointer dereference
01-28 17:26:46.883 2921 2921 F DEBUG : Abort message: 'CoreCLR: previous handler for '
01-28 17:26:46.883 2921 2921 F DEBUG : rax 0000000000000000 rbx 000078da87ffade0 rcx 0000000000000000 rdx 0000000000000001
01-28 17:26:46.884 1237 1297 I s.nexuslauncher: AssetManager2(0x78dd08cd9178) locale list changing from [] to [en-US]
01-28 17:26:46.903 2447 2594 I BugleNotifications: Creating notification input ids [CONTEXT im_entry_input="" im_notification_input="" im_settings_store_input="" im_final_input="" ]
01-28 17:26:46.905 2921 2921 F DEBUG : r8 00007ffcde5a8080 r9 34d9bb0e67871eb0 r10 000078ddb4111870 r11 0000000000000293
01-28 17:26:46.906 2921 2921 F DEBUG : r12 0000000000000001 r13 000078da87ffafa0 r14 0000000000000000 r15 000078da87ffaf18
01-28 17:26:46.906 2921 2921 F DEBUG : rdi 0000000000000000 rsi 0000000000000000
01-28 17:26:46.906 2921 2921 F DEBUG : rbp 000078da87ffac40 rsp 000078da87ffabc8 rip 000078ddb41118a2
01-28 17:26:46.906 2921 2921 F DEBUG : 2 total frames
01-28 17:26:46.906 2921 2921 F DEBUG : backtrace:
01-28 17:26:46.906 2921 2921 F DEBUG : #00 pc 000000000008f8a2 /apex/com.android.runtime/lib64/bionic/libc.so (memset_avx2+50) (BuildId: fcb82240218d1473de1e3d2137c0be35)
01-28 17:26:46.906 2921 2921 F DEBUG : #01 pc 0000000000049972 /memfd:doublemapper (deleted) (offset 0x111000)
```

Now there's a window to log managed callstacks before the original signal handler aborts and triggers a tombstone.

## Android Mono behavior

Mono provides two embedding APIs to configure signal and crash chaining https://github.com/dotnet/runtime/blob/61d3943de41e948bb0ecf871b92eb456d2dd74d8/src/mono/mono/mini/driver.c#L2864-L2894 that determine whether synchronous faults would chain https://github.com/dotnet/runtime/blob/61d3943de41e948bb0ecf871b92eb456d2dd74d8/src/mono/mono/mini/mini-runtime.c#L3892-L3903. They chain to the previous signal handler https://github.com/dotnet/runtime/blob/61d3943de41e948bb0ecf871b92eb456d2dd74d8/src/mono/mono/mini/mini-posix.c#L193-L210 only after attempting to walk native and managed stacks https://github.com/dotnet/runtime/blob/61d3943de41e948bb0ecf871b92eb456d2dd74d8/src/mono/mono/mini/mini-exceptions.c#L2992-L3012.

## Alternatives

If there is any particular reason to preserve the order of sa_sigaction/sa_handler with respect to PROCNotifyProcessShutdown and PROCCreateCrashDumpIfEnabled for CoreCLR, a config knob can be added to allow Android CoreCLR to opt into the swapped ordering behavior. This may be in the form of config property key/values https://github.com/dotnet/runtime/blob/54ca569eb62800cdb725d776e3dd2e564028594d/src/coreclr/dlls/mscoree/exports.cpp#L237-L238 or `clrconfigvalues`. That way AndroidSDK/AndroidAppBuilder may opt in at build time.

Given that the history of the ordering didn't reveal any problems with swapping the order, we can fall back to such a knob if the order swap causes problems down the line.

The other way around is more restrictive. If we first introduce all the overhead to enable an opt-in/opt-out config knob, and later discover that no platforms need to invoke their previous handlers before PROCNotifyProcessShutdown/PROCCreateCrashDumpIfEnabled, it becomes harder to justify removing the knob.