[release/10.0] Fix lookup for current Thread in async signal handler #124308
Merged
steveisok merged 1 commit into dotnet:release/10.0 from Feb 16, 2026
…2513)

This is a new version of the fix. The code in HijackCallback used whether pThreadToHijack is NULL as the indicator of whether it should suspend inline or redirect the thread. Passing in pThreadToHijack without other changes therefore caused it to never suspend inline on Unix, which caused hangs in the System.Collections.Concurrent tests.

The current CheckActivationSafePoint uses thread local storage to get the current Thread instance. But this function is called from an async signal handler (the activation signal handler), and it is not allowed to access TLS variables there: the access can allocate, and if the interrupted code was itself in the middle of an allocation, it could crash. This has not been a problem since .NET 1.0, but a change in a recent glibc version has broken it. We've got reports of crashes in this code for the reason mentioned above.

This change introduces an async-safe mechanism for accessing the current Thread instance from async signal handlers. It uses a segmented array that can grow, but never shrink. Entries for threads are added when the runtime creates a thread or attaches to an external thread, and removed when the thread dies.

The check for safety of the activation injection was further enhanced to make sure that the ScanReaderLock is not taken. In cases where it would need to be taken, we just reject the location.

Since NativeAOT is subject to the same issue, the code that maintains the thread id to thread instance map is placed in the minipal and shared between coreclr and NativeAOT.

Closes dotnet#121581

---------

Co-authored-by: Jan Kotas <jkotas@microsoft.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
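The "segmented array that can grow, but never shrink" idea can be illustrated with a minimal sketch. This is not the code from the PR: the class name `ThreadMapSketch`, the slot layout, the segment size, and the use of 0 as the "empty" id are all assumptions made for illustration. The sketch also assumes that `std::atomic` operations on `uintptr_t` and `void*` are lock-free on the target, which is what makes `Lookup` plausible to call from a signal handler.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical sketch; names and layout are illustrative, not the runtime's.
class ThreadMapSketch {
    static constexpr size_t kSlots = 256;   // assumed segment size

    struct Slot {
        std::atomic<uintptr_t> id{0};       // 0 means "empty" (assumes real ids are nonzero)
        std::atomic<void*> thread{nullptr};
    };
    struct Segment {
        Slot slots[kSlots];
        std::atomic<Segment*> next{nullptr};
    };

    Segment head_;

public:
    ~ThreadMapSketch() {
        // Segments are released only at teardown, never while the map is in use.
        Segment* seg = head_.next.load(std::memory_order_acquire);
        while (seg != nullptr) {
            Segment* next = seg->next.load(std::memory_order_acquire);
            delete seg;
            seg = next;
        }
    }

    // Called on thread create/attach. May allocate, so not for signal handlers.
    void Insert(uintptr_t id, void* thread) {
        assert(id != 0 && thread != nullptr);
        for (Segment* seg = &head_;;) {
            for (size_t i = 0; i < kSlots; i++) {
                Slot& s = seg->slots[(id + i) % kSlots];
                uintptr_t expected = 0;
                // Claim an empty slot, then publish the pointer.
                if (s.id.compare_exchange_strong(expected, id, std::memory_order_acq_rel)) {
                    s.thread.store(thread, std::memory_order_release);
                    return;
                }
            }
            // Segment is full: move to the next one, appending it if needed.
            Segment* next = seg->next.load(std::memory_order_acquire);
            if (next == nullptr) {
                Segment* fresh = new Segment();
                if (seg->next.compare_exchange_strong(next, fresh, std::memory_order_acq_rel)) {
                    next = fresh;
                } else {
                    delete fresh;           // another thread appended first; use its segment
                }
            }
            seg = next;
        }
    }

    // Called when the thread dies: clear the pointer first, then free the slot.
    void Remove(uintptr_t id) {
        for (Segment* seg = &head_; seg != nullptr; seg = seg->next.load(std::memory_order_acquire)) {
            for (size_t i = 0; i < kSlots; i++) {
                Slot& s = seg->slots[(id + i) % kSlots];
                if (s.id.load(std::memory_order_acquire) == id) {
                    s.thread.store(nullptr, std::memory_order_release);
                    s.id.store(0, std::memory_order_release);
                    return;
                }
            }
        }
    }

    // Only atomic loads: no locks, no allocation, no TLS access.
    void* Lookup(uintptr_t id) const {
        for (const Segment* seg = &head_; seg != nullptr; seg = seg->next.load(std::memory_order_acquire)) {
            for (size_t i = 0; i < kSlots; i++) {
                const Slot& s = seg->slots[(id + i) % kSlots];
                if (s.id.load(std::memory_order_acquire) == id) {
                    return s.thread.load(std::memory_order_acquire);
                }
            }
        }
        return nullptr;
    }
};
```

Because segments are never unlinked or freed while the map is live, a lookup that is interrupted or racing with an insert can never walk into freed memory. A thread only looks up its own id, so the one entry it cares about cannot be removed underneath it.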
Tagging subscribers to this area: @agocke
Pull request overview
This is a backport of #122513 to release/10.0 that fixes a crash caused by accessing Thread Local Storage (TLS) from async signal handlers. The change introduces an async-safe, lock-free thread map that can be safely accessed from signal handlers, replacing the previous TLS-based thread lookup that could crash when interrupted during memory allocation.
Changes:
- Introduces a lock-free, async-safe thread map implementation shared between CoreCLR and NativeAOT for Unix platforms (excluding WASM)
- Adds minipal_get_current_thread_id_no_cache() to retrieve thread IDs without TLS access
- Updates signal handlers and thread suspension logic to use the new async-safe lookup mechanism
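To make the "without TLS access" point concrete, here is a rough sketch of the difference between a cached and an uncached thread id query. The function names below are hypothetical and this is not the minipal implementation; the sketch assumes Linux exposes the id through the `gettid` system call and falls back to `pthread_self()` elsewhere.

```cpp
#include <cassert>
#include <cstdint>
#include <pthread.h>
#include <unistd.h>
#if defined(__linux__)
#include <sys/syscall.h>
#endif

// Hypothetical helper: asks the OS every time, touching no thread_local state.
static uintptr_t thread_id_no_cache_sketch() {
#if defined(__linux__)
    return (uintptr_t)syscall(SYS_gettid);
#else
    // Assumption: pthread_t is an integer or pointer type on this platform.
    return (uintptr_t)pthread_self();
#endif
}

// For contrast: the cached variant reads and writes a thread_local variable,
// which is the kind of access the PR avoids inside a signal handler.
static thread_local uintptr_t t_cached_id = 0;

static uintptr_t thread_id_cached_sketch() {
    if (t_cached_id == 0) {
        t_cached_id = thread_id_no_cache_sketch();
    }
    return t_cached_id;
}
```

The uncached variant costs a system call on Linux per query, which is the trade-off for not depending on TLS in the handler.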
Reviewed changes
Copilot reviewed 17 out of 17 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| src/native/minipal/thread.h | Adds minipal_get_current_thread_id_no_cache() function that retrieves thread IDs without using TLS, safe for async signal handlers |
| src/coreclr/runtime/asyncsafethreadmap.h | Header for new async-safe thread map API (shared between CoreCLR and NativeAOT) |
| src/coreclr/runtime/asyncsafethreadmap.cpp | Implements lock-free segmented hash table for thread lookup in signal handlers |
| src/coreclr/vm/threads.h | Declares GetThreadAsyncSafe() function for async-safe thread lookup |
| src/coreclr/vm/threads.cpp | Implements GetThreadAsyncSafe() and integrates thread map management into SetThread() lifecycle |
| src/coreclr/vm/threadsuspend.cpp | Updates CheckActivationSafePoint() to use async-safe thread lookup and avoid taking reader locks |
| src/coreclr/vm/codeman.h | Adds IsManagedCodeNoLock() and updates GetScanFlags() signature |
| src/coreclr/vm/codeman.cpp | Implements IsManagedCodeNoLock() for use when reader lock cannot be acquired |
| src/coreclr/vm/CMakeLists.txt | Adds asyncsafethreadmap files to build for Unix (non-WASM) targets |
| src/coreclr/pal/src/exception/signal.cpp | Refactors signal handler to always call original handler regardless of activation success |
| src/coreclr/nativeaot/Runtime/threadstore.h | Declares GetCurrentThreadIfAvailableAsyncSafe() for NativeAOT |
| src/coreclr/nativeaot/Runtime/threadstore.cpp | Integrates thread map management into attach/detach and implements async-safe lookup |
| src/coreclr/nativeaot/Runtime/thread.h | Updates HijackCallback() signature to include doInlineSuspend parameter |
| src/coreclr/nativeaot/Runtime/thread.cpp | Updates HijackCallback() to use explicit parameter instead of NULL check for inline suspend decision |
| src/coreclr/nativeaot/Runtime/unix/PalUnix.cpp | Updates signal handler to use async-safe thread lookup and pass explicit doInlineSuspend parameter |
| src/coreclr/nativeaot/Runtime/windows/PalMinWin.cpp | Updates APC handlers to pass explicit doInlineSuspend parameter |
| src/coreclr/nativeaot/Runtime/CMakeLists.txt | Adds asyncsafethreadmap files to build for Unix (non-WASM) targets |
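The HijackCallback() rows above describe replacing a NULL check with an explicit doInlineSuspend parameter. A simplified sketch of why that matters is shown below; the `Thread` stub, the `HijackAction` enum, and both function names are hypothetical, and the real callback takes different arguments.

```cpp
#include <cassert>

struct Thread {};   // stand-in for the runtime's Thread type

enum class HijackAction { SuspendInline, Redirect };

// Old shape (sketch): the nullness of pThreadToHijack doubles as the decision,
// so a caller that starts passing a real pointer silently loses inline suspend.
HijackAction DecideImplicit(Thread* pThreadToHijack) {
    return pThreadToHijack == nullptr ? HijackAction::SuspendInline
                                      : HijackAction::Redirect;
}

// New shape (sketch): the decision is its own parameter, so the signal handler
// can pass the thread it looked up and still request inline suspension.
HijackAction DecideExplicit(Thread* pThreadToHijack, bool doInlineSuspend) {
    (void)pThreadToHijack;   // available to the callback, no longer a flag
    return doInlineSuspend ? HijackAction::SuspendInline
                           : HijackAction::Redirect;
}
```

With the implicit version, passing the looked-up thread from the Unix signal handler always selects the redirect path, which matches the hang described in the commit message.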
jkotas (Member) approved these changes on Feb 12, 2026
Backport of #122513 to release/10.0
Customer Impact
The current CheckActivationSafePoint uses thread local storage to
get the current Thread instance. But this function is called from
an async signal handler (the activation signal handler), and it is not
allowed to access TLS variables there: the access can allocate,
and if the interrupted code was itself in the middle of an allocation,
it could crash.
This has not been a problem since .NET 1.0, but a change in a
recent glibc version has broken it. We've got reports of crashes
in this code, e.g. on recent Ubuntu 25.04, due to this issue.
Regression
Testing
CI tests, local manual directed tests
Risk
Low. The change has been in main for the past two months and we haven't seen any issues related to it. The newly added code is executed very frequently.