Optimize runtime async suspend/resume machinery #127336

Merged
jakobbotsch merged 8 commits into dotnet:main from jakobbotsch:unsafify-runtime-async
Apr 25, 2026

@jakobbotsch jakobbotsch commented Apr 23, 2026

Several optimizations around suspension/resumption:

  • Reduce the number of TLS accesses by storing `Thread.CurrentThread` inside `RuntimeAsyncAwaitState` and accessing only `RuntimeAsyncAwaitState`.
  • Remove a number of write barriers by moving the TLS object fields into a `ref struct`. This ref struct is allocated on the stack in the two places that initiate runtime async chains (task-returning thunks and `DispatchContinuations`), and the TLS keeps a pointer to it.
  • Use `Unsafe` in a couple of places to avoid unnecessary cast checks on the hot path.

For a suspension-heavy benchmark this improves performance by around 17%.
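
As a rough illustration of the first bullet, the sketch below reads a thread-static struct once into a `ref` local and caches `Thread.CurrentThread` in it, so later code on the same path goes through the ref rather than repeating TLS lookups. All names here (`AwaitStateSketch`, `TlsCachingSketch`, `Suspend`) are hypothetical and are not the runtime's actual types; this is a minimal sketch of the general idea, not the change made in this PR.

```csharp
using System;
using System.Threading;

// Hypothetical stand-in for a per-thread await state; not the runtime's actual type.
internal struct AwaitStateSketch
{
    public Thread CurrentThread; // cached copy of Thread.CurrentThread
    public int Suspensions;      // illustrative counter standing in for real bookkeeping
}

internal static class TlsCachingSketch
{
    [ThreadStatic]
    private static AwaitStateSketch t_state;

    public static int Suspend(int times)
    {
        // One thread-static lookup; everything below goes through this ref.
        ref AwaitStateSketch state = ref t_state;

        // Fill the cache on first use so later code reads the field
        // instead of calling Thread.CurrentThread again.
        state.CurrentThread ??= Thread.CurrentThread;

        for (int i = 0; i < times; i++)
        {
            state.Suspensions++;
        }

        return state.Suspensions;
    }

    // Checks that the cached thread is the thread running this call.
    public static bool CachedThreadIsCurrent() =>
        ReferenceEquals(t_state.CurrentThread, Thread.CurrentThread);
}

internal static class TlsCachingDemo
{
    private static void Main()
    {
        Console.WriteLine(TlsCachingSketch.Suspend(3));
        Console.WriteLine(TlsCachingSketch.CachedThreadIsCurrent());
    }
}
```

Whether this pattern pays off in a given program depends on how well the JIT already optimizes repeated thread-static accesses, so it is worth measuring before adopting it.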

Example benchmark
```csharp
using System;
using System.Diagnostics;
using System.Runtime.CompilerServices;
using System.Threading;
using System.Threading.Tasks;

namespace OSRPerf;

public class Program
{
    static void Main()
    {
        NullAwaiter na = new NullAwaiter();

        // Warm-up phase: many short runs with pauses in between.
        for (int i = 0; i < 10; i++)
        {
            for (int j = 0; j < 500; j++)
            {
                Task t = Foo(20, na);
                // Drive the async method manually: each Continue() resumes it once.
                while (!t.IsCompleted)
                {
                    na.Continue();
                }
            }

            Thread.Sleep(100);
        }

        // Measured phase: 50 runs of 10,000,000 suspensions each; Foo prints the elapsed time.
        for (int i = 0; i < 50; i++)
        {
            Task t = Foo(10_000_000, na);
            while (!t.IsCompleted)
            {
                na.Continue();
            }
        }
    }

    static int s_value;
    static async Task Foo(int n, NullAwaiter na)
    {
        for (int i = 0; i < n; i++)
        {
            s_value += i;
        }

        // Only the suspend/resume loop is timed.
        Stopwatch timer = Stopwatch.StartNew();
        for (int i = 0; i < n; i++)
        {
            await na;
        }
        if (n > 1000)
            Console.WriteLine("Took {0:F1} ms", timer.Elapsed.TotalMilliseconds);
    }

    // Awaitable that never completes synchronously, so every await suspends.
    // It stores the continuation so that Main can resume it explicitly.
    private class NullAwaiter : ICriticalNotifyCompletion
    {
        public Action Continue;

        public NullAwaiter GetAwaiter() => this;

        public bool IsCompleted => false;

        public void GetResult()
        {
        }

        public void UnsafeOnCompleted(Action continuation)
        {
            Continue = continuation;
        }

        public void OnCompleted(Action continuation)
        {
            throw new NotImplementedException();
        }
    }
}
```

Before: Took 350.3 ms
After: Took 291.3 ms

@dotnet-policy-service
Contributor

Tagging subscribers to this area: @agocke
See info in area-owners.md if you want to be subscribed.

Several optimizations around suspension/resumption:
- Reduce number of TLS accesses by storing `Thread.CurrentThread` and
  `&AsyncDispatcherInfo.t_current` inside `RuntimeAsyncAwaitState`, and
  only accessing `RuntimeAsyncAwaitState`
- Remove a number of write barriers by moving TLS object fields into a
  `ref struct`. Allocate this ref struct on the stack in the two places
  that initiate runtime async chains: task-returning thunks and
  `DispatchContinuations`. Keep a pointer to this in the TLS.
- Use `Unsafe` in a couple of places to avoid unnecessary cast checks on
  the hot path

For a suspension-heavy benchmark this improves performance by around
25%.

Copilot AI left a comment

Pull request overview

This PR optimizes CoreCLR “runtime async” suspension/resumption by reducing TLS traffic, minimizing write barriers on hot paths, and consolidating context-handling work into new helpers used by the JIT’s async transformation.

Changes:

  • Refactors runtime-async state to cache TLS-derived values and to move notifier/context references into a stack-allocated “stack state” accessed via the thread-static await state.
  • Extends CORINFO_ASYNC_INFO and JIT/EE plumbing with new “finish suspension” helper method handles, and updates the JIT async transform to use them.
  • Updates task-returning thunk emission (VM + ILCompiler stubs) and related tooling (SuperPMI, R2R/AOT scanners) to match the new runtime-async APIs.

Reviewed changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated 3 comments.

| File | Description |
| --- | --- |
| `src/libraries/System.Private.CoreLib/src/System/Threading/ExecutionContext.cs` | Exposes InstanceIsFlowSuppressed for optimized flow-suppression checks. |
| `src/libraries/System.Private.CoreLib/src/System/Runtime/CompilerServices/AsyncHelpers.cs` | Updates leaf await helpers to use stack-backed runtime-async state. |
| `src/coreclr/System.Private.CoreLib/src/System/Runtime/CompilerServices/AsyncHelpers.CoreCLR.cs` | Introduces stack-based runtime-async state, new finish-suspension helpers, and updates dispatch/suspension logic. |
| `src/coreclr/inc/corinfo.h` | Extends CORINFO_ASYNC_INFO with new helper method handles. |
| `src/coreclr/vm/jitinterface.cpp` | Populates new async helper handles in CEEInfo::getAsyncInfo. |
| `src/coreclr/vm/metasig.h` | Adds metasigs for updated thunk finalization signatures. |
| `src/coreclr/vm/corelib.h` | Updates CoreLib binder entries for new helpers, thunk signatures, and runtime-async nested types/field. |
| `src/coreclr/vm/asyncthunks.cpp` | Updates task-returning thunk IL emission to push/pop the new await state and pass it to finalizers. |
| `src/coreclr/jit/async.h` | Declares new JIT helper routines to finish suspension context handling. |
| `src/coreclr/jit/async.cpp` | Reworks suspension context handling to use new finish-suspension helpers and adjusts capture/restore sequence. |
| `src/coreclr/tools/superpmi/superpmi-shared/agnostic.h` | Extends SuperPMI agnostic async-info struct for new handles. |
| `src/coreclr/tools/superpmi/superpmi-shared/methodcontext.cpp` | Records/replays new async-info handles. |
| `src/coreclr/tools/Common/JitInterface/CorInfoTypes.cs` | Extends managed projection of CORINFO_ASYNC_INFO. |
| `src/coreclr/tools/Common/JitInterface/CorInfoImpl.cs` | Emits new helper handles for the managed JIT interface implementation. |
| `src/coreclr/tools/aot/ILCompiler.ReadyToRun/Compiler/ReadyToRunCodegenCompilation.cs` | Adds R2R references to new finish-suspension helpers. |
| `src/coreclr/tools/aot/ILCompiler.Compiler/IL/ILImporter.Scanner.cs` | Ensures AOT scanning adds dependencies on new finish-suspension helpers. |
| `src/coreclr/tools/Common/TypeSystem/IL/Stubs/AsyncThunks.cs` | Updates IL stub emission for task-returning thunks to use new await-state push/pop + finalizer signatures. |

Copilot AI review requested due to automatic review settings April 23, 2026 18:51

Copilot AI left a comment

Pull request overview

This PR optimizes CoreCLR “runtime async” suspend/resume paths by reducing TLS accesses, lowering GC write barrier traffic, and avoiding some hot-path cast checks. It does so by introducing a stack-allocated state container that’s referenced via a per-thread TLS struct and by updating the thunk emitters to pass the TLS state byref.

Changes:

  • Introduces a stack-allocated RuntimeAsyncStackState and threads it through runtime-async chains via RuntimeAsyncAwaitState.Push/Pop.
  • Updates runtime-emitted task-returning thunks (VM + IL emitter) to initialize/teardown the new TLS stack state and pass ref RuntimeAsyncAwaitState into finalize helpers.
  • Switches a few hot-path casts to Unsafe.As and refactors await helpers to write notifier/context data into the stack state.
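
The difference between a regular cast and `Unsafe.As` mentioned in the last bullet can be sketched as follows. The types below (`ContinuationSketch`, `ConstantContinuation`, `CastSketch`) are hypothetical and exist only for this illustration; they are not the runtime's continuation classes. `Unsafe.As<T>(object)` is the real API from `System.Runtime.CompilerServices`.

```csharp
using System;
using System.Runtime.CompilerServices;

// Hypothetical types for illustration; not the runtime's continuation classes.
internal abstract class ContinuationSketch
{
    public abstract int Resume();
}

internal sealed class ConstantContinuation : ContinuationSketch
{
    private readonly int _value;

    public ConstantContinuation(int value) => _value = value;

    public override int Resume() => _value;
}

internal static class CastSketch
{
    // Regular cast: the runtime verifies the type and throws
    // InvalidCastException if the object is not a ContinuationSketch.
    public static int ResumeChecked(object o) => ((ContinuationSketch)o).Resume();

    // Unsafe.As: the reference is reinterpreted without a type check.
    // Only valid when the caller already guarantees the object's type;
    // passing any other type is undefined behavior.
    public static int ResumeUnchecked(object o) => Unsafe.As<ContinuationSketch>(o).Resume();
}

internal static class CastDemo
{
    private static void Main()
    {
        object o = new ConstantContinuation(42);
        Console.WriteLine(CastSketch.ResumeChecked(o));
        Console.WriteLine(CastSketch.ResumeUnchecked(o));
    }
}
```

Both calls return the same value here because the object really is a `ContinuationSketch`. The unchecked form trades the safety net for a cheaper hot path, which is only reasonable when the type is guaranteed by construction.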

Reviewed changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated 4 comments.

| File | Description |
| --- | --- |
| `src/libraries/System.Private.CoreLib/src/System/Runtime/CompilerServices/AsyncHelpers.cs` | Updates public await helper intrinsics to write notifier into stack state via TLS. |
| `src/coreclr/System.Private.CoreLib/src/System/Runtime/CompilerServices/AsyncHelpers.CoreCLR.cs` | Replaces ExecutionAndSyncBlockStore with stack state + TLS Push/Pop, updates dispatch/finalize/handle-suspend flow. |
| `src/coreclr/vm/metasig.h` | Adds metasigs for finalize helpers that now take ref RuntimeAsyncAwaitState. |
| `src/coreclr/vm/corelib.h` | Updates CoreLib binder entries for new TLS field, nested types, and finalize helper signatures. |
| `src/coreclr/vm/asyncthunks.cpp` | Updates IL stub emission to Push/Pop TLS stack state and pass ref state into finalize helpers. |
| `src/coreclr/tools/Common/TypeSystem/IL/Stubs/AsyncThunks.cs` | Mirrors VM thunk emission changes in the managed IL emitter. |

Copilot AI review requested due to automatic review settings April 24, 2026 08:21

Copilot AI left a comment

Pull request overview

This PR optimizes CoreCLR “runtime async” suspension/resumption by reducing TLS lookups and write barriers, primarily by introducing a stack-allocated async state block that’s referenced from TLS during async-chain execution.

Changes:

  • Rework async await state handling to route notifier/context storage through a stack-allocated RuntimeAsyncStackState linked via TLS RuntimeAsyncAwaitState.
  • Update CoreCLR async thunk IL emission (VM + managed emitter) to Push/Pop the new TLS state and to pass the TLS state byref into Finalize*ReturningThunk.
  • Adjust CoreLib binder/metasig definitions to match the new helper signatures and types.

Reviewed changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated 3 comments.

| File | Description |
| --- | --- |
| `src/libraries/System.Private.CoreLib/src/System/Runtime/CompilerServices/AsyncHelpers.cs` | Update leaf await helpers to write notifier data into stack state via t_runtimeAsyncAwaitState.StackState. |
| `src/coreclr/vm/metasig.h` | Add new metasigs for Finalize*ReturningThunk(ref RuntimeAsyncAwaitState) (Task/ValueTask, generic and non-generic). |
| `src/coreclr/vm/corelib.h` | Update binder definitions: remove ExecutionAndSyncBlockStore, add TLS field and nested async state types/methods. |
| `src/coreclr/vm/asyncthunks.cpp` | Update VM-emitted task-returning thunk IL to initialize and push/pop the new runtime async state and pass it to finalizers. |
| `src/coreclr/tools/Common/TypeSystem/IL/Stubs/AsyncThunks.cs` | Mirror VM thunk emission updates in the managed IL emitter (push/pop + updated finalizer signatures). |
| `src/coreclr/System.Private.CoreLib/src/System/Runtime/CompilerServices/AsyncHelpers.CoreCLR.cs` | Implement new stack/TLS async state structs, update dispatch/suspension logic, and adjust hot-path casts using Unsafe. |

@jakobbotsch jakobbotsch marked this pull request as ready for review April 24, 2026 15:17
@jakobbotsch
Member Author

PTAL @VSadov

@jakobbotsch jakobbotsch requested a review from VSadov April 24, 2026 15:17

@VSadov VSadov left a comment

LGTM. Thanks!

Copilot AI review requested due to automatic review settings April 25, 2026 13:02

Copilot AI left a comment

Pull request overview

This PR optimizes CoreCLR’s runtime-async suspend/resume path by reducing TLS traffic and GC write barriers, primarily by moving per-suspension state into a stack-allocated ref struct and caching Thread.CurrentThread in the TLS state.

Changes:

  • Introduce stack-allocated runtime-async state (RuntimeAsyncStackState) and keep only a pointer to it in TLS (RuntimeAsyncAwaitState).
  • Update task-returning thunk emission (VM + managed typesystem) to Push/Pop runtime-async state and pass ref RuntimeAsyncAwaitState into finalization helpers.
  • Update CoreLib binder signatures (corelib.h / metasig.h) to match the new helper method signatures and new nested types/fields.

Reviewed changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated 6 comments.

| File | Description |
| --- | --- |
| `src/libraries/System.Private.CoreLib/src/System/Runtime/CompilerServices/AsyncHelpers.cs` | Switch await helpers to use stack-state via t_runtimeAsyncAwaitState.StackState and reduce TLS accesses. |
| `src/coreclr/System.Private.CoreLib/src/System/Runtime/CompilerServices/AsyncHelpers.CoreCLR.cs` | Add RuntimeAsyncStackState + new TLS layout; adjust suspension/dispatch/finalization paths; use Unsafe.As for hot casts. |
| `src/coreclr/vm/asyncthunks.cpp` | Update VM-emitted task-returning thunk IL to Push/Pop runtime-async state and call new finalize signatures. |
| `src/coreclr/tools/Common/TypeSystem/IL/Stubs/AsyncThunks.cs` | Mirror VM thunk emission changes in the managed typesystem IL stub emitter. |
| `src/coreclr/vm/metasig.h` | Add metasig variants for finalize helpers that take ref RuntimeAsyncAwaitState. |
| `src/coreclr/vm/corelib.h` | Bind new nested types/field and update method signatures used by the VM binder. |

Comment thread src/coreclr/vm/asyncthunks.cpp
Comment thread src/coreclr/tools/Common/TypeSystem/IL/Stubs/AsyncThunks.cs
@jakobbotsch jakobbotsch merged commit 7e17692 into dotnet:main Apr 25, 2026
159 checks passed
@jakobbotsch jakobbotsch deleted the unsafify-runtime-async branch April 25, 2026 19:45
@rcj1 rcj1 mentioned this pull request Apr 28, 2026