Skip to content

JIT: Add a "default value analysis" for async hoisting#124514

Open
jakobbotsch wants to merge 23 commits intodotnet:mainfrom
jakobbotsch:fix-124512
Open

JIT: Add a "default value analysis" for async hoisting#124514
jakobbotsch wants to merge 23 commits intodotnet:mainfrom
jakobbotsch:fix-124512

Conversation

@jakobbotsch
Copy link
Member

@jakobbotsch jakobbotsch commented Feb 17, 2026

This analysis computes the set of locals that still have their default value and uses it to skip hoisting these locals into continuations around async suspension points.

Fix #124512

Example from #124512:

using System;
using System.Runtime.CompilerServices;
using System.Threading.Tasks;

public class Program
{
    public static async Task Main()
    {
        await FooAsync();
    }

    private static async Task FooAsync()
    {
        await Task.Yield();
        Inline();
    }

    [MethodImpl(MethodImplOptions.AggressiveInlining)]
    private static void Inline()
    {
        S s = new S();
        Console.WriteLine(s.A);
    }

    private struct S
    {
        public long A, B, C, D, E;
        public object F;

        [MethodImpl(MethodImplOptions.NoInlining)]
        public S()
        {
            A = B = C = D = E = 10;
            F = null;
        }
    }
}
...
@@ -137,15 +130,10 @@ G_M000_IG19:
        mov      rdx, 0xD1FFAB1E
        call     [CORINFO_HELP_ALLOC_CONTINUATION]
        mov      rbx, rax
-       lea      rcx, [reloc @RWD00]
-       mov      qword ptr [rbx+0x10], rcx
-       xor      ecx, ecx
-       mov      qword ptr [rbx+0x18], rcx
-       lea      rdi, bword ptr [rbx+0x28]
-       lea      rsi, bword ptr [rbp-0x58]
-       call     CORINFO_HELP_ASSIGN_BYREF
-       mov      ecx, 5
-       rep movsq 
+       lea      rax, [reloc @RWD00]
+       mov      qword ptr [rbx+0x10], rax
+       xor      eax, eax
+       mov      qword ptr [rbx+0x18], rax
        call     [System.Runtime.CompilerServices.AsyncHelpers:CaptureExecutionContext():System.Threading.ExecutionContext]
        lea      rcx, bword ptr [rbx+0x20]
        mov      rdx, rax
@@ -153,55 +141,43 @@ G_M000_IG19:
        cmp      gword ptr [rbp+0x10], 0
        setne    cl
        movzx    rcx, cl
-       mov      rdx, gword ptr [rbp-0x20]
-       mov      r8, gword ptr [rbp-0x28]
+       mov      rdx, gword ptr [rbp-0x18]
+       mov      r8, gword ptr [rbp-0x20]
        call     [System.Runtime.CompilerServices.AsyncHelpers:RestoreContextsOnSuspension(bool,System.Threading.ExecutionContext,System.Threading.SynchronizationContext)]
        mov      rcx, rbx
  
 G_M000_IG20:
-       vzeroupper 
-       add      rsp, 104
+       add      rsp, 96
        pop      rbx
        pop      rsi
-       pop      rdi
        pop      rbp
        ret      
  
 G_M000_IG21:
-       mov      rdx, gword ptr [rbp+0x10]
-       mov      rcx, gword ptr [rdx+0x20]
+       mov      rcx, gword ptr [rbp+0x10]
+       mov      rcx, gword ptr [rcx+0x20]
        call     [System.Runtime.CompilerServices.AsyncHelpers:RestoreExecutionContext(System.Threading.ExecutionContext)]
- 
-G_M000_IG22:
-       mov      rdx, gword ptr [rbp+0x10]
-       vmovdqu  ymm0, ymmword ptr [rdx+0x28]
-       vmovdqu  ymmword ptr [rbp-0x58], ymm0
-       vmovdqu  xmm0, xmmword ptr [rdx+0x48]
-       vmovdqu  xmmword ptr [rbp-0x38], xmm0
- 
-G_M000_IG23:
        jmp      G_M000_IG05
  
-G_M000_IG24:
+G_M000_IG22:
        sub      rsp, 40
  
-G_M000_IG25:
+G_M000_IG23:
        cmp      gword ptr [rbp+0x10], 0
        setne    cl
        movzx    rcx, cl
-       mov      rdx, gword ptr [rbp-0x20]
-       mov      r8, gword ptr [rbp-0x28]
+       mov      rdx, gword ptr [rbp-0x18]
+       mov      r8, gword ptr [rbp-0x20]
        call     [System.Runtime.CompilerServices.AsyncHelpers:RestoreContexts(bool,System.Threading.ExecutionContext,System.Threading.SynchronizationContext)]
        nop      
  
-G_M000_IG26:
-       vzeroupper 
+G_M000_IG24:
        add      rsp, 40
        ret      
  
 RWD00  	dq	(dynamicClass):IL_STUB_AsyncResume(System.Object,byref):System.Object
 	dq	G_M000_IG19
 
-; Total bytes of code 544
+; Total bytes of code 471

Copilot AI review requested due to automatic review settings February 17, 2026 14:31
@github-actions github-actions bot added the area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI label Feb 17, 2026
@dotnet-policy-service
Copy link
Contributor

Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch
See info in area-owners.md if you want to be subscribed.

Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR adds a new "default value analysis" for async hoisting to address issue #124512, where the JIT incorrectly hoists default-valued locals at async suspension points. The issue occurs when the JIT suppresses explicit zeroing of locals (expecting them to be zeroed by the prolog), but then hoists these uninitialized locals into async continuations when their address is taken.

Changes:

  • Introduces a two-phase DefaultValueAnalysis class that tracks which locals retain their default (zero) values through program execution
  • Integrates this analysis with the existing AsyncLiveness class to skip hoisting locals that are live but still have default values at suspension points
  • Handles promoted struct fields correctly by checking if all fields have default values

Comment on lines 580 to 581
// - It has a LCL_ADDR use with GTF_VAR_DEF (address taken for a
// definition whose result we cannot reason about).
Copy link

Copilot AI Feb 17, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comment mentions "LCL_ADDR use with GTF_VAR_DEF" but the implementation at lines 616-622 treats ALL GT_LCL_ADDR nodes as mutations without checking the GTF_VAR_DEF flag. This is actually correct behavior (conservative approach: we cannot reason about how an address will be used), but the comment is misleading. Consider either updating the comment to remove the "with GTF_VAR_DEF" qualification, or add a note explaining why all LCL_ADDR uses are treated as potential mutations.

Suggested change
// - It has a LCL_ADDR use with GTF_VAR_DEF (address taken for a
// definition whose result we cannot reason about).
// - It has a LCL_ADDR use (address taken; since we cannot reason about how
// the address will be used, we conservatively treat this as a mutation).

Copilot uses AI. Check for mistakes.
Copilot AI review requested due to automatic review settings February 18, 2026 15:25
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copilot encountered an error and was unable to review this pull request. You can try again by re-requesting a review.

Copilot AI review requested due to automatic review settings February 19, 2026 13:44
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.

Comments suppressed due to low confidence (1)

src/coreclr/jit/async.cpp:597

  • There's an extra line separator here (line 597). Following the pattern seen elsewhere in the file, function header comments should start with a single line separator, not a double one.
//------------------------------------------------------------------------

Comment on lines 450 to 452
// 2. Inter-block: forward dataflow to propagate default value information
// across blocks. At merge points the sets are intersected (a local has
// default value only if it has default value on all incoming paths).
Copy link

Copilot AI Feb 19, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comment here is misleading. It says "At merge points the sets are intersected" but the implementation actually uses union (line 497). This is actually correct behavior, but the comment is confusing.

The analysis tracks which locals are mutated (not which have default values). A local is mutated if it's mutated on ANY path (union), not ALL paths (intersection). The comment should clarify that the analysis computes "mutated vars" using union, not "default value" using intersection, even though the end result is equivalent (a var has default value iff it's NOT in the mutated set).

Suggested change
// 2. Inter-block: forward dataflow to propagate default value information
// across blocks. At merge points the sets are intersected (a local has
// default value only if it has default value on all incoming paths).
// 2. Inter-block: forward dataflow to propagate mutated-variable
// information across blocks. At merge points the sets of mutated
// locals are unioned (a local is considered mutated if it is mutated
// on any incoming path). Locals that still have their default value
// at a point are those not in the mutated set there.

Copilot uses AI. Check for mistakes.
Comment on lines 425 to 593
//------------------------------------------------------------------------
// IsDefaultValue:
// Check if a node represents a default (zero) value.
//
// Parameters:
// node - The node to check.
//
// Returns:
// True if the node is a constant zero value (integral, floating-point, or
// vector).
//
static bool IsDefaultValue(GenTree* node)
{
return node->IsIntegralConst(0) || node->IsFloatPositiveZero() || node->IsVectorZero();
}

//------------------------------------------------------------------------
// DefaultValueAnalysis:
// Computes which tracked locals have their default (zero) value at each
// basic block entry. A tracked local that still has its default value at a
// suspension point does not need to be hoisted into the continuation.
//
// The analysis has two phases:
// 1. Per-block: compute which tracked locals are mutated (assigned a
// non-default value or have their address taken) in each block.
// 2. Inter-block: forward dataflow to propagate default value information
// across blocks. At merge points the sets are intersected (a local has
// default value only if it has default value on all incoming paths).
//
class DefaultValueAnalysis
{
Compiler* m_compiler;
VARSET_TP* m_mutatedVars; // Per-block set of locals mutated to non-default.
VARSET_TP* m_mutatedVarsIn; // Per-block set of locals mutated to non-default on entry.
VARSET_TP m_mutatedAtEntry; // Locals that are mutated at method entry (params, OSR locals).

// DataFlow::ForwardAnalysis callback used in Phase 2.
class DataFlowCallback
{
DefaultValueAnalysis& m_analysis;
Compiler* m_compiler;
bool m_isFirstPred;
VARSET_TP m_preMergeIn;

public:
DataFlowCallback(DefaultValueAnalysis& analysis, Compiler* compiler)
: m_analysis(analysis)
, m_compiler(compiler)
, m_isFirstPred(true)
, m_preMergeIn(VarSetOps::MakeEmpty(compiler))
{
}

void StartMerge(BasicBlock* block)
{
// Save the current in set for change detection later.
VarSetOps::Assign(m_compiler, m_preMergeIn, m_analysis.m_mutatedVarsIn[block->bbNum]);

// Optimistically assume no locals have been mutated; Merge will
// grow via union.
VarSetOps::ClearD(m_compiler, m_analysis.m_mutatedVarsIn[block->bbNum]);
m_isFirstPred = true;
}

void Merge(BasicBlock* block, BasicBlock* predBlock, unsigned dupCount)
{
// The out set of a predecessor is its in set plus the locals
// mutated in that block: mutatedOut = mutatedIn | mutated.
VARSET_TP predOut(VarSetOps::MakeCopy(m_compiler, m_analysis.m_mutatedVarsIn[predBlock->bbNum]));
VarSetOps::UnionD(m_compiler, predOut, m_analysis.m_mutatedVars[predBlock->bbNum]);

// Union: a local is mutated if it is mutated on any incoming path.
VarSetOps::UnionD(m_compiler, m_analysis.m_mutatedVarsIn[block->bbNum], predOut);
m_isFirstPred = false;
}

void MergeHandler(BasicBlock* block, BasicBlock* firstTryBlock, BasicBlock* lastTryBlock)
{
// A handler can be reached from any point in the try region.
// A local is mutated at handler entry if it was mutated at try
// entry or mutated anywhere within the try region.
VARSET_TP tryMutated(VarSetOps::MakeCopy(m_compiler, m_analysis.m_mutatedVarsIn[firstTryBlock->bbNum]));

for (BasicBlock* tryBlock = firstTryBlock; tryBlock != lastTryBlock->Next(); tryBlock = tryBlock->Next())
{
VarSetOps::UnionD(m_compiler, tryMutated, m_analysis.m_mutatedVars[tryBlock->bbNum]);
}

VarSetOps::UnionD(m_compiler, m_analysis.m_mutatedVarsIn[block->bbNum], tryMutated);
m_isFirstPred = false;
}

bool EndMerge(BasicBlock* block)
{
if (m_isFirstPred)
{
// No predecessors (entry block or unreachable). Parameters
// and OSR locals are considered mutated at method entry.
VarSetOps::Assign(m_compiler, m_analysis.m_mutatedVarsIn[block->bbNum], m_analysis.m_mutatedAtEntry);
}

return !VarSetOps::Equal(m_compiler, m_preMergeIn, m_analysis.m_mutatedVarsIn[block->bbNum]);
}
};

public:
DefaultValueAnalysis(Compiler* compiler)
: m_compiler(compiler)
, m_mutatedVars(nullptr)
, m_mutatedVarsIn(nullptr)
, m_mutatedAtEntry(VarSetOps::MakeEmpty(compiler))
{
}

void Run();
const VARSET_TP& GetMutatedVarsIn(BasicBlock* block) const;

private:
void ComputePerBlockMutatedVars();
void ComputeInterBlockDefaultValues();

INDEBUG(void DumpMutatedVars());
INDEBUG(void DumpMutatedVarsIn());
};

//------------------------------------------------------------------------
// DefaultValueAnalysis::Run:
// Run the default value analysis: compute per-block mutation sets, then
// propagate default value information forward through the flow graph.
//
void DefaultValueAnalysis::Run()
{
#ifdef DEBUG
static ConfigMethodRange s_range;
s_range.EnsureInit(JitConfig.JitAsyncDefaultValueAnalysisRange());

if (!s_range.Contains(m_compiler->info.compMethodHash()))
{
JITDUMP("Default value analysis disabled because of method range\n");
m_mutatedVarsIn = m_compiler->fgAllocateTypeForEachBlk<VARSET_TP>(CMK_Async);
for (BasicBlock* block : m_compiler->Blocks())
{
VarSetOps::AssignNoCopy(m_compiler, m_mutatedVarsIn[block->bbNum], VarSetOps::MakeFull(m_compiler));
}

return;
}

#endif
ComputePerBlockMutatedVars();
ComputeInterBlockDefaultValues();
}

//------------------------------------------------------------------------
// DefaultValueAnalysis::GetMutatedVarsIn:
// Get the set of tracked locals that have been mutated to a non-default
// value on entry to the specified block.
//
// Parameters:
// block - The basic block.
//
// Returns:
// The VARSET_TP of tracked locals mutated on entry. A local NOT in this
// set is guaranteed to have its default value.
//
const VARSET_TP& DefaultValueAnalysis::GetMutatedVarsIn(BasicBlock* block) const
{
assert(m_mutatedVarsIn != nullptr);
return m_mutatedVarsIn[block->bbNum];
Copy link

Copilot AI Feb 19, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This PR adds significant new JIT functionality (default value analysis for async methods) but does not include any tests. While this is described as an "experiment" and is gated behind a DEBUG-only config option (JitAsyncDefaultValueAnalysisRange), it would be valuable to add at least one basic test case that exercises this code path to verify the optimization works correctly. Consider adding a test similar to the repro from issue #124512 that validates the struct is not unnecessarily hoisted when it has its default value at the suspension point.

Copilot uses AI. Check for mistakes.
Copilot AI review requested due to automatic review settings February 19, 2026 14:03
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 2 out of 2 changed files in this pull request and generated 4 comments.

Copilot AI review requested due to automatic review settings February 19, 2026 14:22
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.

Comments suppressed due to low confidence (1)

src/coreclr/jit/async.cpp:583

  • There are duplicate comment separator lines (------------------------------------------------------------------------). The second separator on line 583 should be removed as it's redundant.
//------------------------------------------------------------------------
//------------------------------------------------------------------------

@jakobbotsch
Copy link
Member Author

cc @dotnet/jit-contrib PTAL @AndyAyersMS

Diffs. Throughput wise pretty much a wash.

The analysis here was written almost completely autonomously by Copilot with Opus 4.6. I made a few changes:

  • I added handling for promoted locals
  • The bit vector handling was pretty wasteful, so I optimized it a bit
  • I added correct handling for locals not initialized in prolog

Still, I was quite impressed by the result.

Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.

@jakobbotsch
Copy link
Member Author

/azp run runtime-nativeaot-outerloop

@azure-pipelines
Copy link

Azure Pipelines successfully started running 1 pipeline(s).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI

Projects

None yet

Development

Successfully merging this pull request may close these issues.

JIT: Investigate interaction between suppressions of explicit inits and async hoisting

1 participant

Comments