JIT: Add a "default value analysis" for async hoisting#124514
JIT: Add a "default value analysis" for async hoisting#124514jakobbotsch wants to merge 23 commits intodotnet:mainfrom
Conversation
|
Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch |
There was a problem hiding this comment.
Pull request overview
This PR adds a new "default value analysis" for async hoisting to address issue #124512, where the JIT incorrectly hoists default-valued locals at async suspension points. The issue occurs when the JIT suppresses explicit zeroing of locals (expecting them to be zeroed by the prolog), but then hoists these uninitialized locals into async continuations when their address is taken.
Changes:
- Introduces a two-phase
DefaultValueAnalysisclass that tracks which locals retain their default (zero) values through program execution - Integrates this analysis with the existing
AsyncLivenessclass to skip hoisting locals that are live but still have default values at suspension points - Handles promoted struct fields correctly by checking if all fields have default values
src/coreclr/jit/async.cpp
Outdated
| // - It has a LCL_ADDR use with GTF_VAR_DEF (address taken for a | ||
| // definition whose result we cannot reason about). |
There was a problem hiding this comment.
The comment mentions "LCL_ADDR use with GTF_VAR_DEF" but the implementation at lines 616-622 treats ALL GT_LCL_ADDR nodes as mutations without checking the GTF_VAR_DEF flag. This is actually correct behavior (conservative approach: we cannot reason about how an address will be used), but the comment is misleading. Consider either updating the comment to remove the "with GTF_VAR_DEF" qualification, or add a note explaining why all LCL_ADDR uses are treated as potential mutations.
| // - It has a LCL_ADDR use with GTF_VAR_DEF (address taken for a | |
| // definition whose result we cannot reason about). | |
| // - It has a LCL_ADDR use (address taken; since we cannot reason about how | |
| // the address will be used, we conservatively treat this as a mutation). |
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.
Comments suppressed due to low confidence (1)
src/coreclr/jit/async.cpp:597
- There's an extra line separator here (line 597). Following the pattern seen elsewhere in the file, function header comments should start with a single line separator, not a double one.
//------------------------------------------------------------------------
src/coreclr/jit/async.cpp
Outdated
| // 2. Inter-block: forward dataflow to propagate default value information | ||
| // across blocks. At merge points the sets are intersected (a local has | ||
| // default value only if it has default value on all incoming paths). |
There was a problem hiding this comment.
The comment here is misleading. It says "At merge points the sets are intersected" but the implementation actually uses union (line 497). This is actually correct behavior, but the comment is confusing.
The analysis tracks which locals are mutated (not which have default values). A local is mutated if it's mutated on ANY path (union), not ALL paths (intersection). The comment should clarify that the analysis computes "mutated vars" using union, not "default value" using intersection, even though the end result is equivalent (a var has default value iff it's NOT in the mutated set).
| // 2. Inter-block: forward dataflow to propagate default value information | |
| // across blocks. At merge points the sets are intersected (a local has | |
| // default value only if it has default value on all incoming paths). | |
| // 2. Inter-block: forward dataflow to propagate mutated-variable | |
| // information across blocks. At merge points the sets of mutated | |
| // locals are unioned (a local is considered mutated if it is mutated | |
| // on any incoming path). Locals that still have their default value | |
| // at a point are those not in the mutated set there. |
src/coreclr/jit/async.cpp
Outdated
| //------------------------------------------------------------------------ | ||
| // IsDefaultValue: | ||
| // Check if a node represents a default (zero) value. | ||
| // | ||
| // Parameters: | ||
| // node - The node to check. | ||
| // | ||
| // Returns: | ||
| // True if the node is a constant zero value (integral, floating-point, or | ||
| // vector). | ||
| // | ||
| static bool IsDefaultValue(GenTree* node) | ||
| { | ||
| return node->IsIntegralConst(0) || node->IsFloatPositiveZero() || node->IsVectorZero(); | ||
| } | ||
|
|
||
| //------------------------------------------------------------------------ | ||
| // DefaultValueAnalysis: | ||
| // Computes which tracked locals have their default (zero) value at each | ||
| // basic block entry. A tracked local that still has its default value at a | ||
| // suspension point does not need to be hoisted into the continuation. | ||
| // | ||
| // The analysis has two phases: | ||
| // 1. Per-block: compute which tracked locals are mutated (assigned a | ||
| // non-default value or have their address taken) in each block. | ||
| // 2. Inter-block: forward dataflow to propagate default value information | ||
| // across blocks. At merge points the sets are intersected (a local has | ||
| // default value only if it has default value on all incoming paths). | ||
| // | ||
| class DefaultValueAnalysis | ||
| { | ||
| Compiler* m_compiler; | ||
| VARSET_TP* m_mutatedVars; // Per-block set of locals mutated to non-default. | ||
| VARSET_TP* m_mutatedVarsIn; // Per-block set of locals mutated to non-default on entry. | ||
| VARSET_TP m_mutatedAtEntry; // Locals that are mutated at method entry (params, OSR locals). | ||
|
|
||
| // DataFlow::ForwardAnalysis callback used in Phase 2. | ||
| class DataFlowCallback | ||
| { | ||
| DefaultValueAnalysis& m_analysis; | ||
| Compiler* m_compiler; | ||
| bool m_isFirstPred; | ||
| VARSET_TP m_preMergeIn; | ||
|
|
||
| public: | ||
| DataFlowCallback(DefaultValueAnalysis& analysis, Compiler* compiler) | ||
| : m_analysis(analysis) | ||
| , m_compiler(compiler) | ||
| , m_isFirstPred(true) | ||
| , m_preMergeIn(VarSetOps::MakeEmpty(compiler)) | ||
| { | ||
| } | ||
|
|
||
| void StartMerge(BasicBlock* block) | ||
| { | ||
| // Save the current in set for change detection later. | ||
| VarSetOps::Assign(m_compiler, m_preMergeIn, m_analysis.m_mutatedVarsIn[block->bbNum]); | ||
|
|
||
| // Optimistically assume no locals have been mutated; Merge will | ||
| // grow via union. | ||
| VarSetOps::ClearD(m_compiler, m_analysis.m_mutatedVarsIn[block->bbNum]); | ||
| m_isFirstPred = true; | ||
| } | ||
|
|
||
| void Merge(BasicBlock* block, BasicBlock* predBlock, unsigned dupCount) | ||
| { | ||
| // The out set of a predecessor is its in set plus the locals | ||
| // mutated in that block: mutatedOut = mutatedIn | mutated. | ||
| VARSET_TP predOut(VarSetOps::MakeCopy(m_compiler, m_analysis.m_mutatedVarsIn[predBlock->bbNum])); | ||
| VarSetOps::UnionD(m_compiler, predOut, m_analysis.m_mutatedVars[predBlock->bbNum]); | ||
|
|
||
| // Union: a local is mutated if it is mutated on any incoming path. | ||
| VarSetOps::UnionD(m_compiler, m_analysis.m_mutatedVarsIn[block->bbNum], predOut); | ||
| m_isFirstPred = false; | ||
| } | ||
|
|
||
| void MergeHandler(BasicBlock* block, BasicBlock* firstTryBlock, BasicBlock* lastTryBlock) | ||
| { | ||
| // A handler can be reached from any point in the try region. | ||
| // A local is mutated at handler entry if it was mutated at try | ||
| // entry or mutated anywhere within the try region. | ||
| VARSET_TP tryMutated(VarSetOps::MakeCopy(m_compiler, m_analysis.m_mutatedVarsIn[firstTryBlock->bbNum])); | ||
|
|
||
| for (BasicBlock* tryBlock = firstTryBlock; tryBlock != lastTryBlock->Next(); tryBlock = tryBlock->Next()) | ||
| { | ||
| VarSetOps::UnionD(m_compiler, tryMutated, m_analysis.m_mutatedVars[tryBlock->bbNum]); | ||
| } | ||
|
|
||
| VarSetOps::UnionD(m_compiler, m_analysis.m_mutatedVarsIn[block->bbNum], tryMutated); | ||
| m_isFirstPred = false; | ||
| } | ||
|
|
||
| bool EndMerge(BasicBlock* block) | ||
| { | ||
| if (m_isFirstPred) | ||
| { | ||
| // No predecessors (entry block or unreachable). Parameters | ||
| // and OSR locals are considered mutated at method entry. | ||
| VarSetOps::Assign(m_compiler, m_analysis.m_mutatedVarsIn[block->bbNum], m_analysis.m_mutatedAtEntry); | ||
| } | ||
|
|
||
| return !VarSetOps::Equal(m_compiler, m_preMergeIn, m_analysis.m_mutatedVarsIn[block->bbNum]); | ||
| } | ||
| }; | ||
|
|
||
| public: | ||
| DefaultValueAnalysis(Compiler* compiler) | ||
| : m_compiler(compiler) | ||
| , m_mutatedVars(nullptr) | ||
| , m_mutatedVarsIn(nullptr) | ||
| , m_mutatedAtEntry(VarSetOps::MakeEmpty(compiler)) | ||
| { | ||
| } | ||
|
|
||
| void Run(); | ||
| const VARSET_TP& GetMutatedVarsIn(BasicBlock* block) const; | ||
|
|
||
| private: | ||
| void ComputePerBlockMutatedVars(); | ||
| void ComputeInterBlockDefaultValues(); | ||
|
|
||
| INDEBUG(void DumpMutatedVars()); | ||
| INDEBUG(void DumpMutatedVarsIn()); | ||
| }; | ||
|
|
||
| //------------------------------------------------------------------------ | ||
| // DefaultValueAnalysis::Run: | ||
| // Run the default value analysis: compute per-block mutation sets, then | ||
| // propagate default value information forward through the flow graph. | ||
| // | ||
| void DefaultValueAnalysis::Run() | ||
| { | ||
| #ifdef DEBUG | ||
| static ConfigMethodRange s_range; | ||
| s_range.EnsureInit(JitConfig.JitAsyncDefaultValueAnalysisRange()); | ||
|
|
||
| if (!s_range.Contains(m_compiler->info.compMethodHash())) | ||
| { | ||
| JITDUMP("Default value analysis disabled because of method range\n"); | ||
| m_mutatedVarsIn = m_compiler->fgAllocateTypeForEachBlk<VARSET_TP>(CMK_Async); | ||
| for (BasicBlock* block : m_compiler->Blocks()) | ||
| { | ||
| VarSetOps::AssignNoCopy(m_compiler, m_mutatedVarsIn[block->bbNum], VarSetOps::MakeFull(m_compiler)); | ||
| } | ||
|
|
||
| return; | ||
| } | ||
|
|
||
| #endif | ||
| ComputePerBlockMutatedVars(); | ||
| ComputeInterBlockDefaultValues(); | ||
| } | ||
|
|
||
| //------------------------------------------------------------------------ | ||
| // DefaultValueAnalysis::GetMutatedVarsIn: | ||
| // Get the set of tracked locals that have been mutated to a non-default | ||
| // value on entry to the specified block. | ||
| // | ||
| // Parameters: | ||
| // block - The basic block. | ||
| // | ||
| // Returns: | ||
| // The VARSET_TP of tracked locals mutated on entry. A local NOT in this | ||
| // set is guaranteed to have its default value. | ||
| // | ||
| const VARSET_TP& DefaultValueAnalysis::GetMutatedVarsIn(BasicBlock* block) const | ||
| { | ||
| assert(m_mutatedVarsIn != nullptr); | ||
| return m_mutatedVarsIn[block->bbNum]; |
There was a problem hiding this comment.
This PR adds significant new JIT functionality (default value analysis for async methods) but does not include any tests. While this is described as an "experiment" and is gated behind a DEBUG-only config option (JitAsyncDefaultValueAnalysisRange), it would be valuable to add at least one basic test case that exercises this code path to verify the optimization works correctly. Consider adding a test similar to the repro from issue #124512 that validates the struct is not unnecessarily hoisted when it has its default value at the suspension point.
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.
Comments suppressed due to low confidence (1)
src/coreclr/jit/async.cpp:583
- There are duplicate comment separator lines (------------------------------------------------------------------------). The second separator on line 583 should be removed as it's redundant.
//------------------------------------------------------------------------
//------------------------------------------------------------------------
|
cc @dotnet/jit-contrib PTAL @AndyAyersMS Diffs. Throughput wise pretty much a wash. The analysis here was written almost completely autonomously by Copilot with Opus 4.6. I made a few changes:
Still, I was quite impressed by the result. |
|
/azp run runtime-nativeaot-outerloop |
|
Azure Pipelines successfully started running 1 pipeline(s). |
This analysis computes the set of locals that still have their default value and uses it to skip hoisting these locals into continuations around async suspension points.
Fix #124512
Example from #124512: