Skip to content

[Wasm RyuJIT] Wire up GC info encoding/decoding for Wasm#126932

Draft
kg wants to merge 2 commits intodotnet:mainfrom
kg:wasm-gcinfo-1
Draft

[Wasm RyuJIT] Wire up GC info encoding/decoding for Wasm#126932
kg wants to merge 2 commits intodotnet:mainfrom
kg:wasm-gcinfo-1

Conversation

@kg
Copy link
Copy Markdown
Member

@kg kg commented Apr 15, 2026

This set of changes is sufficient to R2R release S.P.CoreLib like before but with gc info encoding wired up and working. Still figuring out how to clean some of it up, feedback welcome.

@kg kg added the NO-MERGE The PR is not ready for merge yet (see discussion for detailed reasons) label Apr 15, 2026
Copilot AI review requested due to automatic review settings April 15, 2026 03:04
@kg kg added NO-REVIEW Experimental/testing PR, do NOT review it arch-wasm WebAssembly architecture labels Apr 15, 2026
@github-actions github-actions Bot added the area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI label Apr 15, 2026
@dotnet-policy-service
Copy link
Copy Markdown
Contributor

Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch
See info in area-owners.md if you want to be subscribed.

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR enables GC info encoding/decoding support for the CoreCLR WASM RyuJIT backend by turning on GCInfo generation, adding the necessary encoder/decoder sources to the WASM JIT build, and introducing a WASM-specific GC info encoding definition.

Changes:

  • Enable EMIT_GENERATE_GCINFO for WASM and wire up CodeGen::genCreateAndStoreGCInfo to use GcInfoEncoder.
  • Add/adjust conditional compilation to skip register-GC-tracking paths on WASM while still enabling GC info generation.
  • Update CMake wiring to link WASM standalone JITs against the appropriate gcinfo_* library and add a universal wasm gcinfo target.

Reviewed changes

Copilot reviewed 10 out of 10 changed files in this pull request and generated 2 comments.

Show a summary per file
File Description
src/coreclr/jit/targetwasm.h Enables GC info generation for WASM (but comment needs updating).
src/coreclr/jit/gcinfo.cpp Disables register pointer marking path for WASM.
src/coreclr/jit/gcencode.cpp Disables register-state-change recording for WASM.
src/coreclr/jit/emit.cpp Disables register-GC live update emission for WASM.
src/coreclr/jit/codegenwasm.cpp Implements GCInfo creation/emission via GcInfoEncoder for WASM.
src/coreclr/jit/codegenlinear.cpp Skips GC register update/debug validation paths for WASM.
src/coreclr/jit/codegencommon.cpp Skips register-based GC liveness updates/validation for WASM.
src/coreclr/jit/CMakeLists.txt Ensures standalone JIT links a gcinfo lib and builds gcencode/gcdecode for WASM JIT.
src/coreclr/inc/gcinfotypes.h Adds Wasm32GcInfoEncoding and sets it as the WASM target encoding.
src/coreclr/gcinfo/CMakeLists.txt Adds gcinfo_universal_wasm target (currently under the wrong build condition for the failing scenario).

Comment thread src/coreclr/gcinfo/CMakeLists.txt
Comment thread src/coreclr/jit/targetwasm.h Outdated
@kg
Copy link
Copy Markdown
Member Author

kg commented Apr 16, 2026

Latest problem: with

--singlemethodtypename:"System.AppContext"
--singlemethodname:"OnUnhandledException"
--singlemethodindex:2

We call genEmitUnwindDebugGCandEH multiple times on some methods, and the second call attempts to overwrite existing debug information and (rightly, as far as I can tell) fails. I can't figure out why we're doing this, maybe something to do with funclets?

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 11 out of 11 changed files in this pull request and generated 2 comments.

Comment thread src/coreclr/jit/gcencode.cpp
Comment thread src/coreclr/jit/CMakeLists.txt
@kg
Copy link
Copy Markdown
Member Author

kg commented Apr 17, 2026

Latest problem: with

--singlemethodtypename:"System.AppContext"
--singlemethodname:"OnUnhandledException"
--singlemethodindex:2

We call genEmitUnwindDebugGCandEH multiple times on some methods, and the second call attempts to overwrite existing debug information and (rightly, as far as I can tell) fails. I can't figure out why we're doing this, maybe something to do with funclets?

This was intentional behavior - we were failing the compilation of a method as R2R unsupported late enough in compilation that debug info had already been generated, then hit an assertion the second time we tried to generate debug info (on the retried compilation). I think the asserts are just wrong in this case.

Now failing with block(s) missing a bbEmitCookie in particular methods, i.e.:

--singlemethodtypename:"System.GC"
--singlemethodname:"RunFinalizers"
--singlemethodindex:1
*************** In genSetScopeInfo()
VarLocInfo count is 7
; Variable debug info: 7 live ranges, 4 vars for method System.GC:RunFinalizers():uint
(V00 SP) : From 00000000h to 0000000Ah, in $1
(V01 arg0) : From 0000006Ah to 00000152h, in '[0] (1 slot)
(V01 arg0) : From 00000178h to 000001ECh, in '[0] (1 slot)
(V02 loc1) : From 00000074h to 0000014Ah, in '[56] (1 slot)
(V02 loc1) : From 00000150h to 00000159h, in '[56] (1 slot)
(V02 loc1) : From 00000178h to 000001ECh, in '[56] (1 slot)
(V04 loc3) : From 000000DEh to 000000F2h, in $2
*************** In gcInfoBlockHdrSave()
Set code length to 493.
Set stack base register to NA.
Stack slot id for offset 52 (0x34) (frame) (untracked) = 0.
Z:\runtime\src\coreclr\jit\jiteh.cpp:1120
Assertion failed 'cookie != nullptr' in 'System.GC:RunFinalizers():uint' during 'Emit GC+EH tables' (IL size 81; hash 0x81ce67df; FullOpts)

Copilot AI review requested due to automatic review settings April 17, 2026 01:21
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 12 out of 12 changed files in this pull request and generated 3 comments.

Comment thread src/coreclr/jit/codegenlinear.cpp Outdated
@kg kg removed NO-REVIEW Experimental/testing PR, do NOT review it NO-MERGE The PR is not ready for merge yet (see discussion for detailed reasons) labels Apr 17, 2026
Copilot AI review requested due to automatic review settings April 17, 2026 01:51
@kg kg force-pushed the wasm-gcinfo-1 branch from 5ba8436 to 02ce8d6 Compare April 17, 2026 01:56
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 12 out of 12 changed files in this pull request and generated 2 comments.

Comment thread src/coreclr/jit/gcencode.cpp
Copilot AI review requested due to automatic review settings April 20, 2026 23:01
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 14 out of 14 changed files in this pull request and generated 3 comments.

Comments suppressed due to low confidence (1)

src/coreclr/jit/codegencommon.cpp:1081

  • This TARGET_WASM guard skips all handling of locals going live in registers. If GC-tracked locals can be assigned to Wasm locals (lvIsInReg()==true), they will bypass both the register path and the stack path here, so their liveness may never be reflected in GC info. Either ensure GC refs are never enregistered for Wasm, or add a Wasm-specific strategy to report/enforce where GC refs live at safe points (stack home or encoded register slots).
        if (varDsc->lvIsInReg())
        {
#ifndef TARGET_WASM
            // If this variable is going live in a register, it is no longer live on the stack,
            // unless it is an EH/"spill at single-def" var, which always remains live on the stack.
            if (!varDsc->IsAlwaysAliveInMemory())
            {

Comment thread src/coreclr/inc/gcinfotypes.h
Comment on lines +1053 to 1057
#endif // !TARGET_WASM
// Update the gcVarPtrSetCur if it is in memory.
if (isInMemory && (isGCRef || isByRef))
{
VarSetOps::RemoveElemD(this, codeGen->gcInfo.gcVarPtrSetCur, deadVarIndex);
Copy link

Copilot AI Apr 20, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On TARGET_WASM the register-handling block above is compiled out, but the logic below only updates gcVarPtrSetCur when isInMemory is true. Since the Wasm reg allocator can enregister TYP_REF/TYP_BYREF locals (see src/coreclr/jit/regalloc.cpp:434-446), isInMemory can be false for GC-tracked locals and they won't be removed from (or added to) gcVarPtrSetCur correctly, producing incorrect stack-root reporting in the encoded GC info. Consider either (1) preventing GC-tracked locals from being enregistered on Wasm, or (2) keeping a real stack home for GC refs and ensuring gcVarPtrSetCur reflects it, or (3) implementing register-slot tracking/encoding for Wasm instead of compiling it out.

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This sounds like it could be a real problem, thoughts @AndyAyersMS @SingleAccretion ? I think out of its proposed options, 3 doesn't make sense and 1 would make our performance terrible. I think the plan was that we'd spill any enregistered gc refs to memory before safe points, yeah? Which is roughly 2. Wondering if this indicates a change I need to make.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we should be enabling this liveness tracking (EMIT_GENERATE_GCINFO) infrastructure back in any case. It's useless - all the GC locals should be described by untracked (thus live for the entire method) slots anyhow.

As for the spilling and such, yes, we'll need to spill. For now you can disable enregistration of locals that have GC pointers to make it correct (not fully correct though, since SDSDu need to be handled too, but that can be added separately).

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I agree, we aren't going find a use for any of the emitter-driven info for GC tracking.

For an initial PR I think a good goal would be to just enable emission of the GC Info data to the host, without any GC slot reporting of any kind. This should be sufficient for conservative scanning and provide other information the runtime needs for various purposes.

It might also unearth some interesting questions that we'll need to come to terms with sooner or later.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, I'll switch emit_generate_gcinfo back to 0, thank you both!

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The problem is that with it set to 0, various parts of gcencode etc don't build. I don't think this is a configuration we actually support.

Comment on lines +1584 to +1592
// FIXME: This can erroneously trigger when a method fails compilation late and we
// successfully retry compilation after the failure.
// Debug.Assert(_debugVarInfos == null);

if (cVars == 0)
{
_debugVarInfos = null;
return;
}
Copy link

Copilot AI Apr 20, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Commenting out these asserts makes it harder to detect real contract violations (setVars/setBoundaries called more than once per compilation). Since CompileMethodInternal can request CompilationRetryRequested without running CompileMethodCleanup (see src/coreclr/tools/Common/JitInterface/CorInfoImpl.cs:322-433), a better fix is to explicitly reset per-attempt state (e.g., _debugVarInfos/_debugLocInfos) when a retry is requested or at the start of each compilation attempt, and keep the asserts enabled.

Copilot uses AI. Check for mistakes.
)

set( JIT_WASM_SOURCES
gcdecode.cpp
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The more proper way to add these files is by moving them from JIT_NATIVE_TARGET_SOURCES to JIT_SOURCES.

kg added 2 commits April 27, 2026 09:36
Checkpoint

Checkpoint

Comment out some overly sensitive asserts that can be hit when things are working as designed

Funclet start blocks need labels on Wasm for some reason.

Apply suggestion from @Copilot

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

Cleanup

jit-format

Compile the gc info decoder too, since we'll need it

Add comment
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

arch-wasm WebAssembly architecture area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants