[browser/wasi][coreCLR] Dedicated WASM GC PAL — replace mmap with posix_memalign and optimize memory operations#127328
pavelsavara wants to merge 7 commits into dotnet:main
Tagging subscribers to this area: @JulieLeeMSFT, @dotnet/gc
Pull request overview
This PR separates WebAssembly-specific GC OS interface behavior from the shared Unix implementation by introducing a dedicated gcenv.wasm.cpp, and adjusts the PAL virtual memory implementation on WASM to avoid relying on Emscripten’s incomplete mmap/munmap support.
Changes:
- Added a dedicated WASM `GCToOSInterface` implementation (`gcenv.wasm.cpp`) and CMake wiring for building it.
- Routed WASM GC builds to the new `gc/wasm` directory and removed WASM-specific `#ifdef` paths from `gcenv.unix.cpp`.
- Updated PAL virtual memory reserve/release on WASM to use `posix_memalign`/`free` instead of `mmap`/`munmap`.
Reviewed changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| src/coreclr/pal/src/map/virtual.cpp | Switches WASM reserve/release behavior to posix_memalign/free and adjusts related error/cleanup paths. |
| src/coreclr/gc/wasm/gcenv.wasm.cpp | New WASM-specific GC OS interface implementation (virtual memory, CPU/NUMA stubs, memory stats). |
| src/coreclr/gc/wasm/CMakeLists.txt | Adds build definition for the WASM GC PAL object library. |
| src/coreclr/gc/unix/gcenv.unix.cpp | Removes WASM-specific branches and fixes nanosleep EINTR retry logic. |
| src/coreclr/gc/CMakeLists.txt | Routes WASM builds to gc/wasm instead of gc/unix. |
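The `nanosleep` fix listed for `gcenv.unix.cpp` depends on the POSIX return convention: an interrupted call returns `-1` and sets `errno` to `EINTR`. A minimal sketch of a retry loop that follows that convention is shown here. The helper name `sleep_full_interval` is hypothetical and is not the code in the PR.

```cpp
#include <errno.h>
#include <time.h>

// Hypothetical helper (not the PR's code): sleep for the whole interval,
// resuming after signal interruptions.
// Returns 0 on success, or -1 if nanosleep fails for a reason other than EINTR.
int sleep_full_interval(long milliseconds)
{
    struct timespec requested;
    requested.tv_sec = milliseconds / 1000;
    requested.tv_nsec = (milliseconds % 1000) * 1000000L;

    struct timespec remaining = {};
    // nanosleep reports interruption as (-1, errno == EINTR). It never
    // returns EINTR itself, so a loop written as
    // `while (nanosleep(...) == EINTR)` would never retry.
    while (nanosleep(&requested, &remaining) == -1)
    {
        if (errno != EINTR)
        {
            return -1;
        }
        requested = remaining; // continue with the time not yet slept
    }
    return 0;
}
```

Carrying `remaining` into the next call keeps the total sleep close to the requested interval even when several signals arrive.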
I processed your feedback and updated the PR description above.
    // On WASM, reserve == commit. If this range was previously decommitted,
    // sentinels were placed at each page boundary. Check the first byte and
    // zero the entire range if needed.
    #ifdef FEATURE_MULTITHREADING
        // Under MT, VirtualDecommit already zeroes the full range on decommit, and
        // reserve already zeroes on allocation - so commit is a no-op.
        (void)MemSize;
    #else
        if (MemSize && *(BYTE*)StartBoundary != 0)
        {
            ZeroMemory((LPVOID) StartBoundary, MemSize);
        }
    #endif
    #endif
On TARGET_WASM, VIRTUALCommitMemory zeroes the entire aligned range whenever the first byte is non-zero. This breaks VirtualAlloc semantics for recommitting already committed memory: PAL test filemapping_memmgt/VirtualAlloc/test20 explicitly writes a value, calls VirtualAlloc(ptr, ..., MEM_COMMIT, ...) again, and expects the contents to be unchanged. With the current check (*(BYTE*)StartBoundary != 0) this will incorrectly zero valid committed data. The commit path needs a way to distinguish ‘recommit after MEM_DECOMMIT’ from ‘commit on already committed pages’ (e.g., always do the zeroing in the MEM_DECOMMIT path and make commit a no-op, or track decommitted state out-of-band instead of inspecting user data bytes).
Suggested change:
    // On WASM, reserve == commit, so MEM_COMMIT must be a no-op here.
    // In particular, recommitting already committed memory must preserve its
    // contents; using user data (for example, the first byte) to infer prior
    // decommit state is incorrect and can spuriously zero valid committed data.
    // Any zeroing required to model MEM_DECOMMIT must be handled by the
    // decommit path or tracked out-of-band rather than during commit.
    (void)MemSize;
    #endif
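One of the options named in the comment above is to track decommitted state out-of-band. A rough sketch of that idea follows, assuming page-aligned ranges and single-threaded use. The `DecommitTracker` class and its methods are hypothetical names for illustration and do not appear in the PR or in the PAL.

```cpp
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <set>

// Hypothetical tracker (not part of the PR): records which pages were
// decommitted so that commit never has to inspect user data.
// Assumes `start` and `size` are multiples of the page size and that
// callers are single-threaded.
class DecommitTracker
{
public:
    explicit DecommitTracker(size_t pageSize) : m_pageSize(pageSize) {}

    // Models MEM_DECOMMIT: remember each page in [start, start + size).
    void Decommit(void* start, size_t size)
    {
        uintptr_t begin = reinterpret_cast<uintptr_t>(start);
        for (uintptr_t page = begin; page < begin + size; page += m_pageSize)
        {
            m_decommitted.insert(page);
        }
    }

    // Models MEM_COMMIT: zero only pages recorded as decommitted and
    // leave already committed pages untouched.
    void Commit(void* start, size_t size)
    {
        uintptr_t begin = reinterpret_cast<uintptr_t>(start);
        for (uintptr_t page = begin; page < begin + size; page += m_pageSize)
        {
            if (m_decommitted.erase(page) != 0)
            {
                memset(reinterpret_cast<void*>(page), 0, m_pageSize);
            }
        }
    }

private:
    size_t m_pageSize;
    std::set<uintptr_t> m_decommitted;
};
```

With this shape, a second `MEM_COMMIT` on live data is a no-op, which is the behavior the `test20` scenario expects. A real implementation would likely prefer a bitmap over `std::set` to avoid allocating inside the memory manager.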
        return E_OUTOFMEMORY;
    if (use_large_pages_p)
    {
    #ifndef HOST_64BIT
I think we want to merge #127290 first. It should make this change simpler.
    // here immediately - no sentinel trick needed, and no races.
    ZeroMemory((LPVOID) StartBoundary, MemSize);
    #else
    // We can't decommit the mapping (MAP_FIXED doesn't work in emscripten), and we can't
    #else
    static inline
    #endif
    int minipal_getpagesize(void)
Can we delete GetOsPageSize and use this everywhere instead (it can be a separate PR)?
Also, it may be nice to cache the value here.
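As an illustration of the caching idea, a function-local static would query the OS only once. This is a sketch under assumptions: the name `cached_getpagesize` is hypothetical, it is written as C++ where the minipal header is C-compatible, and the 16KB WASM value is taken from the PR description. `__wasm__` is the compiler-predefined macro; the repository's own `TARGET_WASM` define could serve the same purpose.

```cpp
#include <unistd.h>

// Hypothetical cached variant; the name is illustrative and not from the PR.
inline int cached_getpagesize()
{
#ifdef __wasm__
    // Per the PR description, WASM reports 16KB here rather than the
    // 64KB memory.grow granularity.
    static const int pageSize = 16 * 1024;
#else
    // Function-local static: sysconf runs once, on the first call.
    static const int pageSize = static_cast<int>(sysconf(_SC_PAGESIZE));
#endif
    return pageSize;
}
```

A C header would need a different mechanism (for example a file-scope variable), since C has no dynamically initialized function-local statics.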
Summary
This PR switches the WASM GC and PAL virtual-memory paths from `mmap`/`munmap` to `posix_memalign`/`free`. Emscripten's `mmap` implementation is fundamentally broken for the GC's needs: `munmap` cannot unmap partial allocations, `mmap(PROT_NONE)` still consumes linear memory, and `MAP_FIXED` doesn't work correctly. The change introduces a dedicated WASM GC OS interface (`gc/wasm/gcenv.cpp`) and updates the PAL layer to use the allocator-based approach.
Based on runtimelab's NativeAOT-LLVM `gcenv.wasm.cpp` (PR#1510, PR#3151).
Fixes #121036
Fixes #117813
Fixes #118943
Motivation
The CoreCLR GC on WASM targets (browser and WASI) previously shared the Unix `gcenv.unix.cpp` code path with `#ifdef TARGET_WASM` carve-outs. This was problematic because:
- `mmap`/`munmap` semantics don't work on WASM: Emscripten's `munmap` cannot release partial mappings, `mmap(PROT_NONE)` reserves real linear memory (no lazy commit), and `MAP_FIXED` is broken.
- `memory.grow` granularity is 64KB, but the GC works best with a 16KB page size for alignment and thresholds.
Changes
New files
- `src/coreclr/gc/wasm/gcenv.cpp`: Complete WASM-specific `GCToOSInterface` implementation using `posix_memalign`/`free`. Includes an `sbrk` optimization that avoids unnecessary `memset` zeroing on fresh `memory.grow` pages (safe because single-threaded WASM has no concurrent `sbrk` calls).
- `src/coreclr/gc/wasm/CMakeLists.txt`: Build file for the WASM GC PAL module.
- `src/native/minipal/wasm.h`: Cross-platform `minipal_getpagesize()` helper that returns 16KB on WASM instead of the 64KB `memory.grow` granularity.
Modified files
- `src/coreclr/gc/CMakeLists.txt`: Routes WASM builds to the new `wasm/` subdirectory instead of `unix/`.
- `src/coreclr/gc/init.cpp`: Enables the large-pages code path on WASM (to skip `VirtualDecommit`, since decommit is meaningless on WASM). Caps segment sizes so initial segments fit within the hard limit. Enables auto-detection of the hard limit from WASM linear memory max (treating it like a container memory limit).
- `src/coreclr/gc/interface.cpp`: Suppresses the 32-bit `assert(!use_large_pages_p)` for WASM.
- `src/coreclr/gc/unix/gcenv.unix.cpp`: Removes all `TARGET_WASM`/`__EMSCRIPTEN__` conditionals (WASM now has its own file). Also fixes a pre-existing bug: `nanosleep` returns `-1` on `EINTR`, not `EINTR` itself.
- `src/coreclr/pal/src/map/virtual.cpp`: Adds `TARGET_WASM` paths in `ReserveVirtualMemory` (using `posix_memalign`), `VIRTUALCommitMemory` (sentinel-based lazy zeroing), `VirtualFree` (decommit via sentinel or `free`), and `VIRTUALReserveMemory` (using `free` instead of `munmap` on error). Restructures the `MEM_RELEASE` path to check the `munmap` return before calling `VIRTUALReleaseMemory`.
- `src/coreclr/pal/src/misc/sysinfo.cpp`: Uses `minipal_getpagesize()` instead of `getpagesize()`.
Key design decisions
Sentinel-based lazy zeroing (single-threaded path)
Instead of eagerly zeroing memory on decommit, the single-threaded path writes a non-zero sentinel byte at each page boundary. On recommit, the first byte is checked — if non-zero, the range is zeroed. This avoids double-zeroing in the common case where memory is decommitted and never recommitted.
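The scheme described above can be sketched roughly as follows. This is an illustration and not the PR's code: the function names, the `kSentinel` value, and the `kPageSize` constant are assumptions chosen for the example.

```cpp
#include <stddef.h>
#include <string.h>

// Illustrative constants; the sentinel value actually used by the PR is
// not shown in this conversation.
constexpr size_t kPageSize = 16 * 1024;
constexpr unsigned char kSentinel = 0xDD;

// Decommit: mark each page boundary instead of zeroing the whole range.
// Assumes `start` is page-aligned.
void lazy_decommit(unsigned char* start, size_t size)
{
    for (size_t offset = 0; offset < size; offset += kPageSize)
    {
        start[offset] = kSentinel;
    }
}

// Recommit: zero the range only when its first byte is non-zero.
void lazy_commit(unsigned char* start, size_t size)
{
    if (size != 0 && start[0] != 0)
    {
        memset(start, 0, size);
    }
}
```

Marking every page boundary means a later commit that starts at any page inside the decommitted range still finds a non-zero first byte. The check cannot tell a sentinel from a non-zero byte written by the caller, which is the recommit concern raised in the review comment earlier in this conversation.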
Under multithreading (`FEATURE_MULTITHREADING`), the code falls back to unconditional zeroing on both decommit and commit to avoid races.
sbrk optimization
When `posix_memalign` returns memory at or above the previous `sbrk(0)` break, the allocation came from `memory.grow`, which guarantees zero-initialization per the WASM spec. Only recycled blocks below the old break need explicit zeroing. This is safe because WASM is single-threaded (no concurrent `sbrk` calls). The same approach is used by Mono's WASM mmap implementation.
use_large_pages_p = true on WASM
The GC's large-pages mode skips `VirtualDecommit` for heap segments, which is exactly what WASM needs since decommit cannot return memory. The hard-limit auto-detection (75% of `emscripten_get_heap_max()`) is preserved rather than being tightened to actual segment sizes, leaving room for bookkeeping allocations.
16KB page size
WASM `memory.grow` uses 64KB pages, but the GC's alignment and threshold calculations work better with a finer granularity. The 16KB page size (`minipal_getpagesize()`) is used for GC page alignment while the 64KB `WasmPageSize` constant is used only when converting `__builtin_wasm_memory_size` counts to bytes.
Code review notes
Correctness
- `nanosleep` fix in `gcenv.unix.cpp` (checking `== -1 && errno == EINTR` instead of `== EINTR`) is a pre-existing bug fix that affects all Unix platforms, not just WASM.
- `MEM_RELEASE` restructuring in `virtual.cpp` inverts the error-checking logic from `if (munmap == 0) { if (!Release) fail } else { fail }` to `if (munmap != 0) fail; if (!Release) fail`. This is a behavioral no-change but is clearer.
- `VirtualReset` on WASM returns `false`, forcing the GC to use the decommit+commit fallback path. The previous code returned `true` (pretending `madvise` worked), which silently did nothing.
Thread safety
`sbrk`-based optimizations are guarded by `#ifndef FEATURE_MULTITHREADING`.
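A minimal sketch of how such a guard might wrap the `sbrk` check, under stated assumptions: `reserve_zeroed` is a hypothetical name, `__wasm__` is the compiler-predefined macro used here in place of the repository's own target define, and the zero-fill guarantee for memory above the old break is taken from the PR description. On every other configuration the sketch always zeroes.

```cpp
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

// Hypothetical helper (not the PR's code): returns a zero-filled block
// aligned to `alignment`, or nullptr on failure. Release it with free().
void* reserve_zeroed(size_t alignment, size_t size)
{
#if defined(__wasm__) && !defined(FEATURE_MULTITHREADING)
    // Record the break before allocating. With a single thread, nothing
    // else can move it between this call and posix_memalign.
    uintptr_t breakBefore = reinterpret_cast<uintptr_t>(sbrk(0));
#endif

    void* block = nullptr;
    if (posix_memalign(&block, alignment, size) != 0)
    {
        return nullptr;
    }

#if defined(__wasm__) && !defined(FEATURE_MULTITHREADING)
    // Assumption from the PR description: a block at or above the old
    // break came from memory.grow and is already zero-filled.
    if (reinterpret_cast<uintptr_t>(block) >= breakBefore)
    {
        return block;
    }
#endif

    // Recycled memory (or any multithreaded/non-WASM build): zero explicitly.
    memset(block, 0, size);
    return block;
}
```

Compiling the fast path out under `FEATURE_MULTITHREADING` avoids the case where another thread moves the break between the `sbrk(0)` read and the allocation.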