Port 126929 to 10.0 #126977

Merged
JulieLeeMSFT merged 2 commits into dotnet:release/10.0 from
mangod9:fix/gc-largepages-10.0
Apr 29, 2026

Conversation

@mangod9
Member

@mangod9 mangod9 commented Apr 16, 2026

Fixes #126903

Customer Impact

  • Customer reported
  • Found internally

GC heap corruption when DOTNET_GCLargePages=1 is enabled on Linux (#126903). Reproducible by calling GC.Collect(2, GCCollectionMode.Aggressive, true, true) with large pages enabled, but it also occurs in normal production workloads without aggressive GC.

Regression

  • Yes
  • No

This is a pre-existing bug in the GC's large-page decommit logic. When GCLargePages is enabled, the GC skips OS-level decommits but still updates bookkeeping as if the decommit succeeded. This causes regions to be reused without being zeroed, leading to heap corruption. The bug has existed since Regions was enabled.
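
To make the failure mode above concrete, here is a minimal toy model, not the runtime's code. `ToyRegion`, `toy_decommit`, and `tail_is_zero` are hypothetical names invented for illustration; the sketch assumes only what the description states, namely that the OS-level decommit is skipped under large pages while the bookkeeping is still updated.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Toy model only: none of these names come from the runtime sources.
struct ToyRegion {
    std::vector<uint8_t> memory;  // stands in for the region's backing pages
    size_t committed;             // bookkeeping: bytes believed to be committed

    explicit ToyRegion(size_t size) : memory(size, 0), committed(size) {}
};

// Shrinks the committed range to new_committed bytes.
// Without large pages, the released tail is zeroed to emulate the OS handing
// back fresh pages on a later recommit. With large pages, the OS call is
// skipped, so the old bytes stay in place.
void toy_decommit(ToyRegion& r, size_t new_committed, bool large_pages) {
    if (!large_pages) {
        std::memset(r.memory.data() + new_committed, 0,
                    r.committed - new_committed);
    }
    r.committed = new_committed;  // bookkeeping is updated in both cases
}

// Checks the assumption a later reuse relies on: bytes in [from, to) are zero.
bool tail_is_zero(const ToyRegion& r, size_t from, size_t to) {
    for (size_t i = from; i < to; ++i) {
        if (r.memory[i] != 0) return false;
    }
    return true;
}
```

In this model, a byte written at offset 40 of a 64-byte region survives `toy_decommit(r, 32, true)`, so a later reuse that assumes the tail is zero would see stale data.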

Testing

The fix was validated by the customer against their production workload.

Risk

Low. The fix clears decommitted memory in the large-pages scenario to ensure regions are properly zeroed before reuse. This is a targeted change to the GC's decommit path that only affects GCLargePages=1 configurations. The larger fix, #127290, was made in .NET 11.

Copilot AI review requested due to automatic review settings April 16, 2026 00:23
@dotnet-policy-service
Contributor

Tagging subscribers to this area: @JulieLeeMSFT, @dotnet/gc
See info in area-owners.md if you want to be subscribed.

Contributor

Copilot AI left a comment

Pull request overview

Ports the fix for #126903 to the 10.0 branch by ensuring memory that is logically decommitted while GCLargePages is enabled is also cleared, preventing stale object references from being observed when the region is later reused.

Changes:

  • Extends gc_heap::virtual_decommit with an optional end_of_data parameter to allow clearing memory when OS decommit is a no-op under large pages.
  • In the aggressive-induced GC path, passes heap_segment_used(region) so the GC clears stale reference-containing bytes when shrinking heap_segment_committed.
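
The shape of the change described in the two bullets above can be sketched roughly as follows. This is a hedged illustration, not the code from the PR: `sketch_virtual_decommit`, its parameter order, and the `use_large_pages` flag are assumptions, and the non-large-page branch only emulates an OS decommit by zeroing. Only the idea of an optional `end_of_data` pointer that bounds the clearing comes from the review summary.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical sketch: the name, signature, and flag are illustrative only.
// When end_of_data is supplied and large pages are in use, the bytes in
// [address, min(end_of_data, address + size)) are cleared, because the OS
// decommit that would normally discard them does not happen.
bool sketch_virtual_decommit(uint8_t* address, size_t size,
                             bool use_large_pages,
                             uint8_t* end_of_data = nullptr) {
    assert(address != nullptr);

    if (use_large_pages) {
        if (end_of_data != nullptr && end_of_data > address) {
            size_t used = static_cast<size_t>(end_of_data - address);
            std::memset(address, 0, used < size ? used : size);
        }
        return true;  // logically decommitted; pages stay mapped
    }

    // Stand-in for the real OS decommit: emulate pages that read as zero
    // the next time they are committed.
    std::memset(address, 0, size);
    return true;
}
```

Making the parameter optional keeps existing callers unchanged in this sketch; only a caller that knows how far the region was actually used would pass a value, which limits the clearing cost to bytes that may hold stale references.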

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.

| File | Description |
| --- | --- |
| src/coreclr/gc/gcpriv.h | Updates the virtual_decommit declaration to accept an optional end_of_data pointer. |
| src/coreclr/gc/gc.cpp | Implements large-page clearing in virtual_decommit and wires end_of_data from distribute_free_regions() for aggressive-induced decommits. |

Comment thread src/coreclr/gc/gc.cpp
@BenV

BenV commented Apr 28, 2026

@mangod9 Thanks again for helping out with this - is this still on track to get merged for the June release? Would it also make sense to backport #127290?

@mangod9
Member Author

mangod9 commented Apr 28, 2026

> @mangod9 Thanks again for helping out with this - is this still on track to get merged for the June release? Would it also make sense to backport #127290?

So just checking that your prod test went well too? Current thinking is to back port only the targeted fix.

@BenV

BenV commented Apr 28, 2026

> > @mangod9 Thanks again for helping out with this - is this still on track to get merged for the June release? Would it also make sense to backport #127290?
>
> So just checking that your prod test went well too? Current thinking is to back port only the targeted fix.

Yes, so far so good on the production test, will keep you posted if we run into anything on that front.

@mangod9
Member Author

mangod9 commented Apr 28, 2026

ok will work on getting this approved for back port. Thanks!

@mangod9 mangod9 requested a review from janvorli April 28, 2026 14:30
@mangod9 mangod9 added the Servicing-consider Issue for next servicing release review label Apr 28, 2026
@JulieLeeMSFT JulieLeeMSFT added Servicing-approved Approved for servicing release and removed Servicing-consider Issue for next servicing release review labels Apr 28, 2026
@JulieLeeMSFT JulieLeeMSFT added this to the 10.0.x milestone Apr 28, 2026
@rbhanda rbhanda modified the milestones: 10.0.x, 10.0.9 Apr 28, 2026
@JulieLeeMSFT
Member

@neiljohari, @janvorli, @mangod9, please resolve code review comments.

@JulieLeeMSFT JulieLeeMSFT merged commit 5713889 into dotnet:release/10.0 Apr 29, 2026
107 of 108 checks passed
@mangod9
Member Author

mangod9 commented Apr 29, 2026

@BenV, this is now merged so should be included in June servicing.

@BenV

BenV commented Apr 29, 2026

> @BenV, this is now merged so should be included in June servicing.

Excellent, thanks again for all the help!

Labels

area-GC-coreclr Servicing-approved Approved for servicing release

6 participants