Refactor ObjectDataInterner to support pluggable deduplicators and enable for ReadyToRun#126112
Conversation
Extract method-body-specific deduplication logic from ObjectDataInterner into a new MethodBodyDeduplicator class behind an IObjectDataDeduplicator interface. ObjectDataInterner now accepts a set of deduplicator instances and delegates the deduplication passes to them. - Add IObjectDataDeduplicator interface with DeduplicatePass method - Move MethodInternKey, MethodInternComparer, and CanFold logic into MethodBodyDeduplicator - Add virtual CanFold method on NodeFactory (returns false by default) - Override CanFold in RyuJitNodeFactory to delegate to MethodBodyDeduplicator Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Move ObjectDataInterner.cs (with IObjectDataDeduplicator interface) from ILCompiler.Compiler/Compiler/ to Common/Compiler/ so it can be shared across AOT projects. Add it to ILCompiler.ReadyToRun.csproj. The MethodBodyDeduplicator implementation stays in ILCompiler.Compiler. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Add a new IObjectDataDeduplicator implementation that deduplicates CopiedMethodILNode instances based on their IL body bytes. Uses a HashSet with a custom key/comparer to correctly handle hash collisions. - Add Values property to ReadyToRun NodeCache for enumeration - Add ObjectDataInterner to ReadyToRun NodeFactory - Wire up CopiedMethodILDeduplicator in NodeFactory constructor Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Remove the three #if !READYTORUN / #if READYTORUN guards around ObjectInterner.GetDeduplicatedSymbol calls in the common ObjectWriter. Now that ReadyToRun's NodeFactory has an ObjectDataInterner, the deduplication logic applies uniformly to both NativeAOT and ReadyToRun. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
|
cc @kotlarmilos @jkotas here's an alternative implementation of #126047 based on ObjectDataInterner. It also enables ObjectDataInterner for R2R so we could eventually add more deduplicators easily if we so desired. |
There was a problem hiding this comment.
Pull request overview
Refactors the object-data deduplication infrastructure so ObjectDataInterner can run one or more pluggable deduplication strategies, and wires this up for both NativeAOT (method body folding) and ReadyToRun (copied IL folding).
Changes:
- Introduces
IObjectDataDeduplicatorand a sharedObjectDataInternerinsrc/coreclr/tools/Common/Compiler/. - Extracts NativeAOT method-body folding into
MethodBodyDeduplicatorand routes folding decisions throughNodeFactory.CanFold. - Enables ReadyToRun deduplication by adding
CopiedMethodILDeduplicatorand anObjectInterneron the R2RNodeFactory, plus removesREADYTORUNguards in the commonObjectWriter.
Reviewed changes
Copilot reviewed 10 out of 10 changed files in this pull request and generated 3 comments.
Show a summary per file
| File | Description |
|---|---|
| src/coreclr/tools/aot/ILCompiler.RyuJit/Compiler/RyuJitCompilationBuilder.cs | Creates MethodBodyDeduplicator based on folding mode and builds an ObjectDataInterner from it. |
| src/coreclr/tools/aot/ILCompiler.RyuJit/Compiler/DependencyAnalysis/RyuJitNodeFactory.cs | Plumbs MethodBodyDeduplicator into the factory and overrides CanFold. |
| src/coreclr/tools/aot/ILCompiler.ReadyToRun/ILCompiler.ReadyToRun.csproj | Links shared ObjectDataInterner and adds the new R2R deduplicator source. |
| src/coreclr/tools/aot/ILCompiler.ReadyToRun/Compiler/DependencyAnalysis/ReadyToRunCodegenNodeFactory.cs | Adds cache enumeration (Values) and instantiates an ObjectInterner for R2R. |
| src/coreclr/tools/aot/ILCompiler.ReadyToRun/Compiler/DependencyAnalysis/ReadyToRun/CopiedMethodILDeduplicator.cs | Implements R2R strategy to fold CopiedMethodILNode instances by raw IL bytes. |
| src/coreclr/tools/aot/ILCompiler.Compiler/ILCompiler.Compiler.csproj | Moves ObjectDataInterner to shared location and adds MethodBodyDeduplicator. |
| src/coreclr/tools/aot/ILCompiler.Compiler/Compiler/MethodBodyDeduplicator.cs | Extracts method-body folding logic into a dedicated IObjectDataDeduplicator. |
| src/coreclr/tools/aot/ILCompiler.Compiler/Compiler/DependencyAnalysis/NodeFactory.cs | Replaces direct ObjectInterner.CanFold usage with overridable CanFold. |
| src/coreclr/tools/Common/Compiler/ObjectWriter/ObjectWriter.cs | Runs deduplication uniformly (including READYTORUN builds) when emitting nodes/relocs. |
| src/coreclr/tools/Common/Compiler/ObjectDataInterner.cs | Adds shared pluggable interner + interface for iterative convergence passes. |
- Guard GetDeduplicatedSymbol call in ObjectWriter when symbolNode is null - Cache IL bytes in InternKey to avoid repeated PE reads during equality checks - Use Func<IEnumerable<CopiedMethodILNode>> for deferred enumeration so the deduplicator sees the final cache contents after marking is complete Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- ObjectDataInterner: Replace count-only convergence check with full dictionary content comparison to detect mapping changes even when count stays the same. - CopiedMethodILDeduplicator: Cache deduplication results to avoid repeated GetData allocations across convergence loop iterations. - CopiedMethodILDeduplicator: Sort nodes with CompilerComparer before iterating to ensure deterministic canonical representative selection regardless of ConcurrentDictionary enumeration order. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…eduplicator.cs Co-authored-by: Milos Kotlar <kotlarmilos@gmail.com>
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 11 out of 11 changed files in this pull request and generated 3 comments.
Comments suppressed due to low confidence (1)
src/coreclr/tools/aot/ILCompiler.Compiler/Compiler/MethodBodyDeduplicator.cs:61
symbolRemapping.TryAdd(body, found.Method)will silently ignore the insertion if some other deduplicator (or an earlier duplicate) already added a mapping for the samebody. With pluggable deduplicators, conflicting mappings are a real possibility and silently dropping one can lead to non-deterministic or partially-applied deduplication. Consider usingAdd(to fail fast) or explicitly detecting an existing mapping and asserting/throwing if it differs fromfound.Method.
Note
This PR was generated with the help of GitHub Copilot.
Summary
Refactors
ObjectDataInternerto accept a set of pluggableIObjectDataDeduplicatorstrategies instead of hard-coding method body deduplication logic, then extends the infrastructure to ReadyToRun.Changes
Pluggable deduplicator interface
IObjectDataDeduplicatorwith a singleDeduplicatePassmethod.ObjectDataInternernow acceptsparams IObjectDataDeduplicator[]and delegates each convergence-loop iteration to them.ObjectDataInternertoCommon/Compiler/so both NativeAOT and ReadyToRun can share it.Method body deduplication (NativeAOT)
MethodDesc/IMethodBodyNode-specific logic (CanFold,MethodInternKey,MethodInternComparer) into a newMethodBodyDeduplicator : IObjectDataDeduplicatorclass.virtual bool CanFold(MethodDesc)on the baseNodeFactory(returnsfalse), overridden inRyuJitNodeFactoryto delegate toMethodBodyDeduplicator.Copied method IL deduplication (ReadyToRun)
CopiedMethodILDeduplicator : IObjectDataDeduplicatorinILCompiler.ReadyToRunthat deduplicatesCopiedMethodILNodeinstances by comparing their raw IL bytes.ObjectDataInternerproperty to the ReadyToRunNodeFactory, wired up with the new deduplicator.Valuesproperty to the ReadyToRunNodeCache<TKey, TValue>to enable enumeration.#if !READYTORUNguards aroundObjectInterner.GetDeduplicatedSymbolcalls in the commonObjectWriter, so deduplication now runs uniformly for both pipelines.