Skip to content

[cDAC] Add IMetaDataImport COM wrapper over MetadataReader for no-fallback mode#127028

Merged
max-charlamb merged 36 commits intomainfrom
cdac-metadata-import
Apr 27, 2026
Merged

[cDAC] Add IMetaDataImport COM wrapper over MetadataReader for no-fallback mode#127028
max-charlamb merged 36 commits intomainfrom
cdac-metadata-import

Conversation

@max-charlamb
Copy link
Copy Markdown
Member

@max-charlamb max-charlamb commented Apr 16, 2026

Note

This PR description was AI/Copilot-generated.

Summary

Implements a managed [GeneratedComClass] wrapper (MetaDataImportImpl) that adapts System.Reflection.Metadata.MetadataReader to the IMetaDataImport, IMetaDataImport2, and IMetaDataAssemblyImport COM interfaces. This enables SOS and ClrMD to query metadata in cDAC mode, with optional legacy DAC fallback for methods not yet implemented in the managed layer.

Motivation

In cDAC no-fallback mode, ClrDataModule.GetInterface() returned NotHandled for IMetaDataImport QIs when _legacyModulePointer == 0, meaning diagnostic tools couldn't access type/method/field metadata. The cDAC already has access to MetadataReader via the EcmaMetadata contract, so a thin COM wrapper bridges the gap.

When a legacy DAC is available, the wrapper uses it for #if DEBUG validation (asserting that cDAC and DAC produce identical results) and as a fallback for the ~45 methods not yet implemented in managed code.

Changes

File Description
IMetaDataImport.cs Managed [GeneratedComInterface] definitions for IMetaDataImport (51 methods), IMetaDataImport2 (8 methods), IMetaDataAssemblyImport (14 methods), ASSEMBLYMETADATA struct, and internal CldbHResults constants
MetaDataImportImpl.cs [GeneratedComClass] implementation — 28 full cDAC implementations via MetadataReader + ~45 legacy-delegated stubs. Uses explicit interface notation.
OutputBufferHelpers.cs CopyStringToBuffer split into two overloads: void (for callers that don't check truncation) and out bool truncated (for MetaDataImportImpl)
ClrDataModule.cs Wire up wrapper via ICustomQueryInterface — creates MetaDataImportImpl with both MetadataReader and optional legacy IMetaDataImport
MetaDataImportImplTests.cs 50 unit tests using synthetic metadata built with MetadataBuilder
MetaDataImportDumpTests.cs 3 dump-based integration tests verifying semantic parity against real metadata
MultiModule debuggee Test debuggee with non-const fields, user strings, and methods for dump tests

Implemented methods (28 cDAC, ~45 legacy fallback)

Enum (cDAC): EnumInterfaceImpls, EnumFields, EnumGenericParams, CloseEnum, CountEnum, ResetEnum

Properties (cDAC): GetTypeDefProps, GetTypeRefProps, GetMethodProps, GetFieldProps, GetMemberProps, GetInterfaceImplProps, GetNestedClassProps, GetGenericParamProps, GetMemberRefProps, GetModuleRefProps, GetParamProps, GetClassLayout, GetUserString, GetParamForMethodIndex

Blob/token (cDAC): GetRVA, GetSigFromToken, GetTypeSpecFromToken, GetCustomAttributeByName, IsValidToken, FindTypeDefByName

Assembly (cDAC): GetAssemblyProps, GetAssemblyRefProps, GetAssemblyFromScope

Legacy fallback: All remaining methods delegate to _legacyImport/_legacyImport2/_legacyAssemblyImport or return E_NOTIMPL when no legacy is available.

Native parity behaviors

  • <Module> parent mapping: GetMethodProps/GetFieldProps/GetMemberRefProps map TypeDef RID 1 to mdTypeDefNil (0x00000000) via MapGlobalParentToken, matching native RegMeta
  • Constant defaults: GetFieldProps/GetParamProps return ELEMENT_TYPE_VOID when no constant is present
  • User string heap: GetUserString uses raw #US heap byte parsing to exactly match native blob size validation (odd-length check, terminal byte stripping)
  • Assembly flags: GetAssemblyProps ORs afPublicKey into flags when public key blob is non-empty
  • Truncation: All string-returning methods return CLDB_S_TRUNCATION when buffer is too small
  • Record not found: GetNestedClassProps/GetClassLayout/GetAssemblyProps return CLDB_E_RECORD_NOTFOUND for missing records

Design decisions

  • Explicit interface implementation: All ~73 COM methods use explicit interface notation (int IMetaDataImport.Method(...)) to keep the public surface clean
  • ICustomQueryInterface: QI for IMetaDataImport returns an IMetaDataImport2 vtable so callers always get the full interface
  • HCORENUM pattern: Uses GCHandle.Alloc to box MetadataEnum objects; ConcurrentDictionary<nint, byte> tracks ownership for routing CloseEnum/CountEnum/ResetEnum between cDAC and legacy handles
  • #if DEBUG validation: Every cDAC-implemented method (with 2 justified exceptions) cross-checks its output against the legacy DAC in debug builds
  • CopyStringToBuffer overloads: void overload for callers that don't need truncation info; out bool truncated overload for MetaDataImportImpl callers that return CLDB_S_TRUNCATION

Testing

  • 1845 unit tests pass (all cDAC tests including 50 MetaDataImportImpl tests)
  • 214 dump-based tests pass locally (3 are MetaDataImport-specific)
  • Tests cover: enum pagination, global/non-global parent mapping, constant handling, user string char counts, truncation, nested classes, generic params, assembly properties, invalid tokens, QI vtable correctness, and E_NOTIMPL fallback behavior

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR adds a managed COM-compatible IMetaDataImport/IMetaDataImport2 implementation backed by System.Reflection.Metadata.MetadataReader, and wires it into cDAC’s ClrDataModule so diagnostic tools can query metadata when running in no-fallback mode (legacy DAC unavailable).

Changes:

  • Introduces generated COM interface definitions for IMetaDataImport and IMetaDataImport2.
  • Implements MetadataImportWrapper to adapt MetadataReader to the COM metadata import APIs (with a subset implemented and the rest stubbed).
  • Updates ClrDataModule to provide IMetaDataImport in no-fallback mode and adds unit tests validating implemented behaviors.

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 5 comments.

File Description
src/native/managed/cdac/tests/MetadataImportWrapperTests.cs Adds unit tests that exercise the wrapper’s enumeration and property APIs over synthetic metadata.
src/native/managed/cdac/Microsoft.Diagnostics.DataContractReader.Legacy/MetadataImportWrapper.cs Implements the COM wrapper over MetadataReader including enum-handle handling and a set of metadata accessors.
src/native/managed/cdac/Microsoft.Diagnostics.DataContractReader.Legacy/IMetaDataImport.cs Adds managed [GeneratedComInterface] declarations for IMetaDataImport/IMetaDataImport2 with vtable ordering.
src/native/managed/cdac/Microsoft.Diagnostics.DataContractReader.Legacy/ClrDataModule.cs Hooks IMetaDataImport QI to return the managed wrapper when legacy DAC is unavailable.

Comment thread src/native/managed/cdac/tests/MetadataImportWrapperTests.cs Outdated
@github-actions

This comment has been minimized.

max-charlamb added a commit that referenced this pull request Apr 17, 2026
The ICustomQueryInterface.GetInterface method in ClrDataModule delegates
IMetaDataImport QIs to the legacy module pointer. Gate this with
CanFallback() so no-fallback mode blocks it, allowing PR #127028's
managed MetadataReader wrapper to provide metadata instead.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
max-charlamb added a commit that referenced this pull request Apr 17, 2026
- Add ICustomQueryInterface.GetInterface to the allowlist (needed until
  managed MetadataReader wrapper lands in PR #127028)
- Replace file-based logging with Console.Error.WriteLine so blocked
  fallback calls appear directly in test output (captured by ProcessRunner)
- Simplify tracking to ConcurrentDictionary<string, bool> with TryAdd
- Remove Flush() CanFallback gate (cache management, not data retrieval)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@jkoritzinsky
Copy link
Copy Markdown
Member

You'll likely need to add support for IMetaDataAssemblyImport as well for assembly-level info

max-charlamb added a commit that referenced this pull request Apr 17, 2026
The ICustomQueryInterface.GetInterface method in ClrDataModule delegates
IMetaDataImport QIs to the legacy module pointer. Gate this with
CanFallback() so no-fallback mode blocks it, allowing PR #127028's
managed MetadataReader wrapper to provide metadata instead.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
max-charlamb added a commit that referenced this pull request Apr 17, 2026
- Add ICustomQueryInterface.GetInterface to the allowlist (needed until
  managed MetadataReader wrapper lands in PR #127028)
- Replace file-based logging with Console.Error.WriteLine so blocked
  fallback calls appear directly in test output (captured by ProcessRunner)
- Simplify tracking to ConcurrentDictionary<string, bool> with TryAdd
- Remove Flush() CanFallback gate (cache management, not data retrieval)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Copilot AI review requested due to automatic review settings April 17, 2026 17:43
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 4 out of 4 changed files in this pull request and generated 7 comments.

Comment thread src/native/managed/cdac/tests/MetaDataImportImplTests.cs Outdated
Comment thread src/native/managed/cdac/tests/MetaDataImportImplTests.cs Outdated
Comment thread src/native/managed/cdac/tests/MetaDataImportImplTests.cs Outdated
Comment thread src/native/managed/cdac/tests/MetaDataImportImplTests.cs
max-charlamb added a commit that referenced this pull request Apr 17, 2026
> [!NOTE]
> This PR description was generated with the assistance of GitHub
Copilot.

## Summary

Add a granular, per-method allowlist (`LegacyFallbackHelper`) that
controls which delegation-only APIs may fall back to the legacy DAC when
`CDAC_NO_FALLBACK=1` is set. This enables selective no-fallback testing
— blocking fallback for most APIs while allowing specific APIs that are
known to not yet be implemented in the cDAC.

Wire this into the runtime-diagnostics CI pipeline using the
`-noFallback` flag from
[dotnet/diagnostics#5806](dotnet/diagnostics#5806).

All fallback attempts (both allowed and blocked) are logged to stderr
with method name, file, and line number for capture by the diagnostics
test infrastructure.

## Changes

### LegacyFallbackHelper.cs — Granular fallback control

New static helper that every delegation-only call site invokes via
`CanFallback()`. Uses `[CallerMemberName]`, `[CallerFilePath]`, and
`[CallerLineNumber]` to identify the call site.

- **Normal mode** (`CDAC_NO_FALLBACK` unset): Always returns `true`
(single `bool` check, `[AggressiveInlining]`)
- **No-fallback mode** (`CDAC_NO_FALLBACK=1`): Checks method name
against a `HashSet<string>` allowlist and file name against a file-level
allowlist

**Per-method allowlist:**

| Method | Reason |
|--------|--------|
| `EnumMemoryRegions` | Dump creation — cDAC has no memory enumeration
implementation |
| `GetInterface` | IMetaDataImport QI ([PR
#127028](#127028)) |
| `GetMethodDefinitionByToken` | IXCLRDataModule — not yet implemented
in cDAC |
| `IsTrackedType` | GC heap analysis ([PR
#125895](#125895)) |
| `TraverseLoaderHeap` | Loader heap traversal ([PR
#125129](#125129)) |

**File-level allowlist:**

| File | Reason |
|------|--------|
| `DacDbiImpl.cs` | Entire DBI/ICorDebug interface (122 methods) —
deferred |

### Entrypoints.cs — Simplified creation

Both `CreateSosInterface` and `CreateDacDbiInterface` now follow the
same pattern: the legacy implementation is always passed through, and
`LegacyFallbackHelper.CanFallback()` at each call site decides whether
to delegate. Removed `prevent_release`, `noFallback` env var check, and
null-legacy-ref logic.

### 13 Legacy wrapper files — Instrumented delegation sites

All 296 delegation-only methods across all legacy wrapper files now call
`LegacyFallbackHelper.CanFallback()`:

- `SOSDacImpl.cs` (12 methods)
- `SOSDacImpl.IXCLRDataProcess.cs` (38 methods, `Flush()` intentionally
excluded — cache management)
- `ClrDataModule.cs` (29 methods + IMetaDataImport QI)
- `DacDbiImpl.cs` (122 methods)
- Other wrappers: `ClrDataTask.cs`, `ClrDataExceptionState.cs`,
`ClrDataFrame.cs`, `ClrDataValue.cs`, `ClrDataTypeInstance.cs`,
`ClrDataMethodInstance.cs`, `ClrDataStackWalk.cs`, `ClrDataProcess.cs`

### CI Pipeline — `-noFallback` flag

Updated `runtime-diag-job.yml` to accept a `noFallback` parameter that
passes `-noFallback` to the diagnostics build script. The
`cDAC_no_fallback` leg in `runtime-diagnostics.yml` now uses
`noFallback: true` instead of setting `CDAC_NO_FALLBACK` as a
pipeline-level environment variable. The `-noFallback` flag (from
[dotnet/diagnostics#5806](dotnet/diagnostics#5806))
properly:

- Sets `DOTNET_ENABLE_CDAC=1` and `CDAC_NO_FALLBACK=1` on the debugger
process
- Defines `CDAC_NO_FALLBACK_TESTING` to skip `ClrStack -i` tests
(ICorDebug not implemented in cDAC)

### Stderr logging

Every fallback attempt is logged to stderr in the format:

```
[cDAC] Allowed fallback: CreateStackWalk at DacDbiImpl.cs:590
[cDAC] Blocked fallback: SomeMethod at SOSDacImpl.cs:123
```

The diagnostics test infrastructure (`ProcessRunner`) captures stderr
and routes it to xunit test output with `STDERROR:` prefix, making
fallback usage visible in test results.

## Test Results

With `CDAC_NO_FALLBACK=1` and the current allowlist, running the full
SOS test suite against a private runtime build:

- **24 passed**, **2 failed** (flaky/pre-existing), **2 skipped**
(Linux-only)
- **0 blocked fallbacks**

## Motivation

The existing cDAC test leg always has the legacy DAC as a fallback, so
unimplemented APIs are silently handled. The granular no-fallback mode
makes gaps visible per-method, helping track progress toward full cDAC
coverage while keeping tests green for known-deferred APIs.

---------

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Copilot AI review requested due to automatic review settings April 17, 2026 21:31
@max-charlamb max-charlamb force-pushed the cdac-metadata-import branch from 8f31425 to a67d391 Compare April 17, 2026 21:31
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 5 out of 5 changed files in this pull request and generated 6 comments.

@github-actions

This comment has been minimized.

Copilot AI review requested due to automatic review settings April 18, 2026 02:29
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 10 out of 10 changed files in this pull request and generated 8 comments.

Comment thread src/native/managed/cdac/tests/MetaDataImportImplTests.cs
Comment thread src/native/managed/cdac/tests/MetaDataImportImplTests.cs Outdated
Comment thread src/native/managed/cdac/tests/DumpTests/Debuggees/MultiModule/Program.cs Outdated
@max-charlamb max-charlamb force-pushed the cdac-metadata-import branch from bbbd537 to f5b1347 Compare April 18, 2026 15:55
Copilot AI review requested due to automatic review settings April 19, 2026 00:57
Max Charlamb and others added 19 commits April 24, 2026 15:03
Native IMetaDataImport returns CLDB_S_TRUNCATION (0x00131106) when a
string buffer is smaller than the required size. Match this behavior by:

- Change CopyStringToBuffer return type from void to bool (true = truncated)
- Add CLDB_S_TRUNCATION constant to MetaDataImportImpl
- Update all 9 CopyStringToBuffer callers to set hr appropriately
- Update GetAssemblyProps/GetAssemblyRefProps manual string copy with
  truncation detection for both name and locale strings
- Update test to expect CLDB_S_TRUNCATION on small buffer

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- Return ELEMENT_TYPE_VOID (1) instead of 0 for pdwCPlusTypeFlag when
  no constant exists in GetFieldProps and GetParamProps
- Divide pchValue by sizeof(WCHAR) for ELEMENT_TYPE_STRING constants
  in GetFieldProps and GetParamProps
- Map global <Module> parent (TypeDef RID 1) to mdTypeDefNil (0) in
  GetMethodProps, GetFieldProps, and GetMemberRefProps
- Fix GetUserString pchString to return character count without null
  terminator (matching native userString.GetSize() / sizeof(WCHAR))
- OR afPublicKey (0x0001) into assembly flags when public key blob is
  non-empty in GetAssemblyProps

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- GetFieldProps/GetParamProps: verify ELEMENT_TYPE_VOID (1) when no constant
- GetFieldProps: verify string constant returns char count, not byte count
- GetMethodProps: verify global method on <Module> returns mdTypeDefNil
- GetFieldProps: verify global field on <Module> returns mdTypeDefNil
- GetMethodProps: verify non-global method returns correct parent class
- GetUserString: verify char count without null terminator
- GetAssemblyProps: verify afPublicKey flag when public key blob is non-empty
- GetParamProps: verify ELEMENT_TYPE_VOID when no constant
- Enhanced test metadata with global method, string constant, and public key

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- GetFieldProps: verify ELEMENT_TYPE_VOID when field has no constant
- GetMethodProps: verify non-global method returns correct parent TypeDef
- GetMethodProps: verify global method on <Module> returns mdTypeDefNil
- GetUserString: verify char count without null terminator
- Add InternalsVisibleTo for DumpTests in Legacy csproj
- Remove duplicate HResults.cs compile from DumpTests csproj

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- Rewrite GetUserString to use raw #US heap bytes via BlobReader instead
  of MetadataReader.GetUserString() to match native semantics exactly
  (blob size validation, TruncateBySize(1), CLDB_E_FILE_CORRUPT check)
- Fix CLDB_S_TRUNCATION: null-terminate only on truncation (last char),
  matching native copy semantics
- Fix ClrMD AccessViolationException in EnumGenericParams: ClrMD QIs for
  IMetaDataImport but accesses IMetaDataImport2 vtable slots beyond the
  IMetaDataImport boundary. With native C++ COM objects the vtable is
  unified, but [GeneratedComInterface] CCWs have separate per-interface
  vtables. Return IMetaDataImport2 vtable when asked for IMetaDataImport.
- Handle direct QI for IMetaDataImport2 and IMetaDataAssemblyImport in
  ClrDataModule.GetInterface

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
ClrMD's CallableCOMWrapper performs a double QueryInterface: first QI on
ClrDataModule for IMetaDataImport, then a second QI on the returned pointer
for IMetaDataImport again. With [GeneratedComInterface] CCWs, each COM
interface gets its own vtable. The second QI was returning the shorter
IMetaDataImport vtable (65 slots), but ClrMD accesses slot 65
(EnumGenericParams from IMetaDataImport2), causing an AccessViolation.

Fix: implement ICustomQueryInterface on MetaDataImportImpl to redirect
IMetaDataImport QIs to IMetaDataImport2, providing the full 73-slot vtable.
Store the StrategyBasedComWrappers instance on MetaDataImportImpl during
CCW creation in ClrDataModule.GetInterface.

Add regression test that reproduces the double-QI scenario and verifies
EnumGenericParams is callable through the redirected vtable.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Add HasReader property with [MemberNotNullWhen(true, nameof(_reader))] so
the compiler tracks _reader as non-null after the guard check. Replace all
'if (_reader is null)' guards with 'if (!HasReader)' and remove 57 uses of
the null-forgiving operator (_reader!).

Private helpers (GetTypeDefFullName, GetTypeRefFullName, BuildInterfaceImplLookup,
GetCustomAttributeTypeName) use Debug.Assert(HasReader) to satisfy flow analysis
since they are only called from contexts where _reader is known non-null.

Also fix MetaDataImportDumpTests to assign Assert.NotNull result for proper
nullable flow analysis.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- Changed _reader from MetadataReader? to MetadataReader (non-nullable)
- Removed HasReader property and MemberNotNullWhen attribute
- Removed all 26 if (!HasReader) fallback blocks from implemented methods
- ClrDataModule now returns NotHandled if reader is null (fail fast)
- Removed NullReader_* tests since null reader is no longer valid

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
The dump tests use xUnit v2 where Assert.NotNull returns void, unlike
the unit tests which use xUnit v3 where it returns T. Split the assert
and assignment into separate statements.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…am lookup

- Fix CountEnum/ResetEnum to work with cDAC-created enum handles by
  tracking GCHandle values in a HashSet. Previously these delegated to
  legacy which couldn't interpret cDAC GCHandle-based enums.
- Fix CloseEnum to distinguish cDAC vs legacy enum handles, preventing
  crashes when mixing handle types.
- Implement GetClassLayout field offset population. Previously always
  returned pcFieldOffset=0; now fills COR_FIELD_OFFSET array with
  field tokens and their explicit layout offsets.
- Optimize GetParamProps parent method lookup from O(N^2) triple-nested
  loop to O(1) cached dictionary lookup, matching the existing pattern
  used by BuildInterfaceImplLookup.
- Add tests for CountEnum, ResetEnum, and null handle edge cases.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…mbly import methods

- Replace HashSet<nint> with ConcurrentDictionary<nint, byte> for _cdacEnumHandles
  to support concurrent COM method calls from multiple threads.
- Refactor GetAssemblyProps and GetAssemblyRefProps to use
  OutputBufferHelpers.CopyStringToBuffer instead of manual string copy,
  reducing code duplication and ensuring consistent buffer handling.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Add ContractsDictionaryConverter that handles both string and integer
values in the contracts dictionary. Integer values from older runtimes
are mapped to the 'c<N>' naming convention (e.g., 1 -> 'c1') to match
the current string-based contract version registrations.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- Only accept IID_IMetaDataImport in ClrDataModule QI (not IMetaDataImport2/AssemblyImport)
- Replace StrategyBasedComWrappers with ComInterfaceMarshaller pattern
- Fix unused userStringHandle variable in tests
- Fix StringConst field to use string field signature instead of int

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- ConvertToUnmanaged already returns an AddRef'd interface pointer, so
  the follow-up Marshal.QueryInterface was unnecessary in both
  ClrDataModule and MetaDataImportImpl ICustomQueryInterface.
- Remove null-forgiving operator on ConvertToManaged since legacyImport
  is already nullable.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- Move field declarations above constructor in ClrDataModule
- Remove stale ComWrappers and ICustomQueryInterface comments
- Move CLDB_* HResults to shared CorDbgHResults.cs
- Use CorElementType enum instead of raw uint constants
- Move validation helpers to bottom of MetaDataImportImpl
- Make GetEnum validate ownership and return non-nullable
- Convert all public methods to explicit interface notation
- Simplify CountEnum/ResetEnum using validated GetEnum
- Update tests for explicit interface notation

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
The explicit interface conversion missed MetaDataImportDumpTests.cs,
which was calling methods directly on MetaDataImportImpl. Changed
GetRootModuleImport() return type to IMetaDataImport.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- Remove unused System.Runtime.CompilerServices using in MultiModule debuggee
- Fix comment: 'ridOfField' -> 'FieldDef token' in GetClassLayout

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Split CopyStringToBuffer into two overloads:
- void CopyStringToBuffer(char*, uint, uint*, string) for callers that
  don't need truncation info (SOSDacImpl, ClrDataModule, etc.)
- void CopyStringToBuffer(char*, uint, uint*, string, out bool truncated)
  for MetaDataImportImpl callers that check truncation

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@max-charlamb max-charlamb force-pushed the cdac-metadata-import branch from 46f0cdc to 817457d Compare April 24, 2026 19:07
@github-actions
Copy link
Copy Markdown
Contributor

🤖 Copilot Code Review — PR #127028

Note

This review was AI-generated by GitHub Copilot using multi-model analysis (Claude Opus 4.6 primary, with Claude Sonnet 4.5 and GPT-5.3-Codex sub-agents).

Holistic Assessment

Motivation: Well-justified. A managed MetaDataImportImpl wrapping System.Reflection.Metadata.MetadataReader is essential for cDAC no-fallback mode where the legacy DAC is unavailable. SOS and ClrMD depend on IMetaDataImport for metadata queries, making this a required capability.

Approach: Sound. Implementing ~30 commonly-used methods with MetadataReader while delegating ~30 rarely-used ones to the legacy DAC is pragmatic. The ICustomQueryInterface vtable redirect to handle ClrMD's beyond-vtable IMetaDataImport2 slot access is clever and well-documented. The extensive #if DEBUG validation blocks comparing output against the legacy DAC are excellent for catching parity gaps during development.

Summary: ⚠️ Needs Human Review. The code is well-structured, follows cDAC conventions, and handles COM interop correctly. There are a few thread-safety concerns with lazy initialization and a minor correctness gap in IsValidToken that warrant maintainer attention. None appear to be critical bugs in practice, but a human reviewer should evaluate whether they're acceptable for the diagnostics use case.


Detailed Findings

⚠️ IsValidToken — RID 0 early rejection incorrectly covers user string tokens

MetaDataImportImpl.cs:819-838IsValidToken rejects RID 0 before checking the token type:

int rid = (int)(tk & 0x00FFFFFF);
int tokenType = (int)(tk >> 24);

if (rid == 0)
    return 0; // FALSE — but this runs before the UserString branch

For metadata table tokens, RID 0 is nil/invalid. However, for user string tokens (0x70), the "RID" is actually a heap offset, and offset 0 is a valid entry in the #US heap (the required null/empty byte per ECMA-335). The native IsValidToken would return TRUE for 0x70000000 when the heap is non-empty, while this implementation returns FALSE.

Fix: Move the rid == 0 check after the UserStringTokenType branch, or restrict it to table tokens only:

const int UserStringTokenType = 0x70;
if (tokenType == UserStringTokenType)
{
    int heapSize = _reader.GetHeapSize(HeapIndex.UserString);
    return rid < heapSize ? 1 : 0;
}

if (rid == 0)
    return 0; // Only for table tokens

Advisory — unlikely to cause real-world issues since offset 0 is never referenced by IL, but it is a parity gap with native. Flagged by GPT-5.3-Codex, verified by primary reviewer.

⚠️ Lazy dictionary initialization without memory barriers (thread safety)

MetaDataImportImpl.cs:554,1336_interfaceImplToTypeDef and _paramToMethod use ??= for lazy initialization:

_interfaceImplToTypeDef ??= BuildInterfaceImplLookup();

Since MetaDataImportImpl is a COM object callable from multiple threads (as acknowledged by the use of ConcurrentDictionary for _cdacEnumHandles), two threads could race on first access. The dictionaries are fully constructed before assignment and reference writes are atomic in .NET, so the practical risk is very low. However, the codebase convention states that ??= is not thread-safe and cross-thread field access should use Volatile or Interlocked.

Volatile.Write/Volatile.Read or LazyInitializer.EnsureInitialized would address this cleanly.

Advisory — not merge-blocking. The benign-race pattern here (two threads build independent dictionaries, one wins, both are equivalent) is safe in practice on x86/x64 and very likely safe on ARM given .NET's runtime guarantees. Flagged by all three models.

⚠️ GCHandle leak if CloseEnum is never called

MetaDataImportImpl.cs:66-73AllocEnum allocates GCHandle instances tracked in _cdacEnumHandles, but there's no finalizer or IDisposable to clean them up if the COM consumer forgets to call CloseEnum. While this matches the native IMetaDataImport contract (callers own enum lifetime), diagnostic tools may not always clean up on error paths.

Advisory — follows the same contract as native. A finalizer iterating _cdacEnumHandles.Keys and freeing outstanding handles would be a defensive improvement. Not merge-blocking since the native API has the same contract.

✅ COM ref counting — Correct

Both ICustomQueryInterface.GetInterface implementations (in MetaDataImportImpl.cs:47 and ClrDataModule.cs:111) correctly use ComInterfaceMarshaller<IMetaDataImport2>.ConvertToUnmanaged() which returns an already-AddRef'd pointer, consistent with COM QueryInterface semantics. The IMetaDataImport→IMetaDataImport2 vtable redirect for ClrMD compatibility is validated end-to-end by the test at line 964.

GetUserString raw heap parsing — Correct

MetaDataImportImpl.cs:1188-1250 — Correctly reads from raw #US heap bytes rather than using MetadataReader.GetUserString(), avoiding potential discrepancies. The odd-size validation (blobSize % sizeof(char) == 0 → corrupt) correctly enforces the terminal byte requirement per ECMA-335 §II.24.2.4. pchString correctly returns character count without null terminator, matching native RegMeta semantics.

✅ Empty catch blocks in ClrDataModule.GetInterface — Acceptable

The two bare catch {} blocks (lines 85-87, 98-100) silently swallow exceptions when probing for metadata reader availability and legacy QI support. This is appropriate: both are optional capability probes. If both fail, the method returns CustomQueryInterfaceResult.NotHandled (line 103), which is the correct degradation path.

✅ DEBUG validation blocks — Excellent

The #if DEBUG blocks throughout MetaDataImportImpl comparing cDAC output against legacy DAC output (via Debug.ValidateHResult and per-field assertions) are a strong engineering practice for ensuring semantic parity. The ValidateBlobsEqual helper for byte-level blob comparison is thorough.

✅ Test coverage — Comprehensive

~1100 lines of unit tests covering: enum pagination, string truncation with CLDB_S_TRUNCATION, not-found error paths, GetMemberProps dispatch (method vs field table index), global method parent mapping (MapGlobalParentToken), constant value handling for both fields and params, vtable slot verification for the ClrMD compatibility fix, and enum lifecycle (CountEnum, ResetEnum, CloseEnum). The dump-based integration tests validate real-world metadata against actual process dumps.

💡 Static _testProvider field in tests

MetaDataImportImplTests.cs:161_testProvider is a shared static field used to prevent GC collection. Consider [ThreadStatic] for parallel test safety. Low practical risk since xUnit serializes tests within a class.

💡 _metaDataImportImpl lazy init race in ClrDataModule

ClrDataModule.cs:107_metaDataImportImpl ??= wrapper is a documented benign race. Both wrappers are functionally equivalent, the extra one is GC'd, and line 108 re-reads to ensure consistency. Acceptable pattern.

Generated by Code Review for issue #127028 ·

Copy link
Copy Markdown
Member

@noahfalk noahfalk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few nits, but overall looks good to me!

Copilot AI review requested due to automatic review settings April 27, 2026 20:21
@max-charlamb
Copy link
Copy Markdown
Member Author

/ba-g previous test run passed. only nit style changes from comments

@max-charlamb max-charlamb merged commit 23d2e12 into main Apr 27, 2026
18 of 32 checks passed
@max-charlamb max-charlamb deleted the cdac-metadata-import branch April 27, 2026 20:28
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants