Extend System.Runtime.Intrinsics.Arm to support nint and nuint (ArmBase + Crc32)#127327

Merged
tannergooding merged 6 commits into main from copilot/extend-arm-intrinsics-support-nint-nuint
Apr 27, 2026

Contributor

Copilot AI commented Apr 23, 2026

Implements the approved API proposal to extend System.Runtime.Intrinsics.Arm with nint/nuint overloads. Per reviewer guidance, this PR is scoped down to Option C: the ArmBase + Crc32 managed surface plus the GenTreeHWIntrinsic auxiliary-type refactor. The bulk of the proposal (AdvSimd + AdvSimd.Arm64) is deferred to follow-up PRs.

Description

Managed surface (implementation, PlatformNotSupported mirror, and ref assembly)

  • ArmBase
    • LeadingZeroCount(nint), LeadingZeroCount(nuint)
    • ReverseElementBits(nint), ReverseElementBits(nuint)
  • ArmBase.Arm64
    • LeadingSignCount(nint)
  • Crc32
    • ComputeCrc32(uint crc, nuint data)
    • ComputeCrc32C(uint crc, nuint data)

No APIs are added beyond the approved proposal. The managed bodies are simple self-call intrinsic stubs; the JIT handles dispatch via the hwintrinsic list (see below).

JIT hwintrinsic list — 64-bit instruction slots for the 32-bit classes

In src/coreclr/jit/hwintrinsiclistarm64.h, the TYP_LONG/TYP_ULONG slots of the 32-bit-class entries are now populated so that nint/nuint overloads (which JitType2PreciseVarType maps to TYP_LONG/TYP_ULONG on 64-bit) dispatch to the correct Arm64 instruction without needing a managed-side redirect to *.Arm64.*:

  • NI_ArmBase_LeadingZeroCount — added INS_clz for TYP_LONG/TYP_ULONG
  • NI_ArmBase_ReverseElementBits — added INS_rbit for TYP_LONG/TYP_ULONG
  • NI_Crc32_ComputeCrc32 — added INS_crc32x for TYP_ULONG
  • NI_Crc32_ComputeCrc32C — added INS_crc32cx for TYP_ULONG

Codegen uses emitActualTypeSize(intrin.baseType) for HW_Category_Scalar, so the 8-byte operand size flows through automatically. On Arm32, nint/nuint is 32-bit and these TYP_LONG/TYP_ULONG slots are never reached.

JIT refactor — GenTreeHWIntrinsic auxiliary type

Switched the tracked auxiliary type from CorInfoType to var_types:

  • gtAuxiliaryJitType → gtAuxiliaryType, default changed from CORINFO_TYPE_UNDEF to TYP_UNKNOWN.
  • GetAuxiliaryJitType() / SetAuxiliaryJitType(CorInfoType) removed. GetAuxiliaryType() now returns the stored var_types directly; new SetAuxiliaryType(var_types) added.
  • All ~18 call sites migrated across hwintrinsic.cpp, hwintrinsicarm64.cpp, hwintrinsicxarch.cpp, gentree.cpp, and valuenum.cpp. Sentinels CORINFO_TYPE_PTR / CORINFO_TYPE_UNDEF are translated at the boundary to TYP_U_IMPL / TYP_UNKNOWN.
  • The now-dead getBaseJitTypeAndSizeOfSIMDType / getBaseJitTypeOfSIMDType helpers are removed from compiler.h and simd.cpp; callers use the existing var_types-returning getBaseTypeAndSizeOfSIMDType / getBaseTypeOfSIMDType.
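
The boundary translation described in the list above can be sketched roughly as follows. This is an illustration only: `CorInfoTypeSketch`, `VarTypeSketch`, `kTypUImplSketch`, `toAuxiliaryType`, and `HWIntrinsicNodeSketch` are hypothetical stand-ins for the JIT's real types, and a 64-bit target is assumed so that the pointer-sized unsigned type is the 64-bit one.

```cpp
#include <cassert>

// Hypothetical, trimmed-down stand-ins for CorInfoType and var_types.
enum class CorInfoTypeSketch { Undef, Int, UInt, Long, ULong, Ptr };
enum class VarTypeSketch { Unknown, Int, UInt, Long, ULong };

// Assumption: a 64-bit target, where the pointer-sized unsigned type is ULong.
constexpr VarTypeSketch kTypUImplSketch = VarTypeSketch::ULong;

// Translate once, at the boundary where a CorInfoType enters the node.
VarTypeSketch toAuxiliaryType(CorInfoTypeSketch jitType)
{
    switch (jitType)
    {
        case CorInfoTypeSketch::Undef: return VarTypeSketch::Unknown; // sentinel
        case CorInfoTypeSketch::Ptr:   return kTypUImplSketch;        // sentinel
        case CorInfoTypeSketch::Int:   return VarTypeSketch::Int;
        case CorInfoTypeSketch::UInt:  return VarTypeSketch::UInt;
        case CorInfoTypeSketch::Long:  return VarTypeSketch::Long;
        case CorInfoTypeSketch::ULong: return VarTypeSketch::ULong;
    }
    assert(false && "unhandled CorInfoTypeSketch value");
    return VarTypeSketch::Unknown;
}

// The node stores the already-translated type, so readers never convert.
struct HWIntrinsicNodeSketch
{
    VarTypeSketch auxiliaryType = VarTypeSketch::Unknown;

    VarTypeSketch GetAuxiliaryType() const { return auxiliaryType; }
    void SetAuxiliaryType(VarTypeSketch type) { auxiliaryType = type; }
};
```

With this shape, every consumer compares against a `var_types`-style value directly, and only the import path that still receives a `CorInfoType` needs the conversion.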

Tests

New generator inputs added in src/tests/Common/GenerateHWIntrinsicTests/Arm/BaseTests.cs for the nint / nuint overloads of LeadingZeroCount, LeadingSignCount, ReverseElementBits, ComputeCrc32, and ComputeCrc32C.

While there, the bit-by-bit for-loop expected-result computations for all LeadingZeroCount_* and LeadingSignCount_* validators (pre-existing entries plus the new ones) were rewritten to use the BCL T.LeadingZeroCount API directly. This fixes a pre-existing non-termination bug for edge inputs (data == 0 for LZC, 0 or -1 for LSC) where the loop counter would go negative without ever exiting.

  • LeadingZeroCount_*: int expectedResult = (int)T.LeadingZeroCount(data); (with T being int / uint / long / ulong / nint / nuint).
  • LeadingSignCount_*: int expectedResult = (int)T.LeadingZeroCount(data ^ (data >> (N-1))) - 1; This is an arithmetic-shift sign-extension XOR that matches Arm CLS semantics for all inputs (verified for 0 → N-1, -1 → N-1, int.MinValue → 0, etc.).
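
As a rough cross-check of the identity above, here is a minimal 32-bit sketch in C++ (the validators themselves are C#). The helper names `leadingZeroCount32` and `leadingSignCount32` are made up for this example. The sign mask is built with a comparison instead of `data >> 31`, because right-shifting a negative value is implementation-defined before C++20; the mask has the same value an arithmetic shift would produce.

```cpp
#include <cassert>
#include <cstdint>

// Portable leading-zero count for a 32-bit value; returns 32 for zero.
// The loop always terminates because the mask eventually shifts to zero.
int leadingZeroCount32(uint32_t value)
{
    int count = 0;
    for (uint32_t mask = 0x80000000u; mask != 0 && (value & mask) == 0; mask >>= 1)
    {
        ++count;
    }
    return count;
}

// Leading sign count via clz(data ^ signMask) - 1, where signMask is all ones
// for negative inputs and zero otherwise.
int leadingSignCount32(int32_t data)
{
    uint32_t bits     = static_cast<uint32_t>(data);
    uint32_t signMask = (data < 0) ? 0xFFFFFFFFu : 0u;
    int      result   = leadingZeroCount32(bits ^ signMask) - 1;
    assert(result >= 0 && result <= 31);
    return result;
}
```

XOR-ing with the sign mask clears every bit that equals the sign bit, so the leading-zero count of the result is the number of sign-equal bits including the sign bit itself; subtracting one excludes it.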

For the new Crc32 nuint overloads (ComputeCrc32_UIntPtr, ComputeCrc32C_UIntPtr), the validator compares against the explicit-typed ground-truth overloads via a JIT-constant branch on UIntPtr.Size:

uint expectedResult = (UIntPtr.Size == sizeof(ulong))
    ? Crc32.Arm64.ComputeCrc32(left, (ulong)right)
    : Crc32.ComputeCrc32(left, (uint)right);

This gives end-to-end coverage on both Arm32 (CRC32W via the uint overload) and Arm64 (CRC32X via the Arm64.ulong overload), confirms the new nuint overload dispatches to the correct instruction width, and catches any self-recursion or wrong-width lowering.
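
To illustrate why the operand width matters for the comparison above, here is a software sketch in C++. It assumes the usual reflected, bitwise CRC update with no initial or final inversion, which is how the Arm CRC32 instructions are understood to accumulate; `crcUpdate`, `crc32Word`, and `crc32DoubleWord` are hypothetical helper names, not runtime APIs.

```cpp
#include <cassert>
#include <cstdint>

constexpr uint32_t kCrc32Poly  = 0xEDB88320u; // reflected form of 0x04C11DB7
constexpr uint32_t kCrc32CPoly = 0x82F63B78u; // reflected form of 0x1EDC6F41

// Feeds the low `byteCount` bytes of `data`, least significant byte first,
// into a reflected CRC accumulator. No pre- or post-inversion is applied.
uint32_t crcUpdate(uint32_t crc, uint64_t data, int byteCount, uint32_t reflectedPoly)
{
    assert(byteCount >= 1 && byteCount <= 8);
    for (int i = 0; i < byteCount; ++i)
    {
        crc ^= static_cast<uint32_t>((data >> (8 * i)) & 0xFFu);
        for (int bit = 0; bit < 8; ++bit)
        {
            crc = (crc >> 1) ^ ((crc & 1u) ? reflectedPoly : 0u);
        }
    }
    return crc;
}

// 32-bit data operand (models the word-sized form).
uint32_t crc32Word(uint32_t crc, uint32_t data)
{
    return crcUpdate(crc, data, 4, kCrc32Poly);
}

// 64-bit data operand (models the doubleword-sized form).
uint32_t crc32DoubleWord(uint32_t crc, uint64_t data)
{
    return crcUpdate(crc, data, 8, kCrc32Poly);
}
```

In this model, the 64-bit form is equivalent to chaining two 32-bit updates, low half first. A nuint overload that was lowered at the wrong width would consume a different number of bytes, which is the kind of mismatch the validator is meant to catch.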

Scope deferred to follow-up PRs

  • AdvSimd and AdvSimd.Arm64 nint / nuint overloads (hundreds of methods in AdvSimd.cs / AdvSimd.PlatformNotSupported.cs).
  • Matching test-generator entries for AdvSimd overloads.

Testing

  • ./build.sh clr+libs -rc release: 0 errors, 0 warnings. JIT C++ refactor compiles clean on x64; the managed surface (including the ref assembly) compiles clean.
  • ./build.sh clr.alljits -c release — cross-compiles all JIT targets including libclrjit_universal_arm64_x64.so, exercising the updated hwintrinsiclistarm64.h. 0 errors, 0 warnings.
  • GenerateHWIntrinsicTests_Arm builds clean and emits the expected validator code for all new tests (LeadingZeroCount_*, LeadingSignCount_*, ReverseElementBits_*, ComputeCrc32_UIntPtr, ComputeCrc32C_UIntPtr) across Int32/UInt32/Int64/UInt64/IntPtr/UIntPtr variants.
  • ⚠️ No on-device Arm64 test execution was performed in this sandbox — the generated tests will exercise the new APIs end-to-end once run in the Arm64 test pipeline.

Copilot AI review requested due to automatic review settings April 23, 2026 16:38
Copilot AI linked an issue Apr 23, 2026 that may be closed by this pull request
@github-actions github-actions Bot added the area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI label Apr 23, 2026
Copilot AI changed the title from "[WIP] Extend System.Runtime.Intrinsics.Arm to support nint and nuint" to "Extend System.Runtime.Intrinsics.Arm to support nint and nuint" Apr 23, 2026
Copilot AI requested a review from tannergooding April 23, 2026 16:48
@tannergooding
Member

@copilot please go with option C. But note that the JIT item in the full task list is expected to be unnecessary, as all APIs should already have correct entries in the intrinsic list.

… type to var_types

Agent-Logs-Url: https://github.com/dotnet/runtime/sessions/244a0b75-c1b0-46b3-b53e-23ed8cf7401a

Co-authored-by: tannergooding <10487869+tannergooding@users.noreply.github.com>
Copilot AI requested review from Copilot and removed request for Copilot April 23, 2026 17:51
Contributor Author

Copilot AI commented Apr 23, 2026

@copilot please go with option C. But note that in the full task list that the JIT item is expected to be unnecessary as all APIs should already have correct entries in the intrinsic...

Done in 6f34511: landed the ArmBase + Crc32 managed surface and the GenTreeHWIntrinsic aux-type refactor (CorInfoType → var_types, getBaseJitTypeAndSizeOfSIMDType/getBaseJitTypeOfSIMDType removed, callers migrated to the existing var_types-returning helpers). Skipped adding JIT intrinsic list entries as you noted. No new APIs beyond the approval. Baseline ./build.sh clr+libs -rc release is clean (0/0).

AdvSimd is deferred to follow-ups as agreed.

Copilot AI changed the title from "Extend System.Runtime.Intrinsics.Arm to support nint and nuint" to "Extend System.Runtime.Intrinsics.Arm to support nint and nuint (ArmBase + Crc32)" Apr 23, 2026
@tannergooding tannergooding marked this pull request as ready for review April 23, 2026 18:43
Copilot AI review requested due to automatic review settings April 23, 2026 18:43
Contributor

Copilot AI left a comment

Pull request overview

This PR implements the approved System.Runtime.Intrinsics.Arm API expansion to add nint/nuint overloads for a small, scoped subset (ArmBase + Crc32), and refactors the JIT’s GenTreeHWIntrinsic “auxiliary type” tracking from CorInfoType to var_types to better match existing JIT type plumbing.

Changes:

  • Added nint/nuint overloads to ArmBase (LeadingZeroCount, ReverseElementBits) and ArmBase.Arm64 (LeadingSignCount), plus nuint overloads to Crc32 (ComputeCrc32, ComputeCrc32C).
  • Refactored GenTreeHWIntrinsic auxiliary type storage/accessors to use var_types, updating all impacted JIT call sites and removing now-dead CorInfoType SIMD base-type helpers.
  • Extended the ARM intrinsic test generator inputs to cover the new ArmBase nint/nuint overloads with pointer-size-aware validation.

Reviewed changes

Copilot reviewed 14 out of 14 changed files in this pull request and generated 7 comments.

| File | Description |
| --- | --- |
| src/tests/Common/GenerateHWIntrinsicTests/Arm/BaseTests.cs | Adds generator entries for ArmBase nint/nuint overload coverage. |
| src/libraries/System.Runtime.Intrinsics/ref/System.Runtime.Intrinsics.cs | Updates ref assembly with the new nint/nuint overload signatures. |
| src/libraries/System.Private.CoreLib/src/System/Runtime/Intrinsics/Arm/ArmBase.cs | Adds nint/nuint overloads to the supported implementation surface. |
| src/libraries/System.Private.CoreLib/src/System/Runtime/Intrinsics/Arm/ArmBase.PlatformNotSupported.cs | Mirrors new ArmBase overloads for unsupported platforms. |
| src/libraries/System.Private.CoreLib/src/System/Runtime/Intrinsics/Arm/Crc32.cs | Adds nuint overloads to Crc32 supported implementation surface. |
| src/libraries/System.Private.CoreLib/src/System/Runtime/Intrinsics/Arm/Crc32.PlatformNotSupported.cs | Mirrors new Crc32 overloads for unsupported platforms. |
| src/coreclr/jit/gentree.h | Changes auxiliary type storage to var_types (gtAuxiliaryType) and updates accessors. |
| src/coreclr/jit/gentree.cpp | Updates cloning/folding/memory-load logic to use var_types auxiliary type APIs. |
| src/coreclr/jit/hwintrinsic.cpp | Migrates auxiliary type tracking to var_types and updates affected import paths. |
| src/coreclr/jit/hwintrinsicarm64.cpp | Migrates auxiliary type tracking to var_types for Arm64 intrinsics import paths. |
| src/coreclr/jit/hwintrinsicxarch.cpp | Migrates auxiliary type tracking to var_types for xarch intrinsic import paths. |
| src/coreclr/jit/valuenum.cpp | Migrates auxiliary type checks to var_types for ARM64 constant folding/value numbering. |
| src/coreclr/jit/simd.cpp | Removes the CorInfoType-returning SIMD base-type helper (now unused). |
| src/coreclr/jit/compiler.h | Removes declarations for the deleted CorInfoType SIMD base-type helpers. |

Comment thread src/coreclr/jit/gentree.h Outdated
… slots for nint/nuint dispatch

Agent-Logs-Url: https://github.com/dotnet/runtime/sessions/c2ab4e66-3233-4d8d-b691-668aa5db1675

Co-authored-by: tannergooding <10487869+tannergooding@users.noreply.github.com>
@tannergooding
Member

CC. @dotnet/jit-contrib, @EgorBo, @kg for review. This exposes the nint and nuint overloads for some Arm intrinsics in the ArmBase and Crc32 classes. A follow-up PR covering AdvSimd will be done once this is completed, and then the same again for the xarch APIs.

Copilot AI review requested due to automatic review settings April 25, 2026 01:00
Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 15 out of 15 changed files in this pull request and generated 5 comments.

Comments suppressed due to low confidence (1)

src/libraries/System.Runtime.Intrinsics/ref/System.Runtime.Intrinsics.cs:1

  • The overload ordering for ComputeCrc32* is now inconsistent (e.g., ushort appears after uint, while byte/nuint/uint are earlier). For ref assemblies this impacts how APIs are presented in diffs/reviews and can affect tooling that expects stable ordering. Reorder overloads consistently (commonly by increasing operand size: byte, ushort, uint, nuint/ulong as applicable) for both ComputeCrc32 and ComputeCrc32C.

Comment thread src/tests/Common/GenerateHWIntrinsicTests/Arm/BaseTests.cs Outdated
Comment thread src/tests/Common/GenerateHWIntrinsicTests/Arm/BaseTests.cs Outdated
Comment thread src/coreclr/jit/hwintrinsicarm64.cpp
Comment thread src/coreclr/jit/hwintrinsicarm64.cpp
@github-actions
Contributor

🤖 Copilot Code Review — PR #127327

Note

This review was generated by GitHub Copilot.

Holistic Assessment

Motivation: Justified and well-scoped. The PR implements a subset of the approved API proposal (#52027, api-approved label present) to extend System.Runtime.Intrinsics.Arm with nint/nuint overloads for ArmBase and Crc32. The auxiliary-type refactor from CorInfoType to var_types is tightly coupled — it simplifies the internal representation and removes ~300 lines of dead duplication.

Approach: Sound. Storing var_types directly in gtAuxiliaryType eliminates a redundant indirection layer (CorInfoType → var_types conversion). The HW intrinsic list change to populate 64-bit instruction slots for the 32-bit class entries is the cleanest way to enable nint/nuint dispatch without needing managed redirects to *.Arm64.*. The PR correctly scopes out the much larger AdvSimd surface to follow-up work.

Summary: ✅ LGTM. The refactor is semantically correct across all ~18 migrated call sites, the new APIs match the approved proposal exactly, and the HW intrinsic list entries are correct. One minor pre-existing typo is noted below. The PR has already been approved by @tannergooding.


Detailed Findings

✅ API Approval Verification — Approved via #52027

Issue #52027 has the api-approved label. The approved API shape (posted by @bartonjs on 2021-05-27) includes exactly the APIs added in this PR:

  • ArmBase.LeadingZeroCount(nint), ArmBase.LeadingZeroCount(nuint)
  • ArmBase.ReverseElementBits(nint), ArmBase.ReverseElementBits(nuint)
  • ArmBase.Arm64.LeadingSignCount(nint)
  • Crc32.ComputeCrc32(uint, nuint), Crc32.ComputeCrc32C(uint, nuint)

The ref assembly, src/ implementation, and PlatformNotSupported mirror all match the approved shape. No unapproved API surface is introduced. The remaining AdvSimd APIs are explicitly deferred to follow-up PRs, which is consistent with the PR description.

✅ Auxiliary Type Refactor — Semantically correct

The CorInfoType → var_types migration is verified correct at all key conversion points:

  • CORINFO_TYPE_PTR → TYP_U_IMPL: Confirmed that JitType2PreciseVarType(CORINFO_TYPE_PTR) returns TYP_U_IMPL (per ee_il_dll.hpp:278). The old code stored CORINFO_TYPE_PTR and compared via GetAuxiliaryJitType(); the new code stores TYP_U_IMPL directly and compares via GetAuxiliaryType(). Semantics preserved.
  • CORINFO_TYPE_UNDEF → TYP_UNKNOWN: Correct sentinel mapping. The old GetAuxiliaryType() converter explicitly returned TYP_UNKNOWN for CORINFO_TYPE_UNDEF.
  • CORINFO_TYPE_ULONG → TYP_ULONG: Direct mapping, verified in valuenum.cpp and gentree.cpp fold paths.
  • All getBaseJitTypeOfSIMDType() → getBaseTypeOfSIMDType(): These are the CorInfoType vs var_types returning variants of the same logic. The refactor correctly switches to the var_types variant at all call sites.

✅ StoreNarrowing ptrType Initialization — Dead store removed correctly

In hwintrinsicarm64.cpp:3182, the old code initialized ptrType via getBaseJitTypeOfSIMDType(argClass), but this value was immediately overwritten at lines 3189–3191 by strip(...) and getChildType(...). The new code initializes to CORINFO_TYPE_UNDEF (a safe sentinel), which is equally dead. No behavioral change.

✅ Dead Code Removal — getBaseJitTypeAndSizeOfSIMDType fully dead

Grep confirms zero remaining callers of getBaseJitTypeAndSizeOfSIMDType or getBaseJitTypeOfSIMDType after this PR. The ~300 lines removed from simd.cpp were a near-exact duplicate of getBaseTypeAndSizeOfSIMDType but returning CorInfoType instead of var_types. Good cleanup.

✅ HW Intrinsic List Entries — Correct instruction slots

The 10-slot instruction array maps to {TYP_BYTE, TYP_UBYTE, TYP_SHORT, TYP_USHORT, TYP_INT, TYP_UINT, TYP_LONG, TYP_ULONG, TYP_FLOAT, TYP_DOUBLE}. The PR populates slots 6–7 (TYP_LONG/TYP_ULONG) which is where nint/nuint map on 64-bit targets:

  • ArmBase.LeadingZeroCount: INS_clz at positions 6,7 ✓
  • ArmBase.ReverseElementBits: INS_rbit at positions 6,7 ✓
  • Crc32.ComputeCrc32: INS_crc32x at position 7 (TYP_ULONG only, since CRC data param is nuint) ✓
  • Crc32.ComputeCrc32C: INS_crc32cx at position 7 ✓

ArmBase.Arm64.LeadingSignCount already had INS_cls at position 6 — no change needed there. ✓
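
The slot lookup described above can be pictured with the following sketch. It is not the JIT's actual table format: `BaseTypeSketch`, `IntrinsicEntrySketch`, and `lookupInstruction` are hypothetical, instructions are represented as strings, and the contents of the slots not mentioned in this review (for example, the byte/halfword/word CRC forms) are assumptions added for illustration.

```cpp
#include <cassert>
#include <cstring>

// Slot order as listed in the review: one instruction slot per base type.
enum BaseTypeSketch
{
    kByte, kUByte, kShort, kUShort, kInt, kUInt, kLong, kULong, kFloat, kDouble,
    kBaseTypeCount
};

// Hypothetical table entry; nullptr marks "no instruction for this base type".
struct IntrinsicEntrySketch
{
    const char* name;
    const char* ins[kBaseTypeCount];
};

const char* lookupInstruction(const IntrinsicEntrySketch& entry, BaseTypeSketch baseType)
{
    assert(baseType >= 0 && baseType < kBaseTypeCount);
    return entry.ins[baseType];
}

// Slots 6 and 7 (kLong/kULong) are the ones this PR populates; slots 4 and 5
// are assumed to have been present already for the int/uint overloads.
const IntrinsicEntrySketch kLeadingZeroCountEntry = {
    "ArmBase_LeadingZeroCount",
    {nullptr, nullptr, nullptr, nullptr, "clz", "clz", "clz", "clz", nullptr, nullptr}};

// Only slot 7 (kULong) is new; the narrower slots are assumed contents.
const IntrinsicEntrySketch kComputeCrc32Entry = {
    "Crc32_ComputeCrc32",
    {nullptr, "crc32b", nullptr, "crc32h", nullptr, "crc32w", nullptr, "crc32x", nullptr, nullptr}};
```

Because nuint maps to the unsigned 64-bit base type on 64-bit targets, a lookup with `kULong` is what selects the 64-bit instruction for the new overloads, while the signed 64-bit slot of the CRC entry stays empty.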

✅ Ref Assembly Ordering — Alphabetical

The ref assembly changes maintain alphabetical ordering by parameter type name: int < nint < nuint < uint and byte < nuint < uint < ushort. The ushort overloads were correctly repositioned after uint to maintain this order.

✅ Test Coverage — Adequate for scoped APIs

New test generator inputs cover LeadingZeroCount(nint/nuint), ReverseElementBits(nint/nuint), and LeadingSignCount(nint) with bit-size-aware validation logic. The Crc32(nuint) tests are explicitly deferred (per PR description) due to platform-conditional expected values that the current test template doesn't support cleanly. This is a reasonable deferral.

💡 Pre-existing Typo — "than" → "that" in comment

gentree.h:6418: The comment reads "For intrinsics than need another type" — should be "that need". This typo predates the PR (it existed in the old gtAuxiliaryJitType comment too), so it's not blocking, but since the line was touched it would be a nice fix.

Contributing models: Claude Opus 4.6 (primary), Claude Haiku 4.5 (type mapping exploration). GPT-5.3-codex was launched but did not complete within the timeout window.

Generated by Code Review for issue #127327

…ors with T.LeadingZeroCount

Agent-Logs-Url: https://github.com/dotnet/runtime/sessions/d634db26-845d-4344-8cb7-00794c9dd590

Co-authored-by: tannergooding <10487869+tannergooding@users.noreply.github.com>
…Crc32C_UIntPtr)

Agent-Logs-Url: https://github.com/dotnet/runtime/sessions/d9436184-20aa-448e-87fa-61358beca1c7

Co-authored-by: tannergooding <10487869+tannergooding@users.noreply.github.com>
Copilot AI requested review from Copilot and removed request for Copilot April 25, 2026 01:49
@tannergooding tannergooding merged commit 9bff0b5 into main Apr 27, 2026
182 of 184 checks passed
@tannergooding tannergooding deleted the copilot/extend-arm-intrinsics-support-nint-nuint branch April 27, 2026 18:50
Labels

area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI

Development

Successfully merging this pull request may close these issues.

Extend System.Runtime.Intrinsics.Arm to support nint and nuint

4 participants