JIT: gate Vector512.ConvertToInt32(Native) on AVX-512 #127499

Closed
Copilot wants to merge 2 commits into main from copilot/fix-vector512-converttoint32-test

Contributor

Copilot AI commented Apr 28, 2026

Vector512.ConvertToInt32 and ConvertToInt32Native produce wrong results on non-AVX-512 x64 hardware (e.g. only 4 of 16 lanes populated, incorrect saturation values). Recent Vector512 constant propagation in the JIT (#127124, #127402) caused intrinsic expansion to reach a code path that unconditionally emits AVX-512-only instructions, exposing an ISA gate that has been missing since the code was introduced in #84932.

Description

  • src/coreclr/jit/hwintrinsicxarch.cpp — Add an AVX-512 gate for NI_Vector512_ConvertToInt32 and NI_Vector512_ConvertToInt32Native, mirroring the existing ConvertToInt64 pattern. When the gate fails, retNode stays null and the call falls back to the managed implementation (which splits into two Vector256.ConvertToInt32 operations).

    case NI_Vector128_ConvertToInt32:
    case NI_Vector256_ConvertToInt32:
    case NI_Vector512_ConvertToInt32:
    {
        assert(sig->numArgs == 1);
        assert(simdBaseType == TYP_FLOAT);
    
        if ((simdSize == 64) && !compOpportunisticallyDependsOn(InstructionSet_AVX512))
        {
            break;
        }
    
        op1     = impSIMDPopStack();
        retNode = gtNewSimdCvtNode(retType, op1, TYP_INT, simdBaseType, simdSize);
        break;
    }
  • src/coreclr/jit/gentree.cpp — Tighten the assertions in gtNewSimdCvtNode and gtNewSimdCvtNativeNode so simdSize == 64 always requires AVX-512, catching any future bypass of the gate in Debug/Checked builds:

    assert(compIsaSupportedDebugOnly(InstructionSet_AVX512) ||
           ((simdTargetBaseType == TYP_INT) && (simdSize != 64)));

The existing Vector512Tests.ConvertToInt32Test / ConvertToInt32NativeTest cover this; both pass on a non-AVX512 host with the fix applied.
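The gate-then-fall-back behavior described above can be modeled outside the JIT. The following is a minimal sketch, not the runtime's code: `HostCaps`, `TryExpandAsSingleOp`, and `ConvertLanes` are hypothetical names, and the per-lane conversion assumes saturating semantics (NaN becomes 0, out-of-range inputs clamp to the int32 limits). It shows that returning "no expansion" on a host without AVX-512 and converting the two 8-lane halves separately yields the same 16 lanes as the single wide operation.

```cpp
#include <array>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <limits>

// Hypothetical stand-in for the JIT's ISA query; not a real runtime API.
struct HostCaps
{
    bool avx512;
};

// One lane of a saturating float -> int32 conversion (assumed semantics):
// NaN becomes 0 and out-of-range inputs clamp to the int32 limits.
inline std::int32_t SaturatingToInt32(float value)
{
    if (std::isnan(value))
        return 0;
    if (value >= 2147483648.0f)
        return std::numeric_limits<std::int32_t>::max();
    if (value <= -2147483648.0f)
        return std::numeric_limits<std::int32_t>::min();
    return static_cast<std::int32_t>(value); // in range: truncate toward zero
}

// Converts Count lanes starting at offset; models one fixed-width vector op.
template <std::size_t Count>
void ConvertLanes(const std::array<float, 16>& in, std::array<std::int32_t, 16>& out, std::size_t offset)
{
    for (std::size_t i = 0; i < Count; i++)
        out[offset + i] = SaturatingToInt32(in[offset + i]);
}

// Models the gate: returns false (analogous to leaving retNode null) when the
// host lacks AVX-512, so the caller must use the fallback.
bool TryExpandAsSingleOp(const HostCaps& caps, const std::array<float, 16>& in, std::array<std::int32_t, 16>& out)
{
    if (!caps.avx512)
        return false;
    ConvertLanes<16>(in, out, 0);
    return true;
}

// Models the call site: single wide op when allowed, otherwise two
// half-width conversions (the "managed fallback" in this sketch).
std::array<std::int32_t, 16> ConvertToInt32(const HostCaps& caps, const std::array<float, 16>& in)
{
    std::array<std::int32_t, 16> out{};
    if (!TryExpandAsSingleOp(caps, in, out))
    {
        ConvertLanes<8>(in, out, 0);
        ConvertLanes<8>(in, out, 8);
    }
    return out;
}
```

Because the conversion is lane-wise, the split path and the wide path are interchangeable in this model; the gate only decides which one runs.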

Copilot AI requested review from Copilot and removed request for Copilot April 28, 2026 11:38
Copilot AI changed the title from "[WIP] Fix test failure in Vector512Tests.ConvertToInt32Test" to "JIT: gate Vector512.ConvertToInt32(Native) on AVX-512" on Apr 28, 2026
Copilot AI requested a review from EgorBo April 28, 2026 11:42
assert(sig->numArgs == 1);
assert(simdBaseType == TYP_FLOAT);

if ((simdSize == 64) && !compOpportunisticallyDependsOn(InstructionSet_AVX512))
Contributor

Does Vector256 not need AVX here too?

Member

No - Vector256.ConvertToInt32 is already gated upstream in lookupId (returns NI_Illegal on no-AVX hosts, falling back to the managed body). On AVX-without-AVX2 it's allowed via HW_Flag_AvxOnlyCompatible and gtNewSimdCvtNativeNode emits a plain AVX vcvttps2dq ymm. Vector512 is the odd one out because there's no non-AVX-512 single-op equivalent for the 64-byte conversion.

Member

This should also be gated upstream, however, as we shouldn't even be getting simdSize == 64 if it's not accelerated.

Member

Right here: https://github.com/dotnet/runtime/blob/main/src/coreclr/jit/hwintrinsic.cpp#L1371-L1377

So we should never even be producing an intrinsic ID for this in the first place.
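To make the "gated upstream" point concrete, here is a small sketch of filtering at ID-resolution time. It is an illustration under assumptions, not the contents of `hwintrinsic.cpp`: the enum, `HostIsa`, and `LookupConvertToInt32` are hypothetical, and the real lookup is not reproduced here. The idea is that an unsupported vector width never yields an intrinsic ID, so later stages never see `simdSize == 64` on a host without AVX-512.

```cpp
#include <cassert>

// Hypothetical intrinsic IDs for this sketch only.
enum class IntrinsicId
{
    Illegal, // no accelerated form: the caller uses the managed body
    Vector128_ConvertToInt32,
    Vector256_ConvertToInt32,
    Vector512_ConvertToInt32,
};

// Hypothetical description of what the host supports.
struct HostIsa
{
    bool avx;
    bool avx512;
};

// Resolves an ID for ConvertToInt32 at the given vector width in bytes.
// Widths the host cannot accelerate resolve to Illegal, so no later stage
// has to re-check the ISA for them.
IntrinsicId LookupConvertToInt32(unsigned simdSize, const HostIsa& isa)
{
    switch (simdSize)
    {
        case 16:
            return IntrinsicId::Vector128_ConvertToInt32; // assumed baseline
        case 32:
            return isa.avx ? IntrinsicId::Vector256_ConvertToInt32 : IntrinsicId::Illegal;
        case 64:
            return isa.avx512 ? IntrinsicId::Vector512_ConvertToInt32 : IntrinsicId::Illegal;
        default:
            return IntrinsicId::Illegal;
    }
}
```

With filtering at this level, a second check in the importer for the 64-byte case would be unreachable in this model, which matches the concern raised in the thread.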

mangod9 added the area-CodeGen-coreclr label and removed the area-System.Reflection label on Apr 28, 2026
@dotnet-policy-service
Contributor

Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch
See info in area-owners.md if you want to be subscribed.

Member

EgorBo commented Apr 28, 2026

closing in favor of #127524

EgorBo closed this on Apr 28, 2026
EgorBo added a commit to EgorBo/runtime-1 that referenced this pull request Apr 28, 2026
…tNode

When `op1` is not invariant or a local, `fgMakeMultiUse(&op1)` rewrites
`op1` into `COMMA(STORE temp, LCL_VAR temp)` and returns a fresh load of
`temp`. The STORE is the only place `temp` is written, so it must
evaluate before any later read of `temp`.

The non-AVX-512 branch passed the clean clone as the first argument of
`AND_NOT` and the COMMA-wrapped tree as the second. `AND_NOT(a, b)`
decomposes into `AND(a, NOT(b))`, so `a` evaluates first - meaning the
`LCL_VAR temp` read happened before the STORE inside `b`, producing
garbage for the non-NaN'd input.

The bug was latent: prior to dotnet#127124 / dotnet#127402 the inner `IsNaN(op1)`
expanded into real per-element compares that kept enough materialization
around to mask the bad ordering. With SIMD32/64 constant propagation,
`CompareNotEqual(temp, temp)` value-numbers as AllBitsSet and the entire
right subtree collapses to constants, leaving only the broken left-side
read - which is what the `Vector512Tests.ConvertToInt32Test` failure on
non-AVX-512 hosts (libraries-jitstress-random, nativeaot-outerloop,
iossimulator) was actually exercising.

Fix: pass `op1` (the COMMA, evaluated first) as `AND_NOT`'s first
argument and use the side-effect-free `op1Clone1` for the IsNaN check.

Verified by repro on a non-AVX-512 host (DOTNET_EnableAVX512=0):
  Vector512.ConvertToInt32(Vector512.Create(float.MinValue))
now returns Vector512<int>.Create(int.MinValue) as expected. SPMI
benchmarks.run replay clean.

Fixes dotnet#127440.
Supersedes dotnet#127499 (which gated unreachable code in `impSpecialIntrinsic`
- `NI_Vector512_ConvertToInt32` is already filtered upstream by
`lookupId` on non-AVX-512 hosts, so that gate did not actually fix the
failure).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
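The ordering hazard described in the commit message above can be reproduced with a toy evaluator. This is a sketch under stated assumptions rather than JIT code: `Frame`, `Expr`, and the helper names are hypothetical, values are scalar 32-bit patterns instead of vectors, and the sentinel `0xDEADBEEF` stands in for whatever the temp held before the STORE. `AndNot` evaluates its first operand before its second, as the commit message states.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>

// Toy frame with a single temp. The sentinel stands in for whatever the temp
// held before the STORE ran.
struct Frame
{
    std::uint32_t temp = 0xDEADBEEFu;
};

using Expr = std::function<std::uint32_t(Frame&)>;

// LCL_VAR temp: a plain read of the temp.
Expr LoadTemp()
{
    return [](Frame& f) { return f.temp; };
}

// COMMA(STORE temp, LCL_VAR temp): write the temp, then yield its value.
Expr StoreThenLoad(std::uint32_t value)
{
    return [value](Frame& f) {
        f.temp = value;
        return f.temp;
    };
}

// All-ones mask when the operand's bits encode a float NaN, else zero.
Expr IsNaNMask(Expr operand)
{
    return [operand](Frame& f) {
        std::uint32_t bits = operand(f);
        bool isNaN = ((bits & 0x7F800000u) == 0x7F800000u) && ((bits & 0x007FFFFFu) != 0);
        return isNaN ? 0xFFFFFFFFu : 0u;
    };
}

// AND_NOT(a, b) == AND(a, NOT(b)); a is evaluated before b.
Expr AndNot(Expr a, Expr b)
{
    return [a, b](Frame& f) {
        std::uint32_t lhs = a(f);
        std::uint32_t rhs = b(f);
        return lhs & ~rhs;
    };
}

// Broken shape: the plain read comes first, the STORE is buried in operand b.
Expr BuggyOrder(std::uint32_t input)
{
    return AndNot(LoadTemp(), IsNaNMask(StoreThenLoad(input)));
}

// Fixed shape: the COMMA (with the STORE) is operand a, the clone feeds IsNaN.
Expr FixedOrder(std::uint32_t input)
{
    return AndNot(StoreThenLoad(input), IsNaNMask(LoadTemp()));
}
```

In this model, `BuggyOrder` returns the sentinel for a non-NaN input because the read precedes the write, while `FixedOrder` returns the input unchanged and zeroes a NaN input.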
Labels

area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI

Development

Successfully merging this pull request may close these issues.

Test failure: System.Runtime.Intrinsics.Tests.Vectors.Vector512Tests.ConvertToInt32Test

5 participants