Allow O2K_CONST_VEC to handle all TYP_SIMD32/64 by tannergooding · Pull Request #127402 · dotnet/runtime

tannergooding · 2026-04-24T21:00:52Z

This provides O2K_CONST_VEC support for all SIMD constants by heap allocating for values larger than TYP_SIMD16. It is an alternative to #127390.

Extend the O2K_CONST_VEC assertion storage to handle TYP_SIMD32/64 via a heap-allocated simd_t* (m_bigSimdVal) in the existing union. SIMD8/12/16 continue to use the inline simd16_t. The active SIMD byte size is tracked in m_simdSize, unioned with m_encodedIconFlags to keep struct size unchanged. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

…equality Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

dotnet-policy-service · 2026-04-24T21:02:17Z

Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch
See info in area-owners.md if you want to be subscribed.

Copilot

Pull request overview

Enables O2K_CONST_VEC assertions to represent all SIMD constants (including TYP_SIMD32/64) by storing wider constants out-of-line, allowing assertion propagation and related optimizations to apply to more SIMD constant cases.

Changes:

Extend AssertionDsc SIMD-constant storage to support TYP_SIMD32/64 via heap allocation, while keeping TYP_SIMD8/12/16 inline.
Update assertion equality and constant propagation logic to use a size-tagged SIMD payload (GetSimdSize()/GetSimdConstant()).
Expand optAssertionGenJtrue handling to recognize Vector256/Vector512 equality/inequality intrinsics on xarch.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated no comments.

File	Description
src/coreclr/jit/compiler.h	Adds size-tagged SIMD constant payload handling in `AssertionDsc` with inline vs heap-backed storage for wider vectors.
src/coreclr/jit/assertionprop.cpp	Creates and propagates SIMD constant assertions for all SIMD sizes, and recognizes additional SIMD equality/inequality intrinsics.

…tNode When `op1` is not invariant or a local, `fgMakeMultiUse(&op1)` rewrites `op1` into `COMMA(STORE temp, LCL_VAR temp)` and returns a fresh load of `temp`. The STORE is the only place `temp` is written, so it must evaluate before any later read of `temp`. The non-AVX-512 branch passed the clean clone as the first argument of `AND_NOT` and the COMMA-wrapped tree as the second. `AND_NOT(a, b)` decomposes into `AND(a, NOT(b))`, so `a` evaluates first - meaning the `LCL_VAR temp` read happened before the STORE inside `b`, producing garbage for the non-NaN'd input. The bug was latent: prior to dotnet#127124 / dotnet#127402 the inner `IsNaN(op1)` expanded into real per-element compares that kept enough materialization around to mask the bad ordering. With SIMD32/64 constant propagation, `CompareNotEqual(temp, temp)` value-numbers as AllBitsSet and the entire right subtree collapses to constants, leaving only the broken left-side read - which is what the `Vector512Tests.ConvertToInt32Test` failure on non-AVX-512 hosts (libraries-jitstress-random, nativeaot-outerloop, iossimulator) was actually exercising. Fix: pass `op1` (the COMMA, evaluated first) as `AND_NOT`'s first argument and use the side-effect-free `op1Clone1` for the IsNaN check. Verified by repro on a non-AVX-512 host (DOTNET_EnableAVX512=0): Vector512.ConvertToInt32(Vector512.Create(float.MinValue)) now returns Vector512<int>.Create(int.MinValue) as expected. SPMI benchmarks.run replay clean. Fixes dotnet#127440. Supersedes dotnet#127499 (which gated unreachable code in `impSpecialIntrinsic` - `NI_Vector512_ConvertToInt32` is already filtered upstream by `lookupId` on non-AVX-512 hosts, so that gate did not actually fix the failure). Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

@rwd00

…tNode (#127524) Fixes a use-before-def in the non-AVX-512 saturation path of `Compiler::gtNewSimdCvtNode` that produces wrong results for `Vector{128,256,512}.ConvertToInt32(Vector*<float>)` when the input is not invariant or a local. ## Repro (DOTNET_EnableAVX512=0) ```cs [MethodImpl(MethodImplOptions.NoInlining)] static Vector512<int> Test() => Vector512.ConvertToInt32(Vector512.Create(float.MinValue)); ``` Before this PR: ``` <0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 2147483647, 0, 0, 0> ``` After this PR: ``` <-2147483648, -2147483648, -2147483648, -2147483648, -2147483648, -2147483648, -2147483648, -2147483648, -2147483648, -2147483648, -2147483648, -2147483648, -2147483648, -2147483648, -2147483648, -2147483648> ``` ## Root cause ```cpp GenTree* op1Clone1 = fgMakeMultiUse(&op1); // op1 := COMMA(STORE temp = orig, LCL_VAR temp); op1Clone1 := fresh LCL_VAR temp GenTree* mask1 = gtNewSimdIsNaNNode(type, op1, ...); // uses op1 (the COMMA) fixupVal = gtNewSimdBinOpNode(GT_AND_NOT, op1Clone1, mask1, …); // AND_NOT(LCL_VAR temp, IsNaN(COMMA(STORE temp, ...))) ``` `AND_NOT(a, b)` decomposes into `AND(a, NOT(b))`, so the left operand `a` evaluates first. The `LCL_VAR temp` read happens **before** the `STORE temp` that lives inside the COMMA on the right, so the AND consumes whatever was on the stack. Disasm on main with AVX-512 disabled (only the relevant block, V05 / V08 are the temps in question): ```asm vpcmpeqd ymm0, ymm0, ymm0 vandps ymm0, ymm0, ymmword ptr [rsp+0x20] ; reads V05 - never written vbroadcastss ymm1, dword ptr [reloc @rwd00] vcmpgeps ymm2, ymm0, ymm1 vbroadcastss ymm3, dword ptr [reloc @RWD04] vcvttps2dq ymm0, ymm0 vpblendvb ymm0, ymm0, ymm3, ymm2 vmovups ymm2, ymmword ptr [rsp] ; reads V08 - never written ... ``` The bug has existed since `gtNewSimdCvtNode` was first introduced; it stayed latent because pre-#127124 / #127402 the inner `IsNaN(op1)` expanded into per-element compares that kept enough materialization around to mask the bad ordering. With SIMD32/64 constant propagation, `CompareNotEqual(temp, temp)` value-numbers as AllBitsSet and the whole right subtree collapses to constants, leaving only the broken left-side read - which is exactly what `Vector512Tests.ConvertToInt32Test` started catching on non-AVX-512 hosts. ## Fix Two-line swap: pass `op1` (the COMMA, evaluated first) as `AND_NOT`'s left arg; use the clone for the IsNaN check. ```cpp GenTree* op1Clone1 = fgMakeMultiUse(&op1); GenTree* mask1 = gtNewSimdIsNaNNode(type, op1Clone1, simdSourceBaseType, simdSize); fixupVal = gtNewSimdBinOpNode(GT_AND_NOT, type, op1, mask1, simdSourceBaseType, simdSize); ``` Now the COMMA evaluates first, the STORE happens, both subsequent reads of the temp get the correct value. ## Validation - Repro produces correct output on AVX-512 disabled and AVX-512 enabled. - `Vector512Tests.ConvertToInt32Test` / `ConvertToInt32NativeTest` pass with `DOTNET_EnableAVX512=0`. - SuperPMI replay against `benchmarks.run` clean (38409 contexts, 0 failures, 0 asserts). ## Note on #127499 That PR adds an AVX-512 gate in `impSpecialIntrinsic`'s `NI_Vector512_ConvertToInt32` case. As @tannergooding pointed out in #127499 (comment) / #127499 (comment), that case is already unreachable on non-AVX-512 hosts because `Compiler::lookupId` returns `NI_Illegal` for `Vector512` ISA when AVX-512 is not opportunistically supported. I verified that applying #127499 alone leaves the test failing - the gate it adds is dead code. This PR addresses the actual bug; #127499 can be closed. Fixes #127440. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

EgorBo and others added 4 commits April 24, 2026 17:48

JIT: enable assertion creation for Vector256/Vector512 op_Equality/In…

9715868

…equality Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

JIT: simplify SIMD size lookup using genTypeSize

6615f82

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Fix build failures

0327f8a

Copilot AI review requested due to automatic review settings April 24, 2026 21:00

github-actions Bot added the area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI label Apr 24, 2026

dotnet-policy-service Bot assigned tannergooding Apr 24, 2026

Copilot started reviewing on behalf of tannergooding April 24, 2026 21:01 View session

tannergooding mentioned this pull request Apr 24, 2026

Allow O2K_CONST_VEC to handle all TYP_SIMD32/64 #127401

Closed

Copilot AI reviewed Apr 24, 2026

View reviewed changes

tannergooding marked this pull request as ready for review April 25, 2026 00:00

EgorBo approved these changes Apr 25, 2026

View reviewed changes

tannergooding mentioned this pull request Apr 25, 2026

Allow O2K_CONST_VEC to handle some TYP_SIMD32/64 #127390

Closed

tannergooding enabled auto-merge (squash) April 25, 2026 00:05

tannergooding merged commit 51cfc7a into dotnet:main Apr 25, 2026
137 of 141 checks passed

dotnet-maestro Bot mentioned this pull request Apr 26, 2026

[main] Source code updates from dotnet/runtime dotnet/dotnet#6275

Merged

mangod9 mentioned this pull request Apr 28, 2026

Test failure: System.Runtime.Intrinsics.Tests.Vectors.Vector512Tests.ConvertToInt32Test #127440

Closed

Copilot AI mentioned this pull request Apr 28, 2026

JIT: gate Vector512.ConvertToInt32(Native) on AVX-512 #127499

Closed

EgorBo mentioned this pull request Apr 28, 2026

JIT: fix Vector{128,256}.ConvertToInt32 use-before-def in gtNewSimdCvtNode #127524

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Allow O2K_CONST_VEC to handle all TYP_SIMD32/64#127402

Allow O2K_CONST_VEC to handle all TYP_SIMD32/64#127402
tannergooding merged 4 commits intodotnet:mainfrom
tannergooding:jit/wide-simd-assertions

tannergooding commented Apr 24, 2026

Uh oh!

dotnet-policy-service Bot commented Apr 24, 2026

Uh oh!

Copilot AI left a comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Conversation

tannergooding commented Apr 24, 2026

Uh oh!

dotnet-policy-service Bot commented Apr 24, 2026

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants