JIT: Accelerate floating->long casts on x86 by saucecontrol · Pull Request #125180 · dotnet/runtime

saucecontrol · 2026-03-04T15:43:15Z

This adds floating->long/ulong cast codegen for AVX-512 and AVX10.2 on x86. With this, all non-overflow casts are now hardware accelerated. This is the last bit pulled from #116805.

Typical Diff (double->long AVX-512):

-       sub      esp, 8
-       vzeroupper 
-       vmovsd   xmm0, qword ptr [esp+0x0C]
-       sub      esp, 8
-       ; npt arg push 0
-       ; npt arg push 1
-       vmovsd   qword ptr [esp], xmm0
-       call     CORINFO_HELP_DBL2LNG
-       ; gcr arg pop 2
+       vmovsd   xmm0, qword ptr [esp+0x04]
+       vcmpordsd k1, xmm0, xmm0
+       vcmpge_oqsd k2, xmm0, qword ptr [@RWD00]
+       vcvttpd2qq xmm0 {k1}{z}, xmm0
+       vpblendmq xmm0 {k2}, xmm0, qword ptr [@RWD08] {1to2}
+       vmovd    eax, xmm0
+       vpextrd  edx, xmm0, 1
-       add      esp, 8
        ret      8

+RWD00  	dq	43E0000000000000h
+RWD08  	dq	7FFFFFFFFFFFFFFFh
 
-; Total bytes of code 31
+; Total bytes of code 53

Full Diffs

Breakdown of the double->long asm:

; load the scalar double
vmovsd   xmm0, qword ptr [esp+0x04]

; set the low bit of k1 if the scalar value is not NaN
vcmpordsd k1, xmm0, xmm0

; set the low bit of k2 if the input was greater than or equal to 2^63 (nearest double greater than long.MaxValue)
vcmpge_oqsd k2, xmm0, qword ptr [@RWD00]

; convert with truncation under the k1 zeroing mask. if the mask bit is not set (meaning the input is NaN), the result is zero
vcvttpd2qq xmm0 {k1}{z}, xmm0

; if the low bit of k2 is set (meaning overflow), set the value to long.MaxValue, otherwise take the conversion result
vpblendmq xmm0 {k2}, xmm0, qword ptr [@RWD08] {1to2}

; extract the two 32-bit halves of the long result
vmovd    eax, xmm0
vpextrd  edx, xmm0, 1
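
As a rough scalar model of the sequence above, the following C++ sketch mirrors the three cases the asm handles: NaN, positive overflow, and the ordinary in-range conversion. The function names `bits_to_double` and `double_to_long_saturating` are made up for illustration and do not appear in the PR. The negative-overflow branch is an assumption based on the documented behavior of `vcvttpd2qq`, which returns 0x8000000000000000 (the same bit pattern as long.MinValue) for out-of-range inputs; the PR description does not spell this case out.

```cpp
#include <cmath>
#include <cstdint>
#include <cstring>
#include <limits>

// Reinterpret a 64-bit pattern as a double (used here to decode the RWD00 constant).
double bits_to_double(std::uint64_t bits) {
    double d;
    std::memcpy(&d, &bits, sizeof d);
    return d;
}

// Hypothetical scalar model of the saturating double->long cast; not JIT code.
std::int64_t double_to_long_saturating(double d) {
    // RWD00 from the diff: the bit pattern of 2^63 as a double.
    const double two63 = bits_to_double(0x43E0000000000000ULL);

    // k1 clear (unordered compare): the zeroing mask produces 0.
    if (std::isnan(d)) return 0;

    // k2 set: the blend substitutes RWD08 (long.MaxValue).
    if (d >= two63) return std::numeric_limits<std::int64_t>::max();

    // Assumption: for out-of-range negatives the instruction itself yields
    // 0x8000000000000000, which is long.MinValue. Modeled explicitly here
    // because the equivalent C++ cast would be undefined behavior.
    if (d < -two63) return std::numeric_limits<std::int64_t>::min();

    // In range: truncate toward zero, like the "tt" in vcvttpd2qq.
    return static_cast<std::int64_t>(d);
}
```

Under this model, `double_to_long_saturating(42.7)` truncates toward zero, while NaN and out-of-range inputs take the early returns instead of reaching the cast.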

Copilot

Pull request overview

This PR extends x86 JIT codegen to hardware-accelerate non-overflow floating→long/ulong casts using AVX-512 and AVX10.2, completing the remaining cast-acceleration work pulled from #116805.

Changes:

- Teach cast helper selection to allow floating↔long casts to stay intrinsic-based on x86 when AVX-512 is available.
- Add/extend x86 long decomposition logic to generate AVX-512/AVX10.2 sequences for floating→long/ulong and long→floating casts.
- Introduce a new AVX-512 scalar compare-mask intrinsic and wire it up for immediate bounds + containment.

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| src/coreclr/jit/lowerxarch.cpp | Refactors vector constant construction and adds containment support for the new AVX-512 scalar compare-mask intrinsic. |
| src/coreclr/jit/hwintrinsicxarch.cpp | Adds immediate upper-bound handling for the new AVX-512 scalar compare-mask intrinsic. |
| src/coreclr/jit/hwintrinsiclistxarch.h | Introduces `AVX512.CompareScalarMask` as a new intrinsic mapping to `vcmpss/vcmpsd` with IMM. |
| src/coreclr/jit/flowgraph.cpp | Updates helper-requirement logic so x86 floating↔long casts can avoid helper calls when AVX-512 is available. |
| src/coreclr/jit/decomposelongs.cpp | Implements the AVX-512/AVX10.2-based lowering/decomposition sequences for floating↔long/ulong on x86. |

Copilot

Pull request overview

Copilot reviewed 5 out of 5 changed files in this pull request and generated no new comments.

saucecontrol · 2026-03-04T19:32:01Z

@dotnet/jit-contrib this is ready for review

diffs

Copilot

Pull request overview

Copilot reviewed 6 out of 6 changed files in this pull request and generated 2 comments.

JulieLeeMSFT · 2026-04-13T12:08:48Z

@EgorBo, please review this community PR.

Copilot

Pull request overview

Copilot reviewed 6 out of 6 changed files in this pull request and generated 1 comment.

Comments suppressed due to low confidence (1)

src/coreclr/jit/flowgraph.cpp:1347

fgCastRequiresHelper on x86 currently only exempts long<->floating casts when InstructionSet_AVX512 is enabled. This PR adds long/floating cast acceleration that can use AVX10.2 (InstructionSet_AVX10v2) as well (e.g., DecomposeLongs::DecomposeCast checks compOpportunisticallyDependsOn(InstructionSet_AVX10v2)). If AVX10v2 is enabled while AVX512 is disabled/unavailable, morphing may still force a helper call and bypass the new codegen. Consider updating the x86 condition to treat AVX10v2 as sufficient (e.g., require helper only when neither AVX512 nor AVX10v2 is available).

#if defined(TARGET_X86) || defined(TARGET_ARM)
    if ((varTypeIsLong(fromType) && varTypeIsFloating(toType)) ||
        (varTypeIsFloating(fromType) && varTypeIsLong(toType)))
    {
#if defined(TARGET_X86)
        return !compOpportunisticallyDependsOn(InstructionSet_AVX512);
#else
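
The rule suggested in the suppressed comment above (require the helper only when neither AVX-512 nor AVX10.2 is available) can be sketched in isolation. This is only a model of the suggestion, not the merged code: `IsaSupport` and `castRequiresHelperX86` are hypothetical stand-ins for the JIT's `compOpportunisticallyDependsOn` queries, and whether the real change is needed depends on how the JIT relates the two instruction sets.

```cpp
// Hypothetical stand-in for the JIT's instruction-set queries; not the real Compiler API.
struct IsaSupport {
    bool avx512;   // models InstructionSet_AVX512 being usable
    bool avx10v2;  // models InstructionSet_AVX10v2 being usable
};

// Returns true when a long<->floating cast on x86 would still go through a
// helper call under the suggested rule: only if neither ISA is available.
bool castRequiresHelperX86(const IsaSupport& isa) {
    return !(isa.avx512 || isa.avx10v2);
}
```

With this predicate, a configuration that reports AVX10.2 without AVX-512 would skip the helper, which is the case the comment is concerned about.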

Copilot

Pull request overview

Copilot reviewed 6 out of 6 changed files in this pull request and generated 1 comment.

Copilot

Pull request overview

Copilot reviewed 6 out of 6 changed files in this pull request and generated no new comments.

tannergooding

CC. @dotnet/jit-contrib, @EgorBo, @kg for secondary review on the community PR

saucecontrol · 2026-04-22T19:02:00Z

Pushed a change to simplify the IR. No change to the codegen.

Copilot AI review requested due to automatic review settings March 4, 2026 15:43

dotnet-policy-service Bot added the community-contribution label Mar 4, 2026

github-actions Bot added the area-CodeGen-coreclr label Mar 4, 2026

Copilot started reviewing on behalf of saucecontrol March 4, 2026 15:44

Copilot AI reviewed Mar 4, 2026

Comment thread src/coreclr/jit/decomposelongs.cpp Outdated

Comment thread src/coreclr/jit/decomposelongs.cpp Outdated

Copilot AI review requested due to automatic review settings March 4, 2026 16:08

Copilot started reviewing on behalf of saucecontrol March 4, 2026 16:09

Copilot AI reviewed Mar 4, 2026

saucecontrol added 2 commits March 4, 2026 09:46

accelerate floating->long casts on x86

47a6bc8

rename variable

9b393fb

saucecontrol force-pushed the lng2flt6 branch from dfda2d3 to 9b393fb Compare March 4, 2026 17:52

saucecontrol marked this pull request as ready for review March 4, 2026 19:31

Copilot AI review requested due to automatic review settings March 4, 2026 19:31

Copilot started reviewing on behalf of saucecontrol March 4, 2026 19:32

Copilot AI reviewed Mar 4, 2026

Comment thread src/coreclr/jit/decomposelongs.cpp

Comment thread src/coreclr/jit/decomposelongs.cpp

build-analysis Bot mentioned this pull request Mar 5, 2026

Android WebSocket failure #121518

Open

JulieLeeMSFT requested a review from EgorBo April 13, 2026 12:08

Merge branch 'main' into lng2flt6

708800e

Merge remote-tracking branch 'upstream/main' into lng2flt6

e656c5f

Copilot AI review requested due to automatic review settings April 16, 2026 19:48

Copilot started reviewing on behalf of saucecontrol April 16, 2026 19:49

Copilot AI reviewed Apr 16, 2026

Comment thread src/coreclr/jit/hwintrinsiclistxarch.h Outdated

Update src/coreclr/jit/hwintrinsiclistxarch.h

bfede63

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

Copilot AI review requested due to automatic review settings April 16, 2026 19:55

saucecontrol added 2 commits April 19, 2026 21:34

Merge remote-tracking branch 'upstream/main' into lng2flt6

b828000

add costing info

27c7ff2

Copilot AI review requested due to automatic review settings April 20, 2026 04:36

Copilot started reviewing on behalf of saucecontrol April 20, 2026 04:37

Copilot AI reviewed Apr 20, 2026

Comment thread src/coreclr/jit/gentree.cpp

fix comment typo

23fbdbf

This was referenced Apr 20, 2026

[wasm] WBT SatelliteAssembliesTests.CheckThatSatelliteAssembliesAreNotAOTed failing #90458

Open

Unable to pull image from mcr.microsoft.com #117164

Open

tannergooding reviewed Apr 20, 2026

Comment thread src/coreclr/jit/decomposelongs.cpp Outdated

tannergooding added the needs-author-action label Apr 21, 2026

use normal VEC_CNS for long.MaxValue

c8f26cb

Copilot AI review requested due to automatic review settings April 22, 2026 00:47

Copilot started reviewing on behalf of saucecontrol April 22, 2026 00:48

dotnet-policy-service Bot removed the needs-author-action label Apr 22, 2026

Copilot AI reviewed Apr 22, 2026

build-analysis Bot mentioned this pull request Apr 22, 2026

browser-wasm linux Release LibraryTests queues timing out #117974

Open

tannergooding reviewed Apr 22, 2026

Comment thread src/coreclr/jit/decomposelongs.cpp Outdated

tannergooding approved these changes Apr 22, 2026

tannergooding requested a review from kg April 22, 2026 17:30

use BlendVariableMask directly

858db2c

Copilot AI mentioned this pull request Apr 23, 2026

Fix race condition: set _canceled before SignalCore in ProcessWaitState #127312

Merged

tannergooding approved these changes Apr 23, 2026

kg approved these changes Apr 29, 2026

tannergooding enabled auto-merge (squash) April 29, 2026 05:30

tannergooding merged commit d1163e5 into dotnet:main Apr 29, 2026
135 of 137 checks passed

saucecontrol deleted the lng2flt6 branch April 29, 2026 16:51
