
[release/5.0] Prevent emitting Avx2 instruction for Vector256<T>.AllBitsSet and Vector256.Create(-1) when Avx2 is not supported#48630

Merged: Anipik merged 1 commit into dotnet:release/5.0 from echesakov:Rel5.0-Runtime_48283 on Mar 10, 2021

@echesakov
Contributor

This is a backport of #48383 to 5.0.
Closes #48283

Customer Impact

In the following two cases

 var ones = Vector256<int>.AllBitsSet;

and

var minusOnes = Vector256.Create(-1);

on x86 and x64, the JIT emits the instruction vpcmpeqd ymmReg, ymmReg, ymmReg (which requires the Avx2 CPU feature; see pcmpeqb:pcmpeqw:pcmpeqd) even on a machine that doesn't support that feature.

For example, the issue occurs with CPUs that belong to Sandy Bridge or Ivy Bridge microarchitectures.

As you can see in #48283, a customer originally observed the issue on an Ivy Bridge machine. Executing such code on their machine resulted in a crash, after which the OS terminated the process.
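A minimal repro sketch of the two patterns above, for illustration only. The class and method names are hypothetical and are not taken from the PR or from Runtime_48283.cs. Per the description, on an affected 5.0 build running on a CPU with Avx but without Avx2, either method could hit the illegal instruction; on a fixed build both should return the same all-ones vector.

```csharp
using System;
using System.Runtime.Intrinsics;
using System.Runtime.Intrinsics.X86;

public static class AllBitsSetRepro
{
    // The two patterns from the description, kept in separate methods
    // so that each can be inspected on its own in a JIT disassembly.
    public static Vector256<int> ViaAllBitsSet() => Vector256<int>.AllBitsSet;

    public static Vector256<int> ViaCreate() => Vector256.Create(-1);

    public static void Main()
    {
        // Report which ISAs the runtime believes are available.
        Console.WriteLine($"Avx.IsSupported  = {Avx.IsSupported}");
        Console.WriteLine($"Avx2.IsSupported = {Avx2.IsSupported}");

        Vector256<int> ones = ViaAllBitsSet();
        Vector256<int> minusOnes = ViaCreate();

        // Both vectors should hold -1 in every 32-bit lane.
        Console.WriteLine($"Vectors equal: {ones.Equals(minusOnes)}");
        Console.WriteLine($"Lane 0: {ones.GetElement(0)}");
    }
}
```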

Validation

I don't have access to a machine that doesn't support Avx2. Therefore I used the same methodology we use in CI for ISA-related test scenarios (e.g. runtime-coreclr jitstress-isas-x86) where we set corresponding COMPlus environment variables to simulate that a CPU feature is not supported.
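To make the simulation approach concrete, here is a small sketch that reports which of the two knobs (COMPlus_EnableAVX and COMPlus_EnableAVX2) are set in the environment and what the runtime then reports as supported. The class name and the Classify helper are hypothetical; the labels it returns (noavx, avx, avx2) are just shorthand for the three configurations exercised in this PR.

```csharp
using System;
using System.Runtime.Intrinsics.X86;

public static class IsaConfigReport
{
    // Maps the two support flags to a short configuration label.
    // Avx2 implies Avx, so Avx2 is checked first.
    public static string Classify(bool avx, bool avx2) =>
        avx2 ? "avx2" : avx ? "avx" : "noavx";

    public static void Main()
    {
        // Show the knobs as seen by this process.
        foreach (string name in new[] { "COMPlus_EnableAVX", "COMPlus_EnableAVX2" })
        {
            string value = Environment.GetEnvironmentVariable(name) ?? "(not set)";
            Console.WriteLine($"{name} = {value}");
        }

        // IsSupported reflects both the hardware and the knobs above.
        Console.WriteLine($"Avx.IsSupported  = {Avx.IsSupported}");
        Console.WriteLine($"Avx2.IsSupported = {Avx2.IsSupported}");
        Console.WriteLine($"Configuration: {Classify(Avx.IsSupported, Avx2.IsSupported)}");
    }
}
```

Running the program once with no knobs set, once with COMPlus_EnableAVX2=0, and once with COMPlus_EnableAVX=0 should show the reported configuration change on a machine whose hardware supports Avx2.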

I generated all possible variations of Vector128/256<T>.AllBitsSet and Vector128/256.Create((T)-1), where T in the latter case is a signed integer type (see Runtime_48283.cs), and manually validated that the JIT doesn't emit the forbidden instructions.
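The following is a rough sketch of what such a set of variations could look like. It is not the content of Runtime_48283.cs; the class and helper names are hypothetical. It checks the resulting values rather than the emitted instructions, so it complements the manual inspection of JIT output and does not replace it.

```csharp
using System;
using System.Runtime.Intrinsics;

public static class AllBitsSetVariations
{
    // True when every byte of the 128-bit vector is 0xFF.
    public static bool IsAllOnes<T>(Vector128<T> v) where T : struct
    {
        Vector128<byte> bytes = v.AsByte();
        for (int i = 0; i < Vector128<byte>.Count; i++)
            if (bytes.GetElement(i) != 0xFF) return false;
        return true;
    }

    // True when every byte of the 256-bit vector is 0xFF.
    public static bool IsAllOnes<T>(Vector256<T> v) where T : struct
    {
        Vector256<byte> bytes = v.AsByte();
        for (int i = 0; i < Vector256<byte>.Count; i++)
            if (bytes.GetElement(i) != 0xFF) return false;
        return true;
    }

    // A subset of element types; the remaining ones follow the same pattern.
    public static bool CheckAllBitsSet() =>
        IsAllOnes(Vector128<sbyte>.AllBitsSet) && IsAllOnes(Vector256<sbyte>.AllBitsSet) &&
        IsAllOnes(Vector128<short>.AllBitsSet) && IsAllOnes(Vector256<short>.AllBitsSet) &&
        IsAllOnes(Vector128<int>.AllBitsSet) && IsAllOnes(Vector256<int>.AllBitsSet) &&
        IsAllOnes(Vector128<long>.AllBitsSet) && IsAllOnes(Vector256<long>.AllBitsSet) &&
        IsAllOnes(Vector128<float>.AllBitsSet) && IsAllOnes(Vector256<float>.AllBitsSet) &&
        IsAllOnes(Vector128<double>.AllBitsSet) && IsAllOnes(Vector256<double>.AllBitsSet);

    // Create((T)-1) for each signed integer type T.
    public static bool CheckCreateMinusOne() =>
        IsAllOnes(Vector128.Create((sbyte)-1)) && IsAllOnes(Vector256.Create((sbyte)-1)) &&
        IsAllOnes(Vector128.Create((short)-1)) && IsAllOnes(Vector256.Create((short)-1)) &&
        IsAllOnes(Vector128.Create(-1)) && IsAllOnes(Vector256.Create(-1)) &&
        IsAllOnes(Vector128.Create(-1L)) && IsAllOnes(Vector256.Create(-1L));

    public static void Main()
    {
        Console.WriteLine($"AllBitsSet variations OK: {CheckAllBitsSet()}");
        Console.WriteLine($"Create(-1) variations OK: {CheckCreateMinusOne()}");
    }
}
```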

You can find below the JIT output for the following three cases, for both x86 and x64: noavx (simulated with COMPlus_EnableAVX=0), avx (simulated with COMPlus_EnableAVX2=0), and avx2 (no ISAs disabled).

After the change, the JIT will emit vcmptrueps ymmReg, ymmReg, ymmReg for such cases. This is similar to what Clang does - https://godbolt.org/z/erd5sf.
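For readers who want to see the same idea at the API level, here is a sketch that builds an all-ones 256-bit vector using only Avx. It assumes that Avx.Compare with FloatComparisonMode.UnorderedTrueNonSignaling corresponds to the always-true compare predicate behind vcmptrueps; that mapping is my understanding and is not stated in the PR. The class and method names are hypothetical, and this is user code, not the JIT change itself.

```csharp
using System;
using System.Runtime.Intrinsics;
using System.Runtime.Intrinsics.X86;

public static class AvxOnlyAllOnes
{
    public static Vector256<int> AllOnes()
    {
        if (Avx.IsSupported)
        {
            // An always-true floating-point compare sets every bit of
            // every lane, regardless of the operand values.
            Vector256<float> zero = Vector256<float>.Zero;
            Vector256<float> mask =
                Avx.Compare(zero, zero, FloatComparisonMode.UnorderedTrueNonSignaling);
            return mask.AsInt32();
        }

        // Without Avx, fall back to the portable factory method.
        return Vector256.Create(-1);
    }

    public static void Main()
    {
        Vector256<int> ones = AllOnes();
        Console.WriteLine($"Avx.IsSupported = {Avx.IsSupported}");
        Console.WriteLine($"All lanes are -1: {ones.Equals(Vector256.Create(-1))}");
    }
}
```

The compare needs only Avx, whereas the 256-bit integer compare vpcmpeqd needs Avx2, which is why the floating-point form is usable on Sandy Bridge and Ivy Bridge.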

Risk

Low. The change affects only code generation for the hardware intrinsics Vector128/256<int>.AllBitsSet, which were introduced in 5.0, and the JIT optimization of Vector256.Create(-1) that uses those intrinsics (also added in 5.0).

Regression

Yes. On 3.1, Vector256.Create(-1) did not use an Avx2 instruction.
There is no Vector128/256<int>.AllBitsSet API in 3.1.

Note

This change depends on #48613 being merged; that PR fixes another problem with the Vector256.Create(-1) optimization.
The problem is that, at the moment, the JIT lowers both Vector128.Create(-1) and Vector256.Create(-1) to Vector128<T>.AllBitsSet without taking into account the vector size (see src/coreclr/src/jit/lowerxarch.cpp

…ector256<T>.AllBitsSet` intrinsic when Avx2 is not supported. Emit `vcmptrueps ymmReg, ymmReg, ymmReg` instead
@echesakov echesakov added arch-x86 arch-x64 area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI labels Feb 22, 2021
@echesakov
Contributor Author

@dotnet/jit-contrib @tannergooding PTAL

@echesakov echesakov self-assigned this Feb 23, 2021
@echesakov echesakov added this to the 5.0.x milestone Feb 23, 2021
@echesakov
Contributor Author

Can I have sign-off from someone on @dotnet/jit-contrib please?

Contributor

@briansull briansull left a comment

Looks Good

@echesakov echesakov added the Servicing-consider Issue for next servicing release review label Mar 1, 2021
@echesakov
Contributor Author

cc @jeffschwMSFT

Member

@jeffschwMSFT jeffschwMSFT left a comment

Approved. We will take for consideration in .NET 5.0.x

@leecow leecow added Servicing-approved Approved for servicing release and removed Servicing-consider Issue for next servicing release review labels Mar 2, 2021
@leecow leecow modified the milestones: 5.0.x, 5.0.5 Mar 2, 2021
@Anipik Anipik merged commit b59f453 into dotnet:release/5.0 Mar 10, 2021
@ghost ghost locked as resolved and limited conversation to collaborators Apr 9, 2021
@echesakov echesakov deleted the Rel5.0-Runtime_48283 branch April 13, 2021 22:19


Development

Successfully merging this pull request may close these issues.

Vector256<Integer>.AllBitsSet emits illegal instructions on a machine without AVX2

7 participants