[RyuJIT] Change VEX-encoding selection to avoid AVX-SSE transition penalties #8966

@fiigii

Currently, RyuJIT generates AVX instructions (VEX-encoding) only when AVX2 is available:

  • Floating-point calculations and SIMD code use VEX-encoding instructions on AVX2-capable machines (Haswell and above).
  • SIMD vectors (System.Numerics.Vector<T>) are 256 bits wide (YMM) on AVX2-capable machines and 128 bits wide (XMM) on machines that support only AVX (Sandy Bridge) or a lower ISA.

However, AVX instructions will be used broadly through the Intel hardware intrinsics even when the underlying hardware has no AVX2. With the current codegen strategy, mixing Avx intrinsics with floating-point calculations (or System.Numerics.Vectors) may therefore trigger AVX-SSE transition penalties on such machines. The new VEX-encoding selection strategy should be:

  • Floating-point calculations use VEX-encoding instructions on AVX-capable machines (Sandy Bridge and above).
  • SIMD vectors (System.Numerics.Vector<T>) are 256 bits wide (YMM) on AVX2-capable machines and 128 bits wide (XMM) on machines that support only AVX (Sandy Bridge) or a lower ISA. (no change)
  • SIMD code (System.Numerics.Vectors) is compiled to instructions that have the VEX.128 prefix and operate on XMM registers on machines that support AVX but not AVX2, and to instructions that have the VEX.256 prefix and operate on YMM registers on AVX2-capable machines.
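
The two strategies can be compared with a small sketch. This is not RyuJIT code: `IsaLevel`, `CodegenChoice`, `currentStrategy`, `proposedStrategy`, and `mayMixEncodings` are hypothetical names introduced only for illustration, and the model assumes that Avx intrinsics always emit VEX-encoded instructions while machines without AVX fall back to legacy SSE encoding.

```cpp
// Hypothetical model of the encoding selection described above.
// None of these names exist in RyuJIT; they are illustrative only.

// ISA levels, ordered from least to most capable.
enum class IsaLevel { Sse2, Avx, Avx2 };

struct CodegenChoice {
    bool floatUsesVex;  // scalar floating-point code uses VEX encoding
    bool simdUsesVex;   // System.Numerics.Vector<T> code uses VEX encoding
    int  vectorBits;    // width of Vector<T>: 128 (XMM) or 256 (YMM)
};

// Current behavior: VEX encoding is selected only when AVX2 is available.
CodegenChoice currentStrategy(IsaLevel isa) {
    if (isa == IsaLevel::Avx2) {
        return {true, true, 256};
    }
    return {false, false, 128};  // legacy SSE encoding, XMM registers
}

// Proposed behavior: VEX encoding is selected as soon as AVX is available;
// the width of Vector<T> is unchanged.
CodegenChoice proposedStrategy(IsaLevel isa) {
    switch (isa) {
        case IsaLevel::Avx2: return {true, true, 256};   // VEX.256 over YMM
        case IsaLevel::Avx:  return {true, true, 128};   // VEX.128 over XMM
        default:             return {false, false, 128}; // no VEX available
    }
}

// Returns true when VEX-encoded Avx intrinsics could be mixed with
// legacy-SSE-encoded floating-point or SIMD code in the same method,
// which is the situation that may trigger transition penalties.
bool mayMixEncodings(const CodegenChoice& choice, bool usesAvxIntrinsics) {
    return usesAvxIntrinsics && (!choice.floatUsesVex || !choice.simdUsesVex);
}
```

In this model, `mayMixEncodings(currentStrategy(IsaLevel::Avx), true)` is true, while the same call with `proposedStrategy` is false, which is the AVX-only (Sandy Bridge) case the issue is concerned with.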

I will provide a PR after finishing dotnet/coreclr#14020.

Labels

  • area-CodeGen-coreclr: CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI
  • enhancement: Product code improvement that does NOT require public API changes/additions
  • optimization
  • tenet-performance: Performance related issue
