There are quite a few front-end optimizations that RyuJIT can do on SIMD vector types:

- Constant vector propagation.
E.g. the back-end (lowerxarch.cpp) looks to see if an (in)equality operation is against a zero vector, by checking whether op2 is a zero vector. If front-end phases propagate constant vectors and make sure they appear as op2 in a comparison, that will increase the chance of the back-end making this optimization.
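As a concrete illustration, the source-level pattern this targets looks like the following (a minimal C# sketch, not JIT code; the class and method names are illustrative):

```csharp
using System;
using System.Numerics;

static class ZeroCompare
{
    // With the zero vector canonicalized into op2 of the comparison,
    // the back-end can recognize the compare-against-zero idiom instead
    // of materializing the constant vector.
    public static bool IsZero(Vector<int> v)
    {
        return v == Vector<int>.Zero; // zero vector as op2: optimizable
    }
}
```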
- CSE'ing of operations on SIMD types.
Right now the SetEvalOrder() costs need to be updated for SIMD types. Note that depending on the target, the SIMD vector size could be 16 bytes (on SSE2 machines) or 32 bytes (on AVX2 machines). Therefore, these costs cannot be static constants; they should be a function of the vector size.
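For example, the repeated subexpression below is the kind of SIMD CSE candidate in question (a sketch; names are illustrative):

```csharp
using System;
using System.Numerics;

static class SimdCse
{
    // (a * b) occurs twice. Whether CSE stores it in a temp and reuses
    // it depends on the SetEvalOrder() costs, which must scale with the
    // vector width (16 bytes on SSE2, 32 bytes on AVX2).
    public static Vector<float> Compute(Vector<float> a, Vector<float> b)
    {
        Vector<float> x = a * b + a;
        Vector<float> y = a * b - a; // same subexpression as above
        return x + y;                // algebraically 2 * a * b
    }
}
```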
When two successive indexed accesses of a SIMD vector take place, why doesn't RyuJIT optimize the vextractf/shift operations it generates? For example:

```csharp
offset = 2;
if (_vectorUlongSpan < 4 || (u = vector64[2]) == 0)
{
    offset = 3;
    if (_vectorUlongSpan < 4 || (u = vector64[3]) == 0)
    {
        // ...
    }
}
```
And the assembly it generates:

```asm
vextractf128 xmm1,ymm0,1
vmovd        rdi,xmm1
test         rdi,rdi
jne          00007FFF07225841
mov          esi,3
vextractf128 xmm1,ymm0,1
vpsrldq      ymm1,ymm1,8
vmovd        rdi,xmm1
```
One possible route is to expand SIMD vector index operations into Extract and Shift operations early in a front-end phase. The CSE phase could then evaluate the Extract operation into a temp and replace all further occurrences with that temp.
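Until such a phase exists, one source-level workaround (a sketch, assuming the Span-based CopyTo overload available on newer runtimes; `v` is a stand-in for the vector being indexed) is to spill the vector to a stack buffer once, so every element read becomes a plain memory load:

```csharp
using System;
using System.Numerics;

static class IndexTwice
{
    // Copying the vector to a stack buffer once means subsequent element
    // reads are ordinary loads, rather than each one re-doing
    // vextractf128 + vpsrldq.
    public static int FirstZeroElement(Vector<ulong> v)
    {
        Span<ulong> tmp = stackalloc ulong[Vector<ulong>.Count];
        v.CopyTo(tmp);
        for (int i = 0; i < tmp.Length; ++i)
        {
            if (tmp[i] == 0)
                return i;
        }
        return -1;
    }
}
```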
- Loop unrolling.
To iterate over individual vector elements, one uses:

```csharp
for (int i = 0; i < Vector<int>.Count; ++i)
{
    // ... = V[i]
}
```
- Elimination of GT_SIMD_CHK when the index is non-constant, in loops like the above.
Vector indexed access with a non-constant index results in writing the vector to memory and reading the required element back from memory. When the loop is small enough, it might be beneficial to unroll it so that SIMD vectors are indexed using constant indices. One such opportunity is in the Kestrel server's FindFirstEqualByte() method.
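As an illustration, a manually unrolled form with constant indices (a sketch assuming the vector has at least four elements; Vector<T>.Count varies by target):

```csharp
using System;
using System.Numerics;

static class Unrolled
{
    // Constant indices allow element reads via extract instructions,
    // instead of spilling the vector to memory and performing a
    // GT_SIMD_CHK range check for a non-constant index.
    public static long SumFirstFour(Vector<int> v)
    {
        // Assumes Vector<int>.Count >= 4, which holds on SSE2 and AVX2.
        return (long)v[0] + v[1] + v[2] + v[3];
    }
}
```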
- Loop hoisting of constant vectors
E.g.:

```csharp
// We can hoist the following constant vectors
// in the below loop: Vector<T>.One, Vector<T>.Zero,
// new Vector<T>(T val)
for (...)
{
    .. = Vector<int>.One + b;
    pi = new Vector<float>(3.1412);
    // ...
    if (x == Vector<long>.Zero)
        // ...
}
```
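The manually hoisted equivalent computes the invariant vector once outside the loop (a sketch; `b` and the loop bound `n` are placeholders):

```csharp
using System;
using System.Numerics;

static class Hoisted
{
    public static Vector<int> Accumulate(Vector<int> b, int n)
    {
        // Vector<int>.One is loop-invariant: materialize it once here
        // instead of re-creating it on every iteration.
        Vector<int> one = Vector<int>.One;
        Vector<int> acc = Vector<int>.Zero;
        for (int i = 0; i < n; ++i)
        {
            acc += one + b;
        }
        return acc;
    }
}
```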
- Promotion of structs containing SIMD type fields.
This is tracked by issue dotnet/coreclr#7508
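A sketch of the kind of struct meant here (a hypothetical type): with promotion, each Vector<T> field could live in its own XMM/YMM register instead of the whole struct living in memory.

```csharp
using System;
using System.Numerics;

// Hypothetical struct with SIMD-typed fields. Without promotion, every
// access to Min or Max goes through memory; with promotion each field
// can be enregistered independently.
struct Bounds
{
    public Vector<float> Min;
    public Vector<float> Max;

    public bool Contains(Vector<float> p)
    {
        return Vector.GreaterThanOrEqualAll(p, Min)
            && Vector.LessThanOrEqualAll(p, Max);
    }
}
```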
category:cq
theme:vector-codegen
skill-level:intermediate
cost:large
impact:medium