Optimize byte type comparisons #6861

@sivarv

Example 1:
The Kestrel server's hot Seek() method has the following doubly nested do-while loop:

                    do
                    {
                        if (*pCurrent == byte0)
                        {
                            _block = block;
                            _index = index;
                            return byte0;
                        }
                        if (*pCurrent == byte1)
                        {
                            _block = block;
                            _index = index;
                            return byte1;
                        }
                        if (*pCurrent == byte2)
                        {
                            _block = block;
                            _index = index;
                            return byte2;
                        }
                        pCurrent++;
                        index++;
                    } while (pCurrent != pEnd);

Here is the code generated for one of the "if-stmt" blocks:

IN00a6: 0002C6 movzx    rdx, byte  ptr [r8]
IN00a7: 0002CA mov      r9d, dword ptr [V12 rsp+68H]
IN00a8: 0002CF cmp      edx, r9d
IN00a9: 0002D2 jne      SHORT G_M39889_IG19
IN00aa: 0002D4 mov      rcx, rbp
IN00ab: 0002D7 mov      rdx, r14
IN00ac: 0002DA call     CORINFO_HELP_CHECKED_ASSIGN_REF
IN00ad: 0002DF mov      dword ptr [rbp+8], r15d
IN00ae: 0002E3 mov      r12d, dword ptr [V12 rsp+68H]
IN00af: 0002E8 mov      eax, r12d
IN00b0: 0002EB jmp      SHORT G_M39889_IG24

Here *pCurrent is a byte-typed GT_IND, and byte0/byte1/byte2 are the bytes obtained from a Vector using the SIMD Get[i] intrinsic.

There is an opportunity here to replace the first three instructions with

cmp byte ptr [r8], r9l

Example 2:
@benaadams has an open PR that updates Seek() to accept byte-typed params and construct the Vector within Seek():
aspnet/KestrelHttpServer#1138

Now byte0/byte1/byte2 are byte type args. Here is the jitted code for one of the "if-stmt" blocks post @benaadams's change:

G_M3344_IG30:
       0FB602               movzx    rax, byte  ptr [rdx]
       440FB6C7             movzx    r8, dil
       413BC0               cmp      eax, r8d
       7515                 jne      SHORT G_M3344_IG31
       498BCE               mov      rcx, r14
       498BD7               mov      rdx, r15
       E8C49C965F           call     CORINFO_HELP_CHECKED_ASSIGN_REF
       45896608             mov      dword ptr [r14+8], r12d
       400FB6C7             movzx    rax, dil
       EB6E                 jmp      SHORT G_M3344_IG36

Here too we have an opportunity to generate

cmp byte ptr [rdx], dil

The IR pattern here seems to be GT_EQ/NE(ubyte GT_IND, op2 = GT_CAST from ubyte to int).
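
As a sanity check on why this pattern is safe to narrow, here is a minimal C++ sketch (not JIT code; all function names are hypothetical). It compares two unsigned bytes once after zero-extending both to 32 bits, as the current codegen does, and once directly as bytes, and checks that the results agree for every pair of values. It also shows a case where the operand types do not match (one sign-extended, one zero-extended), where a byte-wise compare would give a different answer.

```cpp
#include <cstdint>

// What the JIT emits today: both bytes are zero-extended (movzx)
// and compared as 32-bit values.
bool equal_widened(std::uint8_t a, std::uint8_t b) {
    std::uint32_t wa = a;
    std::uint32_t wb = b;
    return wa == wb;
}

// The proposed form: compare the two bytes directly.
// (A C++ compiler may still promote internally; only the result matters here.)
bool equal_narrow(std::uint8_t a, std::uint8_t b) {
    return a == b;
}

// Exhaustively checks all 256 x 256 pairs of unsigned bytes.
bool narrowing_is_safe() {
    for (int a = 0; a < 256; ++a) {
        for (int b = 0; b < 256; ++b) {
            auto x = static_cast<std::uint8_t>(a);
            auto y = static_cast<std::uint8_t>(b);
            if (equal_widened(x, y) != equal_narrow(x, y)) {
                return false;
            }
        }
    }
    return true;
}

// Mismatched source types: a signed byte is sign-extended, an unsigned byte
// is zero-extended, so the widened compare can differ from a raw byte compare.
bool equal_mixed_widened(std::int8_t a, std::uint8_t b) {
    return static_cast<std::int32_t>(a) == static_cast<std::int32_t>(b);
}

// Compares only the 8-bit patterns, ignoring signedness.
bool equal_mixed_bits(std::int8_t a, std::uint8_t b) {
    return static_cast<std::uint8_t>(a) == b;
}
```

For the mismatched case, -1 and 255 share the bit pattern 0xFF but differ once widened, which is why the narrowing should only apply when both operands come from the same byte type.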

Instead of doing an IR pattern based optimization, we could think of a more generic approach to this issue along the following lines:

Rough algorithm:
Foreach node, bottom up:
    If the node is unary, op1 is a GT_CAST, and the cast's source (RHS) type is valid for the operation, then rewrite the operation in terms of the cast source and, if the operation produces a result, GT_CAST the result back to the original type.
    If the node is binary, either op1 or op2 is a GT_CAST, the cast source types match the operand types, and that type is valid for the operation, then rewrite the operation in terms of the source type and, if the operation produces a result, cast the result back to the original type.

This would push converts up the tree only in the cases where the narrower type is closed under the operation (no overflow, etc.), and we'd eliminate a lot of these extra converts.
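
The rough algorithm above can be sketched on a toy tree. This is an illustration only: `Node`, `Oper`, `Type`, `make`, `narrow` and the helper functions are all hypothetical and are not the JIT's GenTree API. The sketch handles only the binary case with unsigned-byte operands, treats EQ/NE as closed over bytes, and deliberately leaves ADD alone because a byte add can overflow. The compare's own result type stays int in this toy model, so no cast of the result is needed.

```cpp
#include <memory>
#include <utility>

// Toy stand-ins for JIT concepts; NOT the real GenTree types.
enum class Type { UByte, Int };
enum class Oper { Leaf, Cast, Eq, Ne, Add };

struct Node {
    Oper oper;
    Type type;                      // type of the value this node produces
    std::unique_ptr<Node> op1, op2; // operands (null when absent)
};
using NodePtr = std::unique_ptr<Node>;

NodePtr make(Oper o, Type t, NodePtr a = nullptr, NodePtr b = nullptr) {
    return NodePtr(new Node{o, t, std::move(a), std::move(b)});
}

// True if the operation gives the same answer on bytes as on widened ints.
bool closed_on_ubyte(Oper o) {
    return o == Oper::Eq || o == Oper::Ne;
}

// True if n is a ubyte value, or an int-typed cast of a ubyte value.
bool is_ubyte_source(const Node& n) {
    if (n.oper == Oper::Cast) {
        return n.type == Type::Int && n.op1 && n.op1->type == Type::UByte;
    }
    return n.type == Type::UByte;
}

// Drops a widening cast, returning its operand; other nodes pass through.
NodePtr strip_cast(NodePtr n) {
    if (n->oper == Oper::Cast) {
        return std::move(n->op1);
    }
    return n;
}

// Bottom-up pass: visit the operands first, then try to narrow this node.
NodePtr narrow(NodePtr n) {
    if (!n) {
        return n;
    }
    n->op1 = narrow(std::move(n->op1));
    n->op2 = narrow(std::move(n->op2));

    bool binary = n->op1 && n->op2;
    if (binary && closed_on_ubyte(n->oper) &&
        is_ubyte_source(*n->op1) && is_ubyte_source(*n->op2) &&
        (n->op1->oper == Oper::Cast || n->op2->oper == Oper::Cast)) {
        // Both operands are really bytes: compare them as bytes.
        n->op1 = strip_cast(std::move(n->op1));
        n->op2 = strip_cast(std::move(n->op2));
    }
    return n;
}
```

Applied to EQ(ubyte leaf, CAST int <- ubyte leaf), the pass removes the cast so both operands are ubyte; applied to ADD of two such casts, it leaves the tree unchanged.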

category:cq
theme:basic-cq
skill-level:intermediate
cost:medium

Labels: area-CodeGen-coreclr, enhancement, optimization, tenet-performance