JIT - slow generated code on Release for iterating simple array of struct #4659

@krwq

Description

Byte array iteration code:

for (int i = 0; i < Iterations; i++)
{
    unchecked
    {
        for (int j = 0; j < ArraySize; j++)
        {
            arr[j] = 123;
        }
    }
}

Struct iteration code:

[StructLayout(LayoutKind.Sequential, Pack = 1, Size = 1)]
struct MyByte
{
    private readonly byte _byte;
    public MyByte(byte b)
    {
        _byte = b;
    }

    public byte Value
    {
        [MethodImpl(MethodImplOptions.AggressiveInlining)]
        get { return _byte; }
    }
}
// ...
for (int i = 0; i < Iterations; i++)
{
    unchecked
    {
        for (int j = 0; j < ArraySize; j++)
        {
            arr[j] = new MyByte(123);
        }
    }
}
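The `StructLayout` attribute above asks for a one-byte layout, so a `MyByte[]` should be laid out like a `byte[]`. A minimal sketch for checking that, assuming `Marshal.SizeOf<T>()` is available (.NET Framework 4.5.1 or later); the `SizeCheck` class and `MarshaledSize` method are hypothetical names introduced only for illustration, and this reports the marshaled size rather than inspecting the JIT's view of the type directly:

```csharp
using System;
using System.Runtime.CompilerServices;
using System.Runtime.InteropServices;

[StructLayout(LayoutKind.Sequential, Pack = 1, Size = 1)]
struct MyByte
{
    private readonly byte _byte;
    public MyByte(byte b)
    {
        _byte = b;
    }

    public byte Value
    {
        [MethodImpl(MethodImplOptions.AggressiveInlining)]
        get { return _byte; }
    }
}

// Hypothetical helper, not part of the original repro.
static class SizeCheck
{
    // Marshaled (unmanaged) size of MyByte in bytes.
    public static int MarshaledSize()
    {
        return Marshal.SizeOf<MyByte>();
    }

    public static void Main()
    {
        Console.WriteLine("MyByte size: " + MarshaledSize());
        Console.WriteLine("byte size:   " + sizeof(byte));
    }
}
```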

Generated code (byte array):

                    for (int j = 0; j < ArraySize; j++)
00523003  xor         edx,edx  
00523005  mov         ecx,dword ptr [esi+4]  
                    {
                        arr[j] = 123;
00523008  cmp         edx,ecx  
0052300A  jae         0052304F  
0052300C  mov         byte ptr [esi+edx+8],7Bh  
                    for (int j = 0; j < ArraySize; j++)
00523011  inc         edx  
00523012  cmp         edx,0F4240h  
00523018  jl          00523008  
            for (int i = 0; i < Iterations; i++)
0052301A  inc         edi  
0052301B  cmp         edi,7D0h  
00523021  jl          00523003  
                    }

Generated code (struct). I expected code similar to the byte array case: this is about the simplest optimization you can think of here, and I even specified the struct size and marked the field readonly to make it easier:

                    for (int j = 0; j < ArraySize; j++)
00522F0B  xor         edx,edx  
00522F0D  mov         ecx,dword ptr [edi+4]  
00522F10  mov         ebx,7Bh  
                    {
                        arr[j] = new MyByte(123);
00522F15  cmp         edx,ecx  
00522F17  jae         00522F63  
00522F19  lea         esi,[edi+edx+8]  
00522F1D  mov         byte ptr [esi],bl  
                    for (int j = 0; j < ArraySize; j++)
00522F1F  inc         edx  
00522F20  cmp         edx,0F4240h  
00522F26  jl          00522F10  
            for (int i = 0; i < Iterations; i++)
00522F28  inc         dword ptr [ebp-14h]  
00522F2B  cmp         dword ptr [ebp-14h],7D0h  
00522F32  jl          00522F0B  
                    }

The struct version is more than 30% slower, and this is just the simplest case.
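For anyone who wants to reproduce the timing difference, here is a rough sketch of a harness built around the two loops. The constants are inferred from the immediates in the disassembly (7D0h = 2000, 0F4240h = 1000000). The `Repro` class, the `FillBytes` / `FillStructs` / `Time` methods and the `Stopwatch`-based timing are assumptions added for illustration and are not the original measurement setup; moving the loops into separate methods may also change what the JIT generates, so check the disassembly of a Release build rather than trusting the timings alone.

```csharp
using System;
using System.Diagnostics;
using System.Runtime.CompilerServices;
using System.Runtime.InteropServices;

[StructLayout(LayoutKind.Sequential, Pack = 1, Size = 1)]
struct MyByte
{
    private readonly byte _byte;
    public MyByte(byte b)
    {
        _byte = b;
    }

    public byte Value
    {
        [MethodImpl(MethodImplOptions.AggressiveInlining)]
        get { return _byte; }
    }
}

// Hypothetical harness, not the original measurement code.
static class Repro
{
    // Inferred from the disassembly: 7D0h and 0F4240h.
    public const int Iterations = 2000;
    public const int ArraySize = 1000000;

    // Byte array loop from the report.
    public static void FillBytes(byte[] arr)
    {
        for (int i = 0; i < Iterations; i++)
        {
            for (int j = 0; j < ArraySize; j++)
            {
                arr[j] = 123;
            }
        }
    }

    // Struct array loop from the report.
    public static void FillStructs(MyByte[] arr)
    {
        for (int i = 0; i < Iterations; i++)
        {
            for (int j = 0; j < ArraySize; j++)
            {
                arr[j] = new MyByte(123);
            }
        }
    }

    // Times a single invocation of the given action.
    public static TimeSpan Time(Action action)
    {
        Stopwatch sw = Stopwatch.StartNew();
        action();
        sw.Stop();
        return sw.Elapsed;
    }

    public static void Main()
    {
        byte[] bytes = new byte[ArraySize];
        MyByte[] structs = new MyByte[ArraySize];

        // One untimed call each, so JIT compilation is not measured.
        FillBytes(bytes);
        FillStructs(structs);

        Console.WriteLine("byte[]:   " + Time(() => FillBytes(bytes)).TotalMilliseconds + " ms");
        Console.WriteLine("MyByte[]: " + Time(() => FillStructs(structs)).TotalMilliseconds + " ms");
    }
}
```

Build in Release and run without a debugger attached. In the listings above, the struct loop differs from the byte loop in three visible ways: the inner loop jumps back to `mov ebx,7Bh` and so reloads the constant on every iteration, the store goes through an extra `lea`, and the outer counter lives on the stack at `[ebp-14h]` instead of in a register.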

category:cq
theme:assertion-prop
skill-level:expert
cost:extra-large

Labels: JitUntriaged, area-CodeGen-coreclr, enhancement, optimization, tenet-performance
