Skip to content

Suboptimal codegen for Enum.HasFlag(A) || Enum.HasFlag(B) #93954

@gdkchan

Description

@gdkchan

Description

Checking if multiple flags are set using Enum.HasFlag currently produces suboptimal codegen.

Consider the following C# code:

    [Flags]
    public enum TestFlags
    {
        FlagA = 1 << 0,
        FlagB = 1 << 1,
    }

    public static bool HasEither(TestFlags flags)
    {
        return flags.HasFlag(TestFlags.FlagA) || flags.HasFlag(TestFlags.FlagB);
    }

    public static bool HasEither2(TestFlags flags)
    {
        return (flags & TestFlags.FlagA) != 0 || (flags & TestFlags.FlagB) != 0;
    }

    public static bool HasEither3(TestFlags flags)
    {
        return (flags & (TestFlags.FlagA | TestFlags.FlagB)) != 0;
    }

On .NET 8, each method seems to have different code, with HasEither3 being the most optimized.

C:HasEither(int):ubyte (FullOpts):
       test     dil, 1
       jne      SHORT G_M24457_IG05
       xor      eax, eax
       test     dil, 2
       setne    al
       ret      
G_M24457_IG05:
       mov      eax, 1
       ret      

C:HasEither2(int):ubyte (FullOpts):
       mov      eax, edi
       and      eax, 1
       and      edi, 2
       or       eax, edi
       setne    al
       movzx    rax, al
       ret      

C:HasEither3(int):ubyte (FullOpts):
       xor      eax, eax
       test     dil, 3
       setne    al
       ret   

Ideally, all 3 would produce code identical to that of HasEither3.
Interestingly, it seems .NET 7 produces the same code for HasEither and HasEither2, but .NET 8 manages to produce branchless code for HasEither2 (but not for the first one for some reason).

Another variant of this is checking if both flags are set, rather than if either one of them is set:

    public static bool HasBoth(TestFlags flags)
    {
        return flags.HasFlag(TestFlags.FlagA) && flags.HasFlag(TestFlags.FlagB);
    }

    public static bool HasBoth2(TestFlags flags)
    {
        return (flags & TestFlags.FlagA) != 0 && (flags & TestFlags.FlagB) != 0;
    }

    public static bool HasBoth3(TestFlags flags)
    {
        return (flags & (TestFlags.FlagA | TestFlags.FlagB)) == (TestFlags.FlagA | TestFlags.FlagB);
    }

This produces the same code for HasBoth and HasBoth2, which is slightly worse than HasBoth3:

C:HasBoth(int):ubyte (FullOpts):
       test     dil, 1
       je       SHORT G_M13023_IG05
       xor      eax, eax
       test     dil, 2
       setne    al
       ret      
G_M13023_IG05:
       xor      eax, eax
       ret      

C:HasBoth2(int):ubyte (FullOpts):
       test     dil, 1
       je       SHORT G_M55757_IG05
       xor      eax, eax
       test     dil, 2
       setne    al
       ret      
G_M55757_IG05:
       xor      eax, eax
       ret      

C:HasBoth3(int):ubyte (FullOpts):
       and      edi, 3
       xor      eax, eax
       cmp      edi, 3
       sete     al
       ret      

This can be increased with many flags, in which case it keeps adding more branches and increasing the size of the first 2 methods, while the size of the 3rd one remains unchanged (as it only needs to test with a single constant).

This has been verified on Godbolt Compiler Explorer: https://godbolt.org/z/oWhcYd8zr

Configuration

Seems to happen on .NET 7 and .NET 8 on x64 on Windows.

Regression?

No, AFAIK RyuJIT was never able to optimize this code pattern.

Metadata

Metadata

Assignees

No one assigned

    Labels

    area-CodeGen-coreclrCLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMItenet-performancePerformance related issue

    Type

    No type

    Projects

    No projects

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions