-
Notifications
You must be signed in to change notification settings - Fork 5.3k
Description
Description
Checking if multiple flags are set using Enum.HasFlag currently produces suboptimal codegen.
Consider the following C# code:
[Flags]
public enum TestFlags
{
FlagA = 1 << 0,
FlagB = 1 << 1,
}
public static bool HasEither(TestFlags flags)
{
return flags.HasFlag(TestFlags.FlagA) || flags.HasFlag(TestFlags.FlagB);
}
public static bool HasEither2(TestFlags flags)
{
return (flags & TestFlags.FlagA) != 0 || (flags & TestFlags.FlagB) != 0;
}
public static bool HasEither3(TestFlags flags)
{
return (flags & (TestFlags.FlagA | TestFlags.FlagB)) != 0;
}On .NET 8, each method seems to have different code, with HasEither3 being the most optimized.
C:HasEither(int):ubyte (FullOpts):
test dil, 1
jne SHORT G_M24457_IG05
xor eax, eax
test dil, 2
setne al
ret
G_M24457_IG05:
mov eax, 1
ret
C:HasEither2(int):ubyte (FullOpts):
mov eax, edi
and eax, 1
and edi, 2
or eax, edi
setne al
movzx rax, al
ret
C:HasEither3(int):ubyte (FullOpts):
xor eax, eax
test dil, 3
setne al
ret Ideally, all 3 would produce code identical to that of HasEither3.
Interestingly, it seems .NET 7 produces the same code for HasEither and HasEither2, but .NET 8 manages to produce branchless code for HasEither2 (but not for the first one for some reason).
Another variant of this is checking if both flags are set, rather than if either one of them is set:
public static bool HasBoth(TestFlags flags)
{
return flags.HasFlag(TestFlags.FlagA) && flags.HasFlag(TestFlags.FlagB);
}
public static bool HasBoth2(TestFlags flags)
{
return (flags & TestFlags.FlagA) != 0 && (flags & TestFlags.FlagB) != 0;
}
public static bool HasBoth3(TestFlags flags)
{
return (flags & (TestFlags.FlagA | TestFlags.FlagB)) == (TestFlags.FlagA | TestFlags.FlagB);
}This produces the same code for HasBoth and HasBoth2, which is slightly worse than HasBoth3:
C:HasBoth(int):ubyte (FullOpts):
test dil, 1
je SHORT G_M13023_IG05
xor eax, eax
test dil, 2
setne al
ret
G_M13023_IG05:
xor eax, eax
ret
C:HasBoth2(int):ubyte (FullOpts):
test dil, 1
je SHORT G_M55757_IG05
xor eax, eax
test dil, 2
setne al
ret
G_M55757_IG05:
xor eax, eax
ret
C:HasBoth3(int):ubyte (FullOpts):
and edi, 3
xor eax, eax
cmp edi, 3
sete al
ret This can be increased with many flags, in which case it keeps adding more branches and increasing the size of the first 2 methods, while the size of the 3rd one remains unchanged (as it only needs to test with a single constant).
This has been verified on Godbolt Compiler Explorer: https://godbolt.org/z/oWhcYd8zr
Configuration
Seems to happen on .NET 7 and .NET 8 on x64 on Windows.
Regression?
No, AFAIK RyuJIT was never able to optimize this code pattern.