Description
Small demonstration of the issue:
```csharp
struct S {
    public int Prop { get => 42; }
    public S(string a) { }
}

public class Test {
    public int Fast() {
        return new S("dummy").Prop;
        // JIT output:
        // mov eax, 0x2a
        // ret
    }

    public int Slow() {
        return new S().Prop;
        // JIT output:
        // push rax
        // xor eax, eax
        // mov [rsp], rax
        // lea rax, [rsp]
        // mov byte ptr [rax], 0
        // mov eax, 0x2a
        // add rsp, 8
        // ret
    }
}
```

Paradoxically, calling a struct constructor is significantly faster than zero-initializing the struct (using either new S() or default(T)). However, calling the constructor is not so easy when S is a generic type argument.
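To illustrate why the constructor workaround does not carry over to generic code, here is a minimal sketch. The names `IConst`, `FortyTwo`, and `GenericInit` are hypothetical and exist only for this example. Inside a generic method the only ways to obtain an instance of `T` are `new T()` and `default(T)`; passing constructor arguments is rejected by the compiler.

```csharp
using System;

// Hypothetical interface and struct, used only for illustration.
interface IConst
{
    int Value { get; }
}

struct FortyTwo : IConst
{
    public int Value => 42;
}

static class GenericInit
{
    // The struct constraint permits new T() and makes default(T) safe to dereference.
    public static int ViaNew<T>() where T : struct, IConst
    {
        // return new T("dummy").Value;  // does not compile: a type parameter
        //                               // cannot be constructed with arguments
        return new T().Value;
    }

    public static int ViaDefault<T>() where T : struct, IConst
    {
        return default(T).Value;
    }
}
```

Both methods return the same value for `FortyTwo`; the issue is only about the machine code the JIT emits for the zero-initialization.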
Obviously, it's not a huge deal, only a few more instructions. However, it seems like fairly low-hanging fruit, and since initializing structs is a common operation, fixing it could have some impact.
Configuration
.NET 6 x64; can be reproduced in SharpLab.
More context
Note that replacing struct S with class S also solves the issue, so, surprisingly, a class is faster than a struct 🙄
We discovered this issue when trying to implement fixed point numbers parametrized by a shift constant. We need to make the parameter a struct, as reference type parameters are not specialized. Later we discovered that using the new() constraint produces better code on .NET 6, even though it creates the instance using Activator. That solution is far from ideal, though, because it performs very poorly on the old .NET Framework.
```csharp
interface IFixedTParams {
    int Shift { get; }
}

struct Q8_24 : IFixedTParams {
    int IFixedTParams.Shift { get => 24; }
}

// [JitGeneric(typeof(Q8_24))]
class Fixed<TParams> where TParams : IFixedTParams, new() {
    private readonly int rawValue;

    public int ToIntFast() {
        int shift = new TParams().Shift;
        return rawValue >> shift;
    }

    public int ToIntSlow() {
        int shift = default(TParams).Shift;
        return rawValue >> shift;
    }
}
```

Since the whole point of doing fixed point arithmetic is to be a bit faster than floating point math, you can see why these few extra copy instructions are a problem in this use case.
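For readers unfamiliar with the pattern, here is a small self-contained sketch of how a shift-parametrized fixed point value round-trips an integer. `IShiftParams`, `Q8_24Sketch`, `FixedSketch`, and `FromInt` are hypothetical names made up for this example and are not part of our actual code; the point is only that every conversion reads the shift from a zero-initialized `TParams`, so that initialization sits on the hot path.

```csharp
using System;

// Hypothetical parameter interface: supplies the number of fractional bits.
interface IShiftParams
{
    int Shift { get; }
}

// Hypothetical Q8.24 format: 8 integer bits, 24 fractional bits.
struct Q8_24Sketch : IShiftParams
{
    public int Shift => 24;
}

readonly struct FixedSketch<TParams> where TParams : struct, IShiftParams
{
    private readonly int rawValue;

    private FixedSketch(int raw)
    {
        rawValue = raw;
    }

    // Stores the integer in the upper bits; assumes the value fits the format.
    public static FixedSketch<TParams> FromInt(int value)
    {
        return new FixedSketch<TParams>(value << default(TParams).Shift);
    }

    // Arithmetic right shift drops the fractional bits.
    public int ToInt()
    {
        return rawValue >> default(TParams).Shift;
    }
}
```

Ideally the JIT would fold `default(TParams).Shift` to the constant 24 for the `Q8_24Sketch` instantiation, leaving a single shift instruction per conversion.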