Many x86 ISA extensions follow a hierarchy of support. If SSE2 is supported, then SSE is implicitly supported. Similarly, if AVX2 is supported, then AVX is supported as well.
We should design our HW Intrinsics classes in a similar fashion. This simplifies code that calls functions from two or more instruction sets:
if (Sse3.IsSupported)
{
    var v1 = Sse3.SetAllVector128(0.5f); // comes from SSE
    var v2 = Sse3.HorizontalAdd(v1, v1); // comes from SSE3
}
This means we should have our classes inherit from each other to model this inheritance in the ISAs:
public class Sse
{ ...
}
public class Sse2 : Sse
{ ...
}
public class Sse3 : Sse2
{ ...
}
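As a minimal sketch of how this could work in C#: static members of a base class are reachable through a derived class name, so `Sse3.SetAllVector128` binds to the method declared on `Sse`. The classes below are hypothetical stand-ins with placeholder bodies (arrays instead of vector types, a hard-coded `IsSupported`), not the real intrinsics. The `new` modifier on `IsSupported` is an assumption about how each class would report its own support level.

```csharp
using System;

// Hypothetical stand-ins for the proposed classes; bodies are placeholders.
public class Sse
{
    public static bool IsSupported => true; // placeholder for a real CPUID-based check

    // Models broadcasting one float to all four lanes.
    public static float[] SetAllVector128(float value) =>
        new[] { value, value, value, value };
}

public class Sse2 : Sse
{
    // 'new' hides Sse.IsSupported so each ISA can report its own support.
    public new static bool IsSupported => Sse.IsSupported;
}

public class Sse3 : Sse2
{
    public new static bool IsSupported => Sse2.IsSupported;

    // Models haddps: [a0 + a1, a2 + a3, b0 + b1, b2 + b3].
    public static float[] HorizontalAdd(float[] a, float[] b) =>
        new[] { a[0] + a[1], a[2] + a[3], b[0] + b[1], b[2] + b[3] };
}

public static class Demo
{
    public static float[] Run()
    {
        if (!Sse3.IsSupported) return Array.Empty<float>();
        var v1 = Sse3.SetAllVector128(0.5f); // resolves to Sse.SetAllVector128
        return Sse3.HorizontalAdd(v1, v1);   // declared on Sse3
    }
}
```

A single `Sse3.IsSupported` check then covers calls that originate in any of the three classes.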
The IsSupported hierarchy checks are documented in the Intel SDM (https://software.intel.com/en-us/articles/intel-sdm), for example:
12.4.4 Programming SSE3 with SSE/SSE2 Extensions
SIMD instructions in SSE3 extensions are intended to complement the use of SSE/SSE2 in programming SIMD applications. Application software that intends to use SSE3 instructions should also check for the availability of SSE/SSE2 instructions.
The hierarchy is also checked in the .NET runtime: https://github.com/dotnet/coreclr/blob/6bf04a47badd74646e21e70f4e9267c71b7bfd08/src/vm/codeman.cpp#L1307.
One open question remains: what should AVX inherit from? Should it inherit from SSE/SSE2/SSE4.1/SSE4.2? The Intel docs don't seem to indicate that any SSE level is necessary to support AVX, but the .NET runtime will only support AVX if SSE2 is present.
NOTE: Taking this proposal would mean dropping static class from all our classes, since a static class cannot take part in an inheritance hierarchy.
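One possible way to keep the classes non-instantiable after dropping `static` is to make them `abstract` with an `internal` constructor. This is a sketch under that assumption, with hypothetical placeholder bodies, and not a statement of the final design:

```csharp
// Hypothetical sketch: 'abstract' prevents instantiation, and the internal
// constructor prevents code outside the declaring assembly from deriving.
public abstract class Sse
{
    internal Sse() { }

    public static bool IsSupported => true; // placeholder for a real check
}

public abstract class Sse2 : Sse
{
    internal Sse2() { }

    // Hides the inherited property so Sse2 reports its own support level.
    public new static bool IsSupported => Sse.IsSupported;
}
```

Callers still use the classes exactly as they would a static class, for example `Sse2.IsSupported`, because only static members are exposed.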
For the original problem of SetZeroVector128, we have the following options:
- Explode the method out. Since there is no argument to overload on, we would have to append the type name to each method: SetZeroVector128Float, SetZeroVector128Double, SetZeroVector128Int32, etc.
- Leave the API as it is, but add support for Sse2.SetZeroVector128<float>(), which would match the behavior of Sse.SetZeroVector128().
- Since it is a helper and not a real intrinsic, move the API to another class. Other helper methods, like SetAllVector128, SetVector128, etc., could follow suit.
  - Potentially Vector128<T>.Zero, new Vector128<T>(), or default.
- Expose a generic SetZeroVector128 in the Sse class and generate different code depending on the underlying hardware support:
  - SSE: emit xorps for all base types, which provides correct semantics but may incur a small data-bypass penalty on integer base types.
  - SSE2: emit xorpd, xorps, or pxor for different base types.
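The last option can be modeled with a small selection function. This is an illustrative sketch only: `ZeroingSketch` and `SelectZeroingInstruction` are hypothetical names, the function returns a mnemonic string rather than emitting machine code, and the mapping of float to xorps, double to xorpd, and every other type to pxor is an assumption about what "different base types" means above.

```csharp
using System;

// Hypothetical model of a generic SetZeroVector128<T> whose code generation
// depends on the available ISA. Returns the mnemonic that might be chosen.
public static class ZeroingSketch
{
    public static string SelectZeroingInstruction<T>(bool sse2Supported) where T : struct
    {
        // SSE only: xorps is used for every base type.
        if (!sse2Supported) return "xorps";

        // SSE2: pick the instruction matching the base type (assumed mapping).
        if (typeof(T) == typeof(float)) return "xorps";
        if (typeof(T) == typeof(double)) return "xorpd";

        // Assumed: any remaining T is an integer base type.
        return "pxor";
    }
}
```

For example, `ZeroingSketch.SelectZeroingInstruction<int>(sse2Supported: false)` yields "xorps", which is the case where the data-bypass penalty mentioned above could apply.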
Original proposal:
This was discussed during the API design (dotnet/corefx#23489 (comment)) but has come under discussion again in the context of dotnet/coreclr#17691.
A related issue, if the APIs remain separate, is how to deal with an API such as SetZeroVector128, which supports only float vectors in SSE but the full range of generic types in SSE2.