[Arm64] Initial HW intrinsic framework#15833
Conversation
| Vector64IntHandle = nullptr; | ||
| Vector64ShortHandle = nullptr; | ||
| Vector64ByteHandle = nullptr; | ||
| Vector64LongHandle = nullptr; |
There was a problem hiding this comment.
Removed the dimunitive cases where Vector64<T> == T
| #endif // _TARGET_XARCH_ | ||
| #ifdef _TARGET_ARM64_ | ||
| NamedIntrinsic lookupHWIntrinsic(const char* className, const char* methodName); | ||
| GenTree* impHWIntrinsic(NamedIntrinsic intrinsic, |
There was a problem hiding this comment.
It would be nice for X86 to use this too, but I left that for a separate PR.
| if (jitFlags.IsSet(JitFlags::flag)) \ | ||
| opts.setSupportedISA(InstructionSet_##isa); | ||
| #include "hwintrinsiclistArm64.h" | ||
|
|
| return retNode; | ||
| } | ||
|
|
||
| CORINFO_CLASS_HANDLE Compiler::gtGetStructHandleForHWSIMD(var_types simdType, var_types simdBaseType) |
There was a problem hiding this comment.
I think this could be merged with X86's copy and moved to a shared file. Saved for later
| // Return Value: | ||
| // a gtNewMustThrowException if mustExpand is true; otherwise, nullptr | ||
| // | ||
| GenTree* Compiler::impUnsupportedHWIntrinsic(unsigned helper, |
There was a problem hiding this comment.
This is an exact copy of X86's.
I think this could be merged with X86's copy and moved to a shared file. Saved for later
| CORINFO_METHOD_HANDLE method, | ||
| CORINFO_SIG_INFO* sig, | ||
| bool mustExpand); | ||
| GenTree* impUnsupportedHWIntrinsic(unsigned helper, |
There was a problem hiding this comment.
The signature looks identical between the two, can we share it?
There was a problem hiding this comment.
It should move to a shared file, but I was goiing to make that a separate PR. So it could be reviewed faster.
| if ((namespaceName != nullptr) && strcmp(namespaceName, "System.Runtime.Intrinsics.Arm.Arm64") == 0) | ||
| { | ||
| result = lookupHWIntrinsic(className, methodName); | ||
| } |
There was a problem hiding this comment.
We could agree on a common interface here. Move namespace name inside.
lookupHWIntrinsic(namespaceName, className, methodName)
This allows IsSupported and Platform not supported exceptions to be handled in JIT.
There was a problem hiding this comment.
This could be converted into switch this way as well
There was a problem hiding this comment.
Ex. start with System.Runtime.Intrinsics, switch on X86, Arm, and than next switch statements if possible,
There was a problem hiding this comment.
I was thinking we should eventually be using an unordered/hash dictionary lookup
There was a problem hiding this comment.
Actually there is one preference - to get it as fast as possible. @tannergooding and @fiigii were thinking about dictionary or switch, whichever would better work here. From this what I see it could be split into couple of switches or as you suggest lookup could be dictionary based.
| #endif // _TARGET_XARCH_ | ||
| #ifdef _TARGET_ARM64_ | ||
| NamedIntrinsic lookupHWIntrinsic(const char* className, const char* methodName); | ||
| GenTree* impHWIntrinsic(NamedIntrinsic intrinsic, |
There was a problem hiding this comment.
Same for this one, identical other than the name. Would be good to share if we can.
There was a problem hiding this comment.
I had initially renamed the X86 version, but I had too many merge conflicts. So I dropped it.
| impInlineInfo->retExprClassHndIsExact = false; | ||
| } | ||
| } | ||
|
|
There was a problem hiding this comment.
This looks like a merge conflict resolve error
| (unsigned)CHECK_SPILL_ALL); | ||
|
|
||
| GenTree* tmpOp2 = gtNewLclvNode(lvaInlineeReturnSpillTemp, op2->TypeGet()); | ||
| GenTreePtr tmpOp2 = gtNewLclvNode(lvaInlineeReturnSpillTemp, op2->TypeGet()); |
There was a problem hiding this comment.
Another merge conflict resolve error
3996b95 to
8384c3c
Compare
| instruction ins = getOpForHWIntrinsic(node, baseType); | ||
| assert(ins != INS_invalid); | ||
|
|
||
| bool is16B = (node->gtSIMDSize > 8); |
There was a problem hiding this comment.
nit: name is16B could be confusing - B may mean bit or byte, it would be better to use is16Byte
| instruction ins = getOpForHWIntrinsic(node, baseType); | ||
| assert(ins != INS_invalid); | ||
|
|
||
| bool is16B = (node->gtSIMDSize > 8); |
8384c3c to
5d42803
Compare
CarolEidt
left a comment
There was a problem hiding this comment.
Looks good overall; a couple of minor things and I think the IL implementation needs to check IsSupported
| instruction ins = getOpForHWIntrinsic(node, baseType); | ||
| assert(ins != INS_invalid); | ||
|
|
||
| bool is16Byte = (node->gtSIMDSize > 8); |
There was a problem hiding this comment.
Just curious why this wouldn't be node->gtSIMDSize == 16?
There was a problem hiding this comment.
This code was drafted from the SIMD bin op code where 12 Byte vectors are treated as 16 byte. It is probably not needed here. Let me know if you want me to remove.
| { | ||
| enum Form | ||
| { | ||
| IsSupported, |
There was a problem hiding this comment.
Each of these should have a short comment describing them. Also, I'm not sure what the Unsupported value of Form would be used for.
| /// Vector add | ||
| /// Corresponds to vector forms of ARM64 ADD & FADD | ||
| /// </summary> | ||
| public static Vector128<byte> Add(Vector128<byte> left, Vector128<byte> right) => Add(left, right); |
There was a problem hiding this comment.
I believe that these should first check IsSupported and throw PNSE if not - then do the recursive call.
There was a problem hiding this comment.
@CarolEidt PlatformNotSupportedException is handled in the JIT to avoid replicating the code into every intrinsic.
Same with type checking for generics.
There was a problem hiding this comment.
PlatformNotSupportedException is handled in the JIT to avoid replicating the code into every intrinsic.
The issue is that we would like to support the scenario where the JIT doesn't have any support for intrinsics. In that case, the existing code would eventually get a stack overflow as it continues to recurse.
There was a problem hiding this comment.
I think this will require more discussion. I think this implementation is consistent with current X86 implementation.
There was a problem hiding this comment.
It is consistent with the current implementation, but not what is currently proposed in dotnet/designs#28
There was a problem hiding this comment.
The issue is that we would like to support the scenario where the JIT doesn't have any support for intrinsics. In that case, the existing code would eventually get a stack overflow as it continues to recurse.
Currently, this is handled by XXX.PlatformNotSupported.cs files that are used by unsupported platform build. For example, x64/x86 should build with Arm64/Simd.PlatformNotSupported.cs rather than Arm64/Simd.cs .
There was a problem hiding this comment.
@CarolEidt You own the design, but I thought that proposal was still in dispute.
So if we proceesd. I would add a function to every class
public static void ThrowIfUnsupported()
{
if (IsSupported == false)
{
throw new PlatformNotSupportedException();
}
}and methods would become
public static Vector128<byte> Add(Vector128<byte> left, Vector128<byte> right)
{
ThrowIfUnsupported();
return Add(left, right);
}
This looks like it is simple enough
There was a problem hiding this comment.
Of course IsSupported still wouldn't work on platfoms without JIT support.
We could then get rid of the redundant C# files like this
#if ARM64
private static bool _IsSupported { get => _IsSupported; }
public static bool IsSupported { get => JITSupportsIntrinsics() && _IsSupported; }
#else
public static bool IsSupported { get => false; }
#endifThere was a problem hiding this comment.
And of course if the JIT was Intrinsic namespace aware it could easily support eliminating #if
private static bool _IsSupported { get => _IsSupported; }
public static bool IsSupported { get => JITSupportsIntrinsics() && _IsSupported; }There was a problem hiding this comment.
@fiigii @sdmaclea - thanks for the clarifications. I haven't yet merged the design doc, and it seems there are further revisions needed.
If the XXX.PlatformNotSupported.cs approach is used, then it seems that we don't need to defend against the scenario I was describing. It would be good, though, if we had a way to catch the case where an expected intrinsics is not recognized, aside from stack overflow. Clearly that would be a bug, but the failure mode is pretty bad.
For the time being, I'l good with leaving these as-is.
| static const HWIntrinsicInfo hwIntrinsicInfoArray[] = { | ||
| {NI_ARM64_IsSupported_True, "get_IsSupported", IsaFlag::EveryISA, HWIntrinsicInfo::IsSupported, HWIntrinsicInfo::None, {}}, | ||
| {NI_ARM64_IsSupported_False, "::NI_ARM64_IsSupported_False", IsaFlag::EveryISA, HWIntrinsicInfo::IsSupported, HWIntrinsicInfo::None, {}}, | ||
| {NI_ARM64_PlatformNotSupported, "::NI_ARM64_PlatformNotSupported", IsaFlag::EveryISA, HWIntrinsicInfo::Unsupported, HWIntrinsicInfo::None, {}}, |
There was a problem hiding this comment.
@CarolEidt This is the use of the Unsupported category.
There was a problem hiding this comment.
Oh, I think I get it; thanks. This is for intrinsics that are for other platforms. A comment to that effect (where enum Form is declared) would be useful.
There was a problem hiding this comment.
Actually this is for intrinsics which are not supported on the running platform. So if the class reports IsSupported == false, all intrinsics of the class will return this intrinsic. This would include unsupported extensions, or JIT ISA flags which were disabled. It could also include other targets, but it does not yet.
| result = (hwIntrinsicInfoArray[i].intrinsicID == NI_ARM64_IsSupported_True) | ||
| ? NI_ARM64_PlatformNotSupported | ||
| : NI_ARM64_IsSupported_False; | ||
| } |
There was a problem hiding this comment.
@CarolEidt This is the logic to handle the unsupported classes.
The result = (hwIntrinsicInfoArray[i].intrinsicID == NI_ARM64_IsSupported_True) line is incorrect.
It should be result = (hwIntrinsicInfoArray[i].intrinsicID != NI_ARM64_IsSupported_True)
It probably warrants a few comments
| return gtNewIconNode((intrinsic == NI_ARM64_IsSupported_True) ? 1 : 0); | ||
|
|
||
| case HWIntrinsicInfo::Unsupported: | ||
| return impUnsupportedHWIntrinsic(CORINFO_HELP_THROW_PLATFORM_NOT_SUPPORTED, method, sig, mustExpand); |
There was a problem hiding this comment.
@CarolEidt Here is the other part of the logic to handle unsupported.
| HARDWARE_INTRINSIC(NI_ARM64_SIMD_BitwiseAndNot, Simd, AndNot, SimdBinaryOp, INS_bic, INS_bic, INS_bic, None ) | ||
| HARDWARE_INTRINSIC(NI_ARM64_SIMD_BitwiseOr, Simd, Or, SimdBinaryOp, INS_orr, INS_orr, INS_orr, None ) | ||
| HARDWARE_INTRINSIC(NI_ARM64_SIMD_BitwiseOrNot, Simd, OrNot, SimdBinaryOp, INS_orn, INS_orn, INS_orn, None ) | ||
| HARDWARE_INTRINSIC(NI_ARM64_SIMD_BitwiseNot, Simd, Not, SimdBinaryOp, INS_not, INS_not, INS_not, None ) |
4201eb2 to
ddaf4ec
Compare
|
@CarolEidt I made the changes you suggested. Tested many intrinsics using dotnet/corefx#26580 with #16008 . All implemented tests except Vector64 tests are passing.
I think Short Vector ABI support can be in separate PR. It will likely be complicated. |
ddaf4ec to
0198a07
Compare
There was a problem hiding this comment.
All enabled #16008 tests are now passing.
Vector64<T> tests were fixed w/o adding ShortVector ABI support. ShortVector ABI is separate issue #16021. I will not be working on that immediately.
CompareLessThanZero will fail when T is unsigned. These tests are commented out in #16008. I will work on that in a subsequent PR.
Insert, Extract, SetAllVector*, and BitwiseSelect are NYI. I will work on that in a subsequent PR.
@CarolEidt As I mentioned previously, I believe all your issues have been addressed. PTAL
| op = gtNewScalarHWIntrinsicNode(info.compRetNativeType, op, NI_ARM64_NONE_MOV); | ||
| } | ||
| } | ||
| #endif |
There was a problem hiding this comment.
I added this workaround to make Vector64<T> work.
0198a07 to
27173f5
Compare
|
test this please |
|
@CarolEidt Ping |
|
timeout |
This is the basic framework for a table driven ARM64 HW Intrinsic framework
It has only been tested for the SIMD add case, but it does work end to end
@CarolEidt @tannergooding @4creators
cc/ @fiigii @dotnet/jit-contrib