Marking Vector128<T>.Count and Vector256<T>.Count as [Intrinsic] #24991
Let me know if this misses the bar for 3.0, and I can close until after master opens back up. I opened it given the triviality of the fix.
This was found today when debugging some code with a customer who found a difference between some code using |
…etting the simdSize and baseType
Does this include unrolling of for-loop with Just asking out of interest. |
AFAIK, the appropriate flags are being set (this is the |
What would be the need for such unrolling? In general you don't expect vectors to be accessed element by element a lot.
@mikedn: I have no need, just out of interest. (for reduction there are often better ways). |
CarolEidt left a comment:
LGTM, other than some minor suggestions. I think this meets the bar for 3.0, because it's pretty straightforward, and developers will expect this to be recognized.
- #define LPFLG_SIMD_LIMIT 0x0080 // iterator is compared with Vector<T>.Count (found in lpConstLimit)
+ #define LPFLG_SIMD_LIMIT \
+     0x0080 // iterator is compared with Vector<T>, Vector64<T>, Vector128<T>, or Vector256<T>.Count (found in
+            // lpConstLimit)
nit: This formatting looks pretty weird. Did jit-format do this automatically? I think it would be OK if you aligned it as before, and manually split the constant to the next line.
Yes, this is how the format patch fixed it up. I might just fix this up to say vector count as well to keep the comment shorter.
  if ((tree->gtFlags & GTF_ICON_SIMD_COUNT) != 0)
  {
-     printf(" Vector<T>.Count");
+     printf(" Vector<T>, Vector64<T>, Vector128<T>, or Vector256<T>.Count");
Rather than making this so verbose, I think it would be perfectly reasonable to just print "Vector count".
Would you prefer vector count or vector element count?
The latter seems to be more explicit as to what the count is, but I don't feel particularly strongly about it 😄
…ctor256_Count don't return nullptr
  }
  else
  {
      baseType = getBaseTypeAndSizeOfSIMDType(clsHnd, &simdSize);
Had to move this here because the check right below (if (!varTypeIsArithmetic(baseType))) is what handles unsupported T, and without baseType being set we were returning nullptr.
Validated that we now return the integer constant node, that the loop unrolling functionality works (cc. @gfoidl), and the codegen for the known cases is now "efficient".
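To make the ordering issue above concrete, here is a toy model of the control flow, not the actual impBaseIntrinsic code. Every name prefixed with toy/Toy is hypothetical and exists only for this sketch; the assumption is simply that the arithmetic check runs on baseType, so if baseType has not been populated from the class handle first, the check rejects even a supported T and the importer bails out.

```cpp
// Toy stand-ins for illustration only; these are NOT the real JIT types or helpers.
enum ToyType
{
    TOY_UNKNOWN,
    TOY_INT,
    TOY_FLOAT,
    TOY_STRUCT
};

struct ToyVectorClass
{
    ToyType  elemType;  // the T in Vector128<T> / Vector256<T>
    unsigned sizeBytes; // 16 for a 128-bit vector, 32 for a 256-bit vector
};

// Hypothetical analogue of the arithmetic-type check quoted in the comment above.
static bool toyIsArithmetic(ToyType type)
{
    return (type == TOY_INT) || (type == TOY_FLOAT);
}

// Element size in bytes for the two arithmetic toy types; 0 otherwise.
static unsigned toyElemSize(ToyType type)
{
    return toyIsArithmetic(type) ? 4u : 0u;
}

// Returns the element count to fold into an integer constant,
// or -1 to model the "return nullptr" bail-out described in the PR.
static int toyImportCount(const ToyVectorClass& cls, bool queryClassFirst)
{
    ToyType  baseType = TOY_UNKNOWN;
    unsigned simdSize = 0;

    if (queryClassFirst)
    {
        // Models fetching baseType and simdSize from the class handle
        // before the arithmetic check runs.
        baseType = cls.elemType;
        simdSize = cls.sizeBytes;
    }

    if (!toyIsArithmetic(baseType))
    {
        return -1; // unsupported T, or baseType was never populated
    }

    return static_cast<int>(simdSize / toyElemSize(baseType));
}
```

In this model, a 128-bit vector of 4-byte ints yields 4 when the class is queried first and -1 when it is not, which mirrors the symptom described: the check that is meant to filter unsupported T also filtered supported ones.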
Maybe for 5.0, it would be nice to make the loop unrolling work for "real world" scenarios, such as for (int i = 0; i < data.Length; i += Vector128<T>.Count)...
Will file one before merging.
Looks like this is largely covered by both https://github.com/dotnet/coreclr/issues/11606 and https://github.com/dotnet/coreclr/issues/20486.
I've added comments to both of these instead.
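For readers unfamiliar with the two loop shapes discussed in this thread, the following is a rough C++ sketch of the difference. It is not JIT code: kLanes is a hypothetical stand-in for Vector128<int>.Count (16 bytes / 4-byte int = 4), and the function names are invented for illustration. The first loop has a trip count that is a compile-time constant, which is the easy case for full unrolling; the second strides over a runtime length, so the trip count is unknown and a scalar tail is needed.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for Vector128<int>.Count.
constexpr std::size_t kLanes = 4;

// Shape 1: the bound is the constant element count itself,
// so the trip count is known at compile time.
int sumLanes(const int (&lanes)[kLanes])
{
    int sum = 0;
    for (std::size_t i = 0; i < kLanes; i++)
    {
        sum += lanes[i];
    }
    return sum;
}

// Shape 2: the "real world" pattern, striding over a runtime length
// in steps of the element count.
int sumStrided(const std::vector<int>& data)
{
    int         acc[kLanes] = {0, 0, 0, 0};
    std::size_t i           = 0;

    // Main loop: one full block of kLanes elements per iteration.
    for (; i + kLanes <= data.size(); i += kLanes)
    {
        for (std::size_t lane = 0; lane < kLanes; lane++)
        {
            acc[lane] += data[i + lane];
        }
    }

    int sum = sumLanes(acc);

    // Scalar tail for the elements that do not fill a whole block.
    for (; i < data.size(); i++)
    {
        sum += data[i];
    }
    return sum;
}
```

Only the constant step is known in the second shape; the number of iterations depends on the data length, which is why it is a separate and harder optimization than recognizing the constant bound.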
…net/coreclr#24991)
* Marking Vector128<T>.Count and Vector256<T>.Count as [Intrinsic]
* Fixing NI_Vector128_Count and NI_Vector256_Count to use clsHnd when getting the simdSize and baseType
* Applying the formatting patch.
* Changing some comments to just be "vector element count".
* Fixing impBaseIntrinsic to set the baseType so Vector128_Count and Vector256_Count don't return nullptr
Commit migrated from dotnet/coreclr@9321692
CC. @CarolEidt.
The Vector128<T>.Count and Vector256<T>.Count methods weren't marked as intrinsic so they:
This PR marks them as [Intrinsic] and hooks it up to the same handling as Vector<T>.Count.