Fix method names of hardware intrinsic APIs #25965
@@ -198,7 +197,7 @@ public static class Avx
public static Vector256<float> Set(float e7, float e6, float e5, float e4, float e3, float e2, float e1, float e0) { throw null; }
public static Vector256<double> Set(double e3, double e2, double e1, double e0) { throw null; }
public static Vector256<T> Set1<T>(T value) where T : struct { throw null; }
Should this be called SetOne instead? For consistency with SetZero, Vector<T>.One, and Vector<T>.Zero.
I thought that Set1 should not be consistent with SetZero or Vector<T>.One.
SetZero stands for "set all elements to value of zero";
Vector<T>.One stands for "set all elements to value of one";
Set1 stands for "set all elements to one value of XX";
Set stands for "set elements to multiple values of XX, YY, ZZ, ...".
But I know Set1 is not a good name; it just follows the Intel C++ intrinsic naming. Do you have suggestions for a better name?
Just Set or SetAll might make sense.
We have Set for multi-value initialization. SetAll makes sense to me. Thank you.
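The distinction drawn in this thread between the three initializers can be sketched with a scalar model. This is only an illustration: `LaneModel` and its array-based methods are hypothetical stand-ins, not the real `Avx`/`Sse` API, and the lane order used for `Set` assumes the Intel convention in which the last argument (`e0`) lands in the lowest lane.

```csharp
// Hypothetical scalar model of a 4-lane float vector; not the real intrinsics API.
public static class LaneModel
{
    // SetZero: every lane holds the value zero.
    public static float[] SetZero() => new float[4];

    // SetAll (the proposed rename of Set1): every lane holds the same caller-supplied value.
    public static float[] SetAll(float value) => new[] { value, value, value, value };

    // Set: each lane gets its own value; e0 is assumed to be the lowest lane.
    public static float[] Set(float e3, float e2, float e1, float e0) => new[] { e0, e1, e2, e3 };
}
```

Under this model, `LaneModel.SetAll(1f)` plays the role that `Vector<T>.One` plays for the portable vector type, while `SetAll` with any other argument has no fixed-constant counterpart, which is why a name like SetOne would be misleading.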
public static Vector256<float> Permute2x128(Vector256<float> left, Vector256<float> right, byte control) { throw null; }
public static Vector256<double> Permute2x128(Vector256<double> left, Vector256<double> right, byte control) { throw null; }
public static Vector256<T> Permute2x128<T>(Vector256<T> left, Vector256<T> right, byte control) where T : struct { throw null; }
public static Vector128<float> PermuteVar(Vector128<float> left, Vector128<float> mask) { throw null; }
Should this be called PermuteVariable? For consistency with BlendVariable.
Or do these two methods need the Var/Variable suffix at all? Would an overload be sufficient?
We are using the Variable suffix only for v-suffixed instructions, e.g., BlendVariable -> vblendvp*, ShiftLeftLogicalVariable -> vpsllv*, etc.
PermuteVar and PermuteVar8x32 are special cases that generate vpermilp* and vperm*, which break the above convention, so they follow the Intel C++ intrinsic naming.
We can change them to the Variable suffix; I have no strong preference here.
public static Vector128<float> ReciprocalSquareRoot(Vector128<float> value) { throw new NotImplementedException(); }
public static Vector128<float> ReciprocalSqrt(Vector128<float> value) { throw new NotImplementedException(); }
public static Vector128<float> Set(float e3, float e2, float e1, float e0) { throw new NotImplementedException(); }
public static Vector128<float> Set1(float value) { throw new NotImplementedException(); }
For my education, what was the rule used to make the method generic vs. non-generic? For example, I am wondering about these:
I am closing this because I have cherry-picked it into #25969. We can continue the discussion about the naming, though.
Because SSE only has
Sometimes,
Is this a common pattern? It does not sound right to be optimizing for the case where folks have e.g.
@jkotas Ok, I will fix this and
@jkotas @fiigii
@4creators, it is an optimization. It just avoids the consumer needing to insert a
I believe these generic intrinsics are a "legacy" design left over from the long design process, and the original motivation is gone due to other changes. I will fix this soon; thanks for pointing it out!
@tannergooding my point is that the (E)(V)EXTRACTPS instruction does not check whether its operand is of the xmm packed float type, and it does not raise any kind of floating-point exception either; therefore, it can be treated as a general extraction of 32 bits from an xmm vector. If we introduce any type of limitations which limit
float ExtractSingle<T>(Vector128<T> value, byte index) where T : struct { throw null; }
Vector128<Int64> src = //... //
var f = Sse41.ExtractSingle<Int64>(src, 3);
This should be perfectly legal: it saves 1 CPU cycle by doing the extraction and the cast together (a strange binary cast, but still), and I may want to use Int64 for loading to get an atomic load of two adjacent 32-bit values and do some precalculation step with it (again at the binary level).
We are not providing a raw API, however. We are providing a managed abstraction over the underlying hardware instructions. Because it is an abstraction and not raw access to the underlying instructions, some helper functions are provided (like static cast, set 1, etc.) and some limitations are imposed by design (such as no MMX instructions being exposed).
@4creators Thank you for explaining my original design proposal 😄. However, after I added
Vector128<Int64> src = //... //
Vector128<Single> srcFloat = Sse.StaticCast<Int64, Single>(src);
var f = Sse41.ExtractSingle(srcFloat, 3);
Updated the above code example.
Ahh my ... that was a good one on my side 😆
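The cast-then-extract form discussed in this exchange only changes how the same 128 bits are typed. A minimal sketch of that idea follows, using plain arrays. `ReinterpretModel` and its methods are hypothetical and are not the `Sse.StaticCast` or `Sse41.ExtractSingle` intrinsics themselves; the sketch also assumes that the static cast is a pure bit reinterpretation and that the machine is little-endian.

```csharp
using System;

// Hypothetical model of a 128-bit vector as raw arrays; not the real Sse/Sse41 API.
public static class ReinterpretModel
{
    // Models a static cast from two 64-bit integers to four 32-bit floats:
    // the 16 bytes are copied unchanged, only the element type differs.
    public static float[] CastInt64ToSingle(long[] src)
    {
        if (src.Length != 2) throw new ArgumentException("expected 2 elements", nameof(src));
        var dst = new float[4];
        Buffer.BlockCopy(src, 0, dst, 0, 16); // byte-for-byte copy, no numeric conversion
        return dst;
    }

    // Models extracting one 32-bit float lane by index.
    public static float ExtractSingle(float[] value, byte index) => value[index];
}
```

For example, with `src = { 0L, 0x3F80000000000000L }`, lane 3 of the reinterpreted vector holds the upper 32 bits of the second integer, `0x3F800000`, which is the bit pattern of `1.0f` on a little-endian machine.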
Matching the CoreCLR change dotnet/coreclr#15471
cc @jkotas @eerhardt @tannergooding