Implement AVX/AVX2/SSE3 Load* intrinsics #16200
|
test Windows_NT x64 Checked jitincompletehwintrinsic test Windows_NT x86 Checked jitincompletehwintrinsic test Ubuntu x64 Checked jitincompletehwintrinsic test OSX10.12 x64 Checked jitincompletehwintrinsic |
| HARDWARE_INTRINSIC(SSE41_IsSupported, "get_IsSupported", SSE41, -1, 0, 0, {INS_invalid, INS_invalid, INS_invalid, INS_invalid, INS_invalid, INS_invalid, INS_invalid, INS_invalid, INS_invalid, INS_invalid}, HW_Category_IsSupportedProperty, HW_Flag_NoFlag) | ||
| HARDWARE_INTRINSIC(SSE41_Multiply, "Multiply", SSE41, -1, 16, 2, {INS_invalid, INS_invalid, INS_invalid, INS_invalid, INS_invalid, INS_invalid, INS_pmuldq, INS_invalid, INS_invalid, INS_invalid}, HW_Category_SimpleSIMD, HW_Flag_Commutative) | ||
| HARDWARE_INTRINSIC(SSE41_BlendVariable, "BlendVariable", SSE41, -1, 16, 3, {INS_pblendvb, INS_pblendvb, INS_invalid, INS_invalid, INS_invalid, INS_invalid, INS_invalid, INS_invalid, INS_blendvps, INS_blendvpd}, HW_Category_SimpleSIMD, HW_Flag_NoFlag) | ||
| HARDWARE_INTRINSIC(SSE41_LoadAlignedVector128NonTemporal, "LoadAlignedVector128NonTemporal", SSE41, -1, 16, 1, {INS_movntdqa, INS_movntdqa, INS_movntdqa, INS_movntdqa, INS_movntdqa, INS_movntdqa, INS_movntdqa, INS_movntdqa, INS_invalid, INS_invalid}, HW_Category_MemoryLoad, HW_Flag_NoFlag) |
nit: alignment of fields is off
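For readers unfamiliar with table rows like the `HARDWARE_INTRINSIC(...)` lines above: lists of this shape are typically consumed with the X-macro pattern, where the same list is expanded several times under different macro definitions. The sketch below is a heavily simplified, hypothetical stand-in and not the RyuJIT definition. Every `Demo*` name, the reduced field set (one instruction instead of one per base type), and the helper functions are assumptions made for illustration only.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical instruction and category enums (illustrative, not the JIT's own).
enum DemoIns { INS_invalid, INS_pmuldq, INS_movntdqa };
enum DemoCategory { Cat_SimpleSIMD, Cat_MemoryLoad };

// One row per intrinsic: id, managed method name, SIMD size in bytes,
// operand count, a single instruction, and a category.
#define DEMO_INTRINSIC_LIST(X)                                                 \
    X(SSE41_Multiply, "Multiply", 16, 2, INS_pmuldq, Cat_SimpleSIMD)           \
    X(SSE41_LoadAlignedVector128NonTemporal, "LoadAlignedVector128NonTemporal", \
      16, 1, INS_movntdqa, Cat_MemoryLoad)

// First expansion: an enum with one id per row.
enum DemoIntrinsicId {
#define X(id, name, size, numArgs, ins, cat) id,
    DEMO_INTRINSIC_LIST(X)
#undef X
    DemoIntrinsicCount
};

struct DemoIntrinsicInfo {
    const char*  name;
    int          simdSize;
    int          numArgs;
    DemoIns      ins;
    DemoCategory category;
};

// Second expansion: a lookup table indexed by the enum above.
static const DemoIntrinsicInfo demoTable[] = {
#define X(id, name, size, numArgs, ins, cat) {name, size, numArgs, ins, cat},
    DEMO_INTRINSIC_LIST(X)
#undef X
};

// Returns the table row for an id; asserts that the id is in range.
const DemoIntrinsicInfo& lookupIntrinsic(DemoIntrinsicId id) {
    assert(id >= 0 && id < DemoIntrinsicCount);
    return demoTable[id];
}

// True when the row is categorized as a memory load, as in the Load* row above.
bool isMemoryLoad(DemoIntrinsicId id) {
    return lookupIntrinsic(id).category == Cat_MemoryLoad;
}
```

Because each row is a single macro invocation, the enum and the table cannot drift out of sync. Keeping the fields aligned in columns, as the nit asks, makes mismatched entries easier to spot by eye.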
|
It would be useful to uncomment the The only thing blocking them (to my knowledge) was the |
|
Are you going to handle the |
Yes, I will submit the store PR tomorrow. |
|
Yes, I will implement all Sse2.Store*. |
|
@CarolEidt @tannergooding The RyuJIT emitter does not seem to support SSE4.1 Can we merge these |
|
I agree that dropping the the SSE4.1 4-byte instructions in particular are blocked until https://github.com/dotnet/coreclr/issues/15908 is resolved. |
|
test Windows_NT x64 Checked jitincompletehwintrinsic test Windows_NT x86 Checked jitincompletehwintrinsic test Ubuntu x64 Checked jitincompletehwintrinsic test OSX10.12 x64 Checked jitincompletehwintrinsic |
|
@tannergooding Do you know why these OSX tests fail? There is no error message. |
|
Looks like the machines had network issues. I have requeued the jobs. |
|
Thank you. Looks like unrelated failures. |
|
@CarolEidt Does this PR look good to you? |
It would be good to know if you have done that. If not, could you open an issue to enable the template for those? |
CarolEidt left a comment
LGTM, but the conflicts need to be resolved before merging.
|
@tannergooding @CarolEidt Thanks for the review. I logged the containment test work at https://github.com/dotnet/coreclr/issues/16244. Will submit a PR soon. |
|
@CarolEidt May I have the permission to add PR/issues to "Hardware Intrinsic Project" if possible? That may save your time 😄 |
I'm not sure if that's possible - I don't see any permission settings, so I suspect that it uses the repo permissions. |
Implement AVX/AVX2/SSE3 Load* intrinsics Commit migrated from dotnet/coreclr@5e94fd1
@CarolEidt @tannergooding PTAL