Moving a number of the Sse2 hwintrinsic tests to use the test template. by tannergooding · Pull Request #16192 · dotnet/coreclr

tannergooding · 2018-02-04T02:23:36Z

This also implements the LoadVector128, LoadAlignedVector128, and LoadScalarVector128 intrinsics for Sse2.


tannergooding · 2018-02-04T02:23:48Z

tannergooding · 2018-02-04T02:25:11Z

 HARDWARE_INTRINSIC(SSE2_ConvertToVector128Single,                    "ConvertToVector128Single",                         SSE2,       -1,           16,          1,            {INS_invalid,   INS_invalid,   INS_invalid,   INS_invalid,   INS_cvtdq2ps,  INS_invalid,   INS_invalid,   INS_invalid,   INS_invalid,   INS_cvtpd2ps},          HW_Category_SimpleSIMD,                        HW_Flag_BaseTypeFromArg)
 HARDWARE_INTRINSIC(SSE2_Divide,                                      "Divide",                                           SSE2,       -1,           16,          2,            {INS_invalid,   INS_invalid,   INS_invalid,   INS_invalid,   INS_invalid,   INS_invalid,   INS_invalid,   INS_invalid,   INS_invalid,   INS_divpd},             HW_Category_SimpleSIMD,                        HW_Flag_NoFlag)
+HARDWARE_INTRINSIC(SSE2_LoadAlignedVector128,                        "LoadAlignedVector128",                             SSE2,       -1,           16,          1,            {INS_movdqa,    INS_movdqa,    INS_movdqa,    INS_movdqa,    INS_movdqa,    INS_movdqa,    INS_movdqa,    INS_movdqa,    INS_invalid,   INS_movapd},            HW_Category_MemoryLoad,                        HW_Flag_NoFlag)
+HARDWARE_INTRINSIC(SSE2_LoadScalarVector128,                         "LoadScalarVector128",                              SSE2,       -1,           16,          1,            {INS_invalid,   INS_invalid,   INS_invalid,   INS_invalid,   INS_movd,      INS_movd,      INS_movq,      INS_movq,      INS_invalid,   INS_movsdsse2},         HW_Category_MemoryLoad,                        HW_Flag_NoFlag)
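
One way to read the added `SSE2_LoadScalarVector128` row: each entry in the braces appears to correspond to one base type, and `INS_invalid` marks an overload with no backing instruction. The sketch below is a simplified stand-alone model, not the JIT's actual data structures. The `BaseType` and `Ins` enums, the `InstructionRow` alias, and the `LookupInstruction` / `HasBackingInstruction` helpers are hypothetical names, and the column order (two 8-bit types, two 16-bit types, int, uint, long, ulong, float, double) is an assumption inferred from the rows above.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for the JIT's base types. The column order is an
// assumption inferred from the table rows; the signed/unsigned order within
// each pair is a guess and does not affect the result for this row.
enum class BaseType : std::size_t {
    Int8, UInt8, Int16, UInt16, Int32, UInt32, Int64, UInt64, Float, Double, Count
};

// Hypothetical instruction ids, covering only the ones this row uses.
enum class Ins { Invalid, Movd, Movq, MovsdSse2 };

constexpr std::size_t kBaseTypeCount = static_cast<std::size_t>(BaseType::Count);
using InstructionRow = std::array<Ins, kBaseTypeCount>;

// Mirrors the instruction list of the SSE2_LoadScalarVector128 row.
const InstructionRow kLoadScalarVector128 = {{
    Ins::Invalid, Ins::Invalid, Ins::Invalid, Ins::Invalid,  // 8-bit and 16-bit
    Ins::Movd,    Ins::Movd,                                 // int, uint
    Ins::Movq,    Ins::Movq,                                 // long, ulong
    Ins::Invalid, Ins::MovsdSse2,                            // float, double
}};

// Returns the instruction recorded for the given base type.
inline Ins LookupInstruction(const InstructionRow& row, BaseType type) {
    const std::size_t index = static_cast<std::size_t>(type);
    assert(index < row.size());
    return row[index];
}

// True when the row names a real instruction for the given base type.
inline bool HasBackingInstruction(const InstructionRow& row, BaseType type) {
    return LookupInstruction(row, type) != Ins::Invalid;
}
```

Under this model, `HasBackingInstruction(kLoadScalarVector128, BaseType::Int16)` is false, which matches the point raised in the review comment that follows: the 8-bit and 16-bit overloads have nothing to map to.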


@fiigii, the 8-bit and 16-bit overloads in particular don't have any backing instruction support (there is no movb or movw, or any equivalent that I can find). Was this intentional?

I have no idea about ISA design. AFAIK, the performance of partial register access is tricky on Intel architectures, especially <32-bit.

I was asking since you exposed these particular LoadScalar APIs. I was wondering if it was intentional (and you expect a movd followed by clearing the unnecessary bits) or if it was just an oversight at the time (and we should remove these 4 APIs in particular).

> it was just an oversight at the time

Yes, I think that is an oversight. LoadScalarVector128 should just work on float/double.

Looks like we should remove integer LoadScalarVector128 and add Vector128<long/ulong> LoadLow(Vector128<long/ulong> upper, double* address).

I think we want LoadScalar for int/uint and long/ulong.

The movd and movq instructions will explicitly clear the upper bits, rather than setting them from one of the source operands.

> I think we want LoadScalar for int/uint and long/ulong.

Yes, I just checked, and a few Scalar APIs do work on int/long.

> The movd and movq instructions will explicitly clear the upper bits, rather than setting them from one of the source operands.

That is okay; it keeps the behavior consistent with the C++ intrinsic semantics.

> That is okay; it keeps the behavior consistent with the C++ intrinsic semantics.

Yes, I was just indicating why we couldn't use them for LoadLower or LoadUpper, since those explicitly take a lower and upper value for the "other" bits.
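
The distinction discussed in this thread can be sketched with the C++ SSE2 intrinsics from `<emmintrin.h>`. This is an illustration under stated assumptions, not the code in this PR: the wrapper names `LoadScalarInt32`, `LoadScalarInt64`, `LoadLow`, `ToInt32Lanes`, and `ToDoubleLanes` are hypothetical, the instruction named in each comment is what compilers typically emit rather than a guarantee, and the code only builds for x86/x64 targets with SSE2.

```cpp
#include <emmintrin.h>  // SSE2 intrinsics; requires an x86/x64 target
#include <array>
#include <cassert>
#include <cstdint>

// Typically movd: loads 32 bits into lane 0 and zeroes bits 32..127.
inline __m128i LoadScalarInt32(const std::int32_t* address) {
    assert(address != nullptr);
    return _mm_cvtsi32_si128(*address);
}

// Typically movq: loads 64 bits into the low half and zeroes the upper half.
inline __m128i LoadScalarInt64(const std::int64_t* address) {
    assert(address != nullptr);
    return _mm_loadl_epi64(reinterpret_cast<const __m128i*>(address));
}

// Typically movlpd: replaces the low 64 bits and keeps the upper half of
// `upper`. This is why a zeroing load cannot stand in for it.
inline __m128d LoadLow(__m128d upper, const double* address) {
    assert(address != nullptr);
    return _mm_loadl_pd(upper, address);
}

// Helper: copies a vector out as four 32-bit lanes, lowest lane first.
inline std::array<std::int32_t, 4> ToInt32Lanes(__m128i value) {
    std::array<std::int32_t, 4> lanes{};
    _mm_storeu_si128(reinterpret_cast<__m128i*>(lanes.data()), value);
    return lanes;
}

// Helper: copies a vector out as two double lanes, lowest lane first.
inline std::array<double, 2> ToDoubleLanes(__m128d value) {
    std::array<double, 2> lanes{};
    _mm_storeu_pd(lanes.data(), value);
    return lanes;
}
```

With these wrappers, the scalar integer loads always produce zeros in the lanes they did not load, while `LoadLow` carries the upper lane through from its first argument.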

I'll submit a separate PR to remove the 4 overloads that are not valid.

@fiigii, did you have any other feedback or does this look good to you?

Looks pretty good. Thanks for pointing out this mistake.

tannergooding · 2018-02-04T02:27:12Z

The product changes (+20/-9) are all in the first commit. The last two commits (+813/-0 and +25,955/-3,718, respectively) are purely test changes.

tannergooding · 2018-02-04T02:40:48Z

test Windows_NT x64 Checked jitincompletehwintrinsic
test Windows_NT x64 Checked jitx86hwintrinsicnoavx
test Windows_NT x64 Checked jitx86hwintrinsicnoavx2
test Windows_NT x64 Checked jitx86hwintrinsicnosimd
test Windows_NT x64 Checked jitnox86hwintrinsic

test Windows_NT x86 Checked jitincompletehwintrinsic
test Windows_NT x86 Checked jitx86hwintrinsicnoavx
test Windows_NT x86 Checked jitx86hwintrinsicnoavx2
test Windows_NT x86 Checked jitx86hwintrinsicnosimd
test Windows_NT x86 Checked jitnox86hwintrinsic

test Ubuntu x64 Checked jitincompletehwintrinsic
test Ubuntu x64 Checked jitx86hwintrinsicnoavx
test Ubuntu x64 Checked jitx86hwintrinsicnoavx2
test Ubuntu x64 Checked jitx86hwintrinsicnosimd
test Ubuntu x64 Checked jitnox86hwintrinsic

test OSX10.12 x64 Checked jitincompletehwintrinsic
test OSX10.12 x64 Checked jitx86hwintrinsicnoavx
test OSX10.12 x64 Checked jitx86hwintrinsicnoavx2
test OSX10.12 x64 Checked jitx86hwintrinsicnosimd
test OSX10.12 x64 Checked jitnox86hwintrinsic

4creators · 2018-02-05T13:07:33Z

@tannergooding I am going to wait for this PR before submitting my PRs with remaining SSE2 intrinsics modulo #16200

CarolEidt

LGTM

tannergooding added 2 commits February 3, 2018 12:12

Adding support for the SSE2 LoadVector128, LoadAlignedVector128, and LoadScalarVector128 intrinsics.

d32089f

Adding tests for the SSE2 LoadVector128, LoadAlignedVector128, and LoadScalarVector128 intrinsics.

a034c50


Moving a number of the Sse2 hwintrinsic tests to use the test template.

0d8ed19

jkotas added the area-CodeGen label Feb 4, 2018

tannergooding mentioned this pull request Feb 6, 2018

Removing the Sse2.LoadScalarVector128 overloads that are invalid. #16221

Merged

CarolEidt approved these changes Feb 6, 2018


tannergooding merged commit 4578505 into dotnet:master Feb 6, 2018

4creators mentioned this pull request Jan 31, 2020

Lack of coordination and overlapping work while implementing Intel hardware intrinsics dotnet/runtime#9656

Closed
