This repository was archived by the owner on Jan 23, 2023. It is now read-only.

Implement simple SIMD intrinsics for AVX/AVX2 #16287

Merged: tannergooding merged 4 commits into dotnet:master from fiigii:avxsimd on Feb 10, 2018

fiigii commented Feb 8, 2018

Author

fiigii commented Feb 9, 2018

test Windows_NT x64 Checked jitincompletehwintrinsic
test Windows_NT x64 Checked jitx86hwintrinsicnoavx
test Windows_NT x64 Checked jitx86hwintrinsicnoavx2
test Windows_NT x64 Checked jitx86hwintrinsicnosimd
test Windows_NT x64 Checked jitnox86hwintrinsic

test Windows_NT x86 Checked jitincompletehwintrinsic
test Windows_NT x86 Checked jitx86hwintrinsicnoavx
test Windows_NT x86 Checked jitx86hwintrinsicnoavx2
test Windows_NT x86 Checked jitx86hwintrinsicnosimd
test Windows_NT x86 Checked jitnox86hwintrinsic

test Ubuntu x64 Checked jitincompletehwintrinsic
test Ubuntu x64 Checked jitx86hwintrinsicnoavx
test Ubuntu x64 Checked jitx86hwintrinsicnoavx2
test Ubuntu x64 Checked jitx86hwintrinsicnosimd
test Ubuntu x64 Checked jitnox86hwintrinsic

test OSX10.12 x64 Checked jitincompletehwintrinsic
test OSX10.12 x64 Checked jitx86hwintrinsicnoavx
test OSX10.12 x64 Checked jitx86hwintrinsicnoavx2
test OSX10.12 x64 Checked jitx86hwintrinsicnosimd
test OSX10.12 x64 Checked jitnox86hwintrinsic

Author

fiigii commented Feb 9, 2018

@CarolEidt @tannergooding PTAL

I removed the Avx2.CompareEqual/CompareGreaterThan test cases for (U)Int64 because of an SSE4.1/SSE4.2 encoding issue, logged at https://github.com/dotnet/coreclr/issues/16296

Member

jkotas commented Feb 9, 2018

test OSX10.12 x64 Checked jitincompletehwintrinsic
test OSX10.12 x64 Checked jitx86hwintrinsicnoavx
test OSX10.12 x64 Checked jitx86hwintrinsicnoavx2
test OSX10.12 x64 Checked jitx86hwintrinsicnosimd
test OSX10.12 x64 Checked jitnox86hwintrinsic

("SimpleBinOpTest.template", new string[] { "Avx", "Avx", "AndNot", "Single", "Vector256", "32", "(float)(random.NextDouble())", "((~BitConverter.SingleToInt32Bits(left[0])) & BitConverter.SingleToInt32Bits(right[0])) != BitConverter.SingleToInt32Bits(result[0])", "((~BitConverter.SingleToInt32Bits(left[i])) & BitConverter.SingleToInt32Bits(right[i])) != BitConverter.SingleToInt32Bits(result[i])"}),
("SimpleTernOpTest.template",new string[] { "Avx", "Avx", "BlendVariable", "Double", "Vector256", "32", "(double)(random.NextDouble())", "(double)(((i % 2) == 0) ? -0.0E0 : 1.0E0)", "((BitConverter.DoubleToInt64Bits(thirdOp[0]) >> 63) & 1) == 1 ? BitConverter.DoubleToInt64Bits(secondOp[0]) != BitConverter.DoubleToInt64Bits(result[0]) : BitConverter.DoubleToInt64Bits(firstOp[0]) != BitConverter.DoubleToInt64Bits(result[0])", "((BitConverter.DoubleToInt64Bits(thirdOp[i]) >> 63) & 1) == 1 ? BitConverter.DoubleToInt64Bits(secondOp[i]) != BitConverter.DoubleToInt64Bits(result[i]) : BitConverter.DoubleToInt64Bits(firstOp[i]) != BitConverter.DoubleToInt64Bits(result[i])"}),
("SimpleTernOpTest.template",new string[] { "Avx", "Avx", "BlendVariable", "Single", "Vector256", "32", "(float)(random.NextDouble())", "(float)(((i % 2) == 0) ? -0.0E0 : 1.0E0)", "((BitConverter.SingleToInt32Bits(thirdOp[0]) >> 31) & 1) == 1 ? BitConverter.SingleToInt32Bits(secondOp[0]) != BitConverter.SingleToInt32Bits(result[0]) : BitConverter.SingleToInt32Bits(firstOp[0]) != BitConverter.SingleToInt32Bits(result[0])", "((BitConverter.SingleToInt32Bits(thirdOp[i]) >> 31) & 1) == 1 ? BitConverter.SingleToInt32Bits(secondOp[i]) != BitConverter.SingleToInt32Bits(result[i]) : BitConverter.SingleToInt32Bits(firstOp[i]) != BitConverter.SingleToInt32Bits(result[i])"}),
("SimpleUnOpTest.template", new string[] { "Avx", "Avx", "DuplicateEvenIndexed", "Double", "Vector256", "32", "(double)(random.NextDouble())", "BitConverter.DoubleToInt64Bits(firstOp[0]) != BitConverter.DoubleToInt64Bits(result[0])", "(i % 2 == 0) ? (BitConverter.DoubleToInt64Bits(firstOp[i]) != BitConverter.DoubleToInt64Bits(result[i])) : (BitConverter.DoubleToInt64Bits(firstOp[i - 1]) != BitConverter.DoubleToInt64Bits(result[i]))"}),
Member

nit: It would be useful to treat this code as a "table" and keep the individual columns aligned, where possible.

Author

Will fix. Thanks.

Member

jkotas commented Feb 9, 2018

test OSX10.12 x64 Checked Innerloop Build and Test

("SimpleBinOpTest.template", new string[] { "Avx", "Avx", "AndNot", "Double", "Vector256", "32", "(double)(random.NextDouble())", "((~BitConverter.DoubleToInt64Bits(left[0])) & BitConverter.DoubleToInt64Bits(right[0])) != BitConverter.DoubleToInt64Bits(result[0])", "((~BitConverter.DoubleToInt64Bits(left[i])) & BitConverter.DoubleToInt64Bits(right[i])) != BitConverter.DoubleToInt64Bits(result[i])"}),
("SimpleBinOpTest.template", new string[] { "Avx", "Avx", "AndNot", "Single", "Vector256", "32", "(float)(random.NextDouble())", "((~BitConverter.SingleToInt32Bits(left[0])) & BitConverter.SingleToInt32Bits(right[0])) != BitConverter.SingleToInt32Bits(result[0])", "((~BitConverter.SingleToInt32Bits(left[i])) & BitConverter.SingleToInt32Bits(right[i])) != BitConverter.SingleToInt32Bits(result[i])"}),
("SimpleTernOpTest.template",new string[] { "Avx", "Avx", "BlendVariable", "Double", "Vector256", "32", "(double)(random.NextDouble())", "(double)(((i % 2) == 0) ? -0.0E0 : 1.0E0)", "((BitConverter.DoubleToInt64Bits(thirdOp[0]) >> 63) & 1) == 1 ? BitConverter.DoubleToInt64Bits(secondOp[0]) != BitConverter.DoubleToInt64Bits(result[0]) : BitConverter.DoubleToInt64Bits(firstOp[0]) != BitConverter.DoubleToInt64Bits(result[0])", "((BitConverter.DoubleToInt64Bits(thirdOp[i]) >> 63) & 1) == 1 ? BitConverter.DoubleToInt64Bits(secondOp[i]) != BitConverter.DoubleToInt64Bits(result[i]) : BitConverter.DoubleToInt64Bits(firstOp[i]) != BitConverter.DoubleToInt64Bits(result[i])"}),
("SimpleTernOpTest.template",new string[] { "Avx", "Avx", "BlendVariable", "Single", "Vector256", "32", "(float)(random.NextDouble())", "(float)(((i % 2) == 0) ? -0.0E0 : 1.0E0)", "((BitConverter.SingleToInt32Bits(thirdOp[0]) >> 31) & 1) == 1 ? BitConverter.SingleToInt32Bits(secondOp[0]) != BitConverter.SingleToInt32Bits(result[0]) : BitConverter.SingleToInt32Bits(firstOp[0]) != BitConverter.SingleToInt32Bits(result[0])", "((BitConverter.SingleToInt32Bits(thirdOp[i]) >> 31) & 1) == 1 ? BitConverter.SingleToInt32Bits(secondOp[i]) != BitConverter.SingleToInt32Bits(result[i]) : BitConverter.SingleToInt32Bits(firstOp[i]) != BitConverter.SingleToInt32Bits(result[i])"}),
Member

Why -0.0E0 and 1.0E0 instead of just -0.0 and 1.0?
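
For context on the mask values in the BlendVariable rows above: the validation expressions select between the first and second operand based on the top bit of the third operand, so a mask of -0.0 (sign bit set) picks the second operand and 1.0 (sign bit clear) picks the first. The literals -0.0E0 and -0.0 denote the same value. A minimal scalar sketch of that selection follows; the class and method names (`BlendVariableSketch`, `BlendVariableScalar`) are hypothetical and are not part of the PR.

```csharp
using System;

// Hypothetical scalar reference for the per-element check used in the
// BlendVariable test rows; this is an illustration, not code from the PR.
public static class BlendVariableSketch
{
    // Returns 'second' when the sign bit of 'mask' is set, otherwise 'first'.
    public static double BlendVariableScalar(double first, double second, double mask)
    {
        bool signBitSet = ((BitConverter.DoubleToInt64Bits(mask) >> 63) & 1) == 1;
        return signBitSet ? second : first;
    }

    // Same selection for single precision, using bit 31 as the sign bit.
    public static float BlendVariableScalar(float first, float second, float mask)
    {
        bool signBitSet = ((BitConverter.SingleToInt32Bits(mask) >> 31) & 1) == 1;
        return signBitSet ? second : first;
    }
}
```

With these definitions, `BlendVariableScalar(1.5, 2.5, -0.0)` selects the second operand and `BlendVariableScalar(1.5, 2.5, 1.0)` selects the first, which is why alternating -0.0 and 1.0 in the mask exercises both branches.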


namespace JIT.HardwareIntrinsics.X86
{
public unsafe struct SimpleTernaryOpTest__DataTable<T> : IDisposable where T : struct
Member

I had originally kept the unaligned data tables around because not all the tests had been moved over to use the templates (and therefore couldn't easily be switched to using the aligned data tables).

We might not need/want to add new unaligned data tables and instead should probably just add new properties to the aligned table to get an unaligned address (for the scenarios where we need to explicitly test it).

> We might not need/want to add new unaligned data tables and instead should probably just add new properties to the aligned table to get an unaligned address (for the scenarios where we need to explicitly test it).

In which case it might make sense to remove the "aligned" from the name, as it would then just become the "data table"?

Author

> as it would then just become the "data table"?

Agree.

Member

Yes. As said above, the primary reason the original ones were kept is because not all tests could be moved to the template quite yet (my PR originally just modified DataTable to support alignment and then later had to be split into two separate tables).
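
The idea discussed in this thread, a single data table that exposes both an aligned address and an unaligned one, could look roughly like the sketch below. It is an assumption-laden illustration rather than the PR's `SimpleTernaryOpTest__DataTable<T>`: the type name `DataTableSketch`, its constructor parameters, and the choice of `Marshal.AllocHGlobal` are all hypothetical, and `alignment` is assumed to be a power of two.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical sketch: one native buffer, two views of it.
// Not the data table type used by the PR's test templates.
public sealed class DataTableSketch : IDisposable
{
    private readonly IntPtr _raw; // start of the native allocation

    // Address rounded up to a multiple of 'alignment'.
    public IntPtr AlignedPtr { get; }

    // Aligned address offset by one element, so it is misaligned whenever
    // 'elementSize' is not a multiple of 'alignment'.
    public IntPtr UnalignedPtr { get; }

    public DataTableSketch(int byteLength, int alignment, int elementSize)
    {
        // Over-allocate so both views have 'byteLength' usable bytes.
        _raw = Marshal.AllocHGlobal(byteLength + alignment + elementSize);

        long address = _raw.ToInt64();
        long aligned = (address + alignment - 1) & ~((long)alignment - 1);

        AlignedPtr = new IntPtr(aligned);
        UnalignedPtr = new IntPtr(aligned + elementSize);
    }

    public void Dispose() => Marshal.FreeHGlobal(_raw);
}
```

For example, `new DataTableSketch(32, 32, 4)` would give a 32-byte-aligned address for aligned-load tests and an address 4 bytes past it for tests that need an unaligned load. The two views overlap in memory, so a test would use one or the other at a time.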

Member

tannergooding left a comment

This looks good to me, overall. Just a couple of questions/nits.

CarolEidt left a comment

LGTM


Author

fiigii commented Feb 9, 2018

test Windows_NT x64 Checked jitincompletehwintrinsic
test Windows_NT x64 Checked jitx86hwintrinsicnoavx
test Windows_NT x64 Checked jitx86hwintrinsicnoavx2
test Windows_NT x64 Checked jitx86hwintrinsicnosimd
test Windows_NT x64 Checked jitnox86hwintrinsic

test Windows_NT x86 Checked jitincompletehwintrinsic
test Windows_NT x86 Checked jitx86hwintrinsicnoavx
test Windows_NT x86 Checked jitx86hwintrinsicnoavx2
test Windows_NT x86 Checked jitx86hwintrinsicnosimd
test Windows_NT x86 Checked jitnox86hwintrinsic

test Ubuntu x64 Checked jitincompletehwintrinsic
test Ubuntu x64 Checked jitx86hwintrinsicnoavx
test Ubuntu x64 Checked jitx86hwintrinsicnoavx2
test Ubuntu x64 Checked jitx86hwintrinsicnosimd
test Ubuntu x64 Checked jitnox86hwintrinsic

test OSX10.12 x64 Checked jitincompletehwintrinsic
test OSX10.12 x64 Checked jitx86hwintrinsicnoavx
test OSX10.12 x64 Checked jitx86hwintrinsicnoavx2
test OSX10.12 x64 Checked jitx86hwintrinsicnosimd
test OSX10.12 x64 Checked jitnox86hwintrinsic

Author

fiigii commented Feb 10, 2018

test Windows_NT x86 Checked jitx86hwintrinsicnoavx
test Windows_NT x86 Checked jitx86hwintrinsicnosimd

Author

fiigii commented Feb 10, 2018

All feedback addressed and all tests passed.

@tannergooding Could you please merge this PR?

Development

Successfully merging this pull request may close these issues.

Enable containment test for AVX.LoadVector256/LoadAlignedVector256

5 participants