Fixing Span IndexOf vectorization.#17500
Conversation
e6dc2f3 to
22e4c23
Compare
| { | ||
| int unaligned = (int)(byte*)Unsafe.AsPointer(ref searchSpace) & (Vector<byte>.Count - 1); | ||
| nLength = (IntPtr)(uint)unaligned; | ||
| nLength = (IntPtr)(uint)((Vector<byte>.Count - unaligned) % Vector<byte>.Count); |
There was a problem hiding this comment.
Does the idiv get elided out in the asm? (otherwise its & the jit const (Vector<byte>.Count - 1) on the line above)
There was a problem hiding this comment.
Does the idiv get elided out in the asm?
What is idiv?
There was a problem hiding this comment.
integer divide; is the modulus (%) and can take up to 22 cpu cycles (vs bitwise and which is 1 cycle)
There was a problem hiding this comment.
Just conscious that its being compared against the byte by byte compare that's a lot of byte compares
There was a problem hiding this comment.
So, change:
nLength = (IntPtr)(uint)((Vector<byte>.Count - unaligned) % Vector<byte>.Count); To:
nLength = (IntPtr)(uint)((Vector<byte>.Count - unaligned) & (Vector<byte>.Count - 1)); Is that correct?
There was a problem hiding this comment.
Does it make sense to store Vector.Count in a local?
There was a problem hiding this comment.
No, when IsHardwareAccelerated == true it the jit directly replaces it with a const
There was a problem hiding this comment.
Yes on the & change as Vector<byte>.Count is always a power of 2 so -1 will all the low bits and will be the same as the %
| goto NotFound; | ||
| } | ||
| nLength = (IntPtr)(uint)(length - Vector<byte>.Count); | ||
| nLength = (IntPtr)(uint)(length - (uint)index); |
There was a problem hiding this comment.
Could be:
nLength = (IntPtr)(uint)((length - (uint)index) & ~(Vector<byte>.Count - 1));
Then you don't need the second calc in the loop? And can compare against index?
| if (Vector<byte>.Zero.Equals(vMatches)) | ||
| { | ||
| index += Vector<byte>.Count; | ||
| nLength -= Vector<byte>.Count; |
What about |
Should be |
|
@benaadams, do the changes make sense now (in line with what you had intended)? |
| { | ||
| int unaligned = (int)(byte*)Unsafe.AsPointer(ref searchSpace) & (Vector<byte>.Count - 1); | ||
| nLength = (IntPtr)(uint)unaligned; | ||
| nLength = (IntPtr)(uint)((Vector<byte>.Count - unaligned) & (Vector<byte>.Count - 1)); |
There was a problem hiding this comment.
Actually, unaligned will always be < Vector<byte>.Count so (from mask on line above) the & part isn't needed?
There was a problem hiding this comment.
What if unaligned == 0?
If we don't do the &, nLength would equal Vector<byte>.Count rather than 0.
|
LGTM; thank you for fixing this |
|
@dotnet-bot test Innerloop OSX10.12 Release x64 Build and Test |
cc @benaadams
Please take a look. When trying to reason about different test scenarios with @shiftylogic, these changes make more sense. Let me know if you notice any performance pitfalls or correctness issues.
Also cc @jkotas