Faster IndexOfVectorized #1231
|
If you run the harness till the end (as opposed to stopping it after some tests pass), the harness will analyze the raw file and create another file with relatively nicely tallied results. |
|
I ran the benchmark with your changes. Nice improvement! |
|
I added some tests for different indexes. Here are the results:
With your changes: Test Name | Average
Without your changes: Test Name | Average
Nice improvement. I will merge the changes. |
|
What's it like on the early (non-zero) indexes compared to regular IndexOf?
| public static int IndexOfVectorized(this ReadOnlySpan<byte> buffer, byte value) | ||
| public unsafe static int IndexOfVectorized(this ReadOnlySpan<byte> buffer, byte value) | ||
| { | ||
| Debug.Assert(s_longSize == 4 || s_longSize == 2); |
This assert was removed in the Span overload above.
| var byteSize = s_byteSize; | ||
| fixed (byte* pHaystack = &buffer.DangerousGetPinnableReference()) | ||
| { | ||
| var haystack = pHaystack; |
Could you share code between Span and ReadOnlySpan implementations after getting the pointer?
Yes they are identical
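To make the sharing suggestion concrete, here is a minimal sketch of how both overloads could pin the buffer and then hand the pointer to one common helper. `IndexOfCore` is a hypothetical name, its scalar loop only stands in for the vectorized search in this PR, and the `fixed` statement over a span is an assumption based on current C# rather than the `DangerousGetPinnableReference()` call used in the diff.

```csharp
using System;

public static class IndexOfSketch
{
    // Span overload: pin the buffer, then delegate to the shared helper.
    public static unsafe int IndexOfVectorized(this Span<byte> buffer, byte value)
    {
        fixed (byte* p = buffer)
        {
            return IndexOfCore(p, buffer.Length, value);
        }
    }

    // ReadOnlySpan overload: same pinning step, same shared helper.
    public static unsafe int IndexOfVectorized(this ReadOnlySpan<byte> buffer, byte value)
    {
        fixed (byte* p = buffer)
        {
            return IndexOfCore(p, buffer.Length, value);
        }
    }

    // Hypothetical shared helper (not from the PR). A plain scalar loop
    // stands in for the vectorized search so the sketch stays self-contained.
    private static unsafe int IndexOfCore(byte* p, int length, byte value)
    {
        for (int i = 0; i < length; i++)
        {
            if (p[i] == value)
            {
                return i;
            }
        }
        return -1; // value not found
    }
}
```

With this shape, only the pinning differs per overload, so any later change to the search logic would land in one place.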
| } | ||
|
|
||
| var byteSize = s_byteSize; | ||
| fixed (byte* pHaystack = &buffer.DangerousGetPinnableReference()) |
A haystack is an unorganized pile. A Span is an ordered list. I think the parameter name needs to change :-)
|
| Test Name | Average |
|---|---|
| IndexOfBench.SpanIndexOf(at: 30) | 0.0168 |
| IndexOfBench.SpanIndexOf(at: 16) | 0.0114 |
| IndexOfBench.SpanIndexOf(at: 8) | 0.0077 |
| IndexOfBench.SpanIndexOf(at: 4) | 0.0057 |
| IndexOfBench.SpanIndexOf(at: 2) | 0.0045 |
| IndexOfBench.SpanIndexOf(at: 1) | 0.0045 |
| IndexOfBench.SpanIndexOf(at: 0) | 0.0041 |
|
I am going to merge this, and then we can clean up depending on what does or does not go into Span directly. |
from dotnet/corefx#16222
Not quite sure how to interpret the benchmark results (from corefxlab\scripts\PerfHarness), but they suggest it is 4.5x to 6x faster than `.IndexOf` for everything except length = 0.
Likely want to have it verified by someone who understands how the benchmarks work; might all be the same length, and 0 is higher as it's jitting.
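One way to check the jitting suspicion is to call the method once before timing it, so that first-call compilation is not counted in the first measurement. The sketch below is an illustration only and is not how PerfHarness works: `WarmupSketch` and `MeasureTicksPerCall` are hypothetical names, it times the regular `IndexOf` from `System.MemoryExtensions`, and a single warm-up call may not be enough on runtimes with tiered compilation.

```csharp
using System;
using System.Diagnostics;

public static class WarmupSketch
{
    // Hypothetical helper: average Stopwatch ticks per IndexOf call when the
    // match sits at `position`. Assumes `data` is all zeros on entry.
    public static double MeasureTicksPerCall(byte[] data, int position, int iterations)
    {
        data[position] = 1;
        ReadOnlySpan<byte> span = data;

        // Warm-up call: forces JIT compilation before the timed loop starts.
        int found = span.IndexOf((byte)1);

        var sw = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
        {
            found += span.IndexOf((byte)1);
        }
        sw.Stop();

        data[position] = 0; // restore the buffer for the next measurement

        // Using the results guards against the calls being optimized away
        // and doubles as a correctness check.
        if (found != position * (iterations + 1))
        {
            throw new InvalidOperationException("IndexOf returned an unexpected index.");
        }
        return (double)sw.ElapsedTicks / iterations;
    }
}
```

Running this for positions 0, 1, 2, 4, 8, 16 and 30 over the same buffer would show whether the value at 0 is still an outlier once compilation is excluded.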