Create optimized TryRead<T>. #2419
Move helpers to instance methods, split out slow paths and inline fast paths where needed. Add some perf tests.
I also removed endianness in the binary read. Doesn't seem worth taking a hit for. If we really want it we can add slower overloads.
If you're going to pick an endianness to support, you'd want big endian since most networking protocols are big endian.
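To make the trade-off concrete, here is a minimal sketch of explicit-endianness reads using `System.Buffers.Binary.BinaryPrimitives`. The class and method names are hypothetical and are not part of this PR; the sketch only shows what a caller could do on top of a native-endian read.

```csharp
using System;
using System.Buffers.Binary;

// Hypothetical helper, not part of the PR: explicit-endianness reads of a 32-bit integer.
public static class EndianReadSketch
{
    // Network byte order (big endian): most significant byte first.
    public static int ReadNetworkOrder(ReadOnlySpan<byte> source)
        => BinaryPrimitives.ReadInt32BigEndian(source);

    // Little endian: least significant byte first.
    public static int ReadLittleEndian(ReadOnlySpan<byte> source)
        => BinaryPrimitives.ReadInt32LittleEndian(source);
}
```

Both methods throw `ArgumentOutOfRangeException` when the span holds fewer than four bytes, so a caller would check the length first.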
| { | ||
| var unread = reader.UnreadSegment; | ||
| if (Utf8Parser.TryParse(unread, out value, out int consumed)) | ||
| const int MaxLength = 15; |
What happens if the buffer contains leading zeros? Would MaxLength = 15 work in that case?
Also, where is 15 coming from?
You have to be at the start; only the user can know what is relevant. 15 is the maximum size of the incoming string in the given format.
| <EmbeddedResource Include="System.Text.Primitives\SampleTexts\*.txt" /> | ||
| </ItemGroup> | ||
| <ItemGroup> | ||
| <Compile Include="..\System.Buffers.Primitives.Tests\BufferSegment.cs" Link="BufferSegment.cs" /> |
Include a project reference to the entire System.Buffers.Primitives.Tests instead, similar to other project references?
https://github.com/dotnet/corefxlab/pull/2419/files#diff-2d9cd050a80433427e0c8791e9b28ed5R51
Tried that first, it didn't work out. We link in those files in multiple test projects. I should probably move the helpers to a shared project.
"Tried that first, it didn't work out."
What ended up breaking?
Also, I tried to run the Reader_Binary tests locally, and I can't. I am getting the following errors for some reason. Have you come across this before, @JeremyKuhne?
// Validating benchmarks:
Only 1 benchmark method in a group can have "Baseline = true" applied to it, group DefaultJob-[Length=99, CodePointInfo=Syste(...)Point [43]] in class EncodeFromUtf16toUtf8 has 7
Only 1 benchmark method in a group can have "Baseline = true" applied to it, group DefaultJob-[Length=999, CodePointInfo=Syste(...)Point [43]] in class EncodeFromUtf16toUtf8 has 7
Only 1 benchmark method in a group can have "Baseline = true" applied to it, group DefaultJob-[Length=9999, CodePointInfo=Syste(...)Point [43]] in class EncodeFromUtf16toUtf8 has 7
Only 1 benchmark method in a group can have "Baseline = true" applied to it, group DefaultJob-[Length=99, CodePointInfo=Syste(...)Point [43]] in class EncodeFromUtf32toUtf16 has 7
Only 1 benchmark method in a group can have "Baseline = true" applied to it, group DefaultJob-[Length=999, CodePointInfo=Syste(...)Point [43]] in class EncodeFromUtf32toUtf16 has 7
Only 1 benchmark method in a group can have "Baseline = true" applied to it, group DefaultJob-[Length=9999, CodePointInfo=Syste(...)Point [43]] in class EncodeFromUtf32toUtf16 has 7
Only 1 benchmark method in a group can have "Baseline = true" applied to it, group DefaultJob-[Length=99, CodePointInfo=Syste(...)Point [43]] in class EncodeFromUtf8toUtf16 has 7
Only 1 benchmark method in a group can have "Baseline = true" applied to it, group DefaultJob-[Length=999, CodePointInfo=Syste(...)Point [43]] in class EncodeFromUtf8toUtf16 has 7
Only 1 benchmark method in a group can have "Baseline = true" applied to it, group DefaultJob-[Length=9999, CodePointInfo=Syste(...)Point [43]] in class EncodeFromUtf8toUtf16 has 7
Any ideas? cc @adamsitnik
| { | ||
| public class Reader_Binary | ||
| { | ||
| static byte[] s_array; |
nit: add access modifiers explicitly
Yeah, sorry, copying an existing test. I'll fix that one too.
| public void PeekSpan() | ||
| { | ||
| const int Count = 1000; | ||
| const int Iterations = 100000; |
Why are we manually setting up the iterations here? Is the objective here to measure reader.Peek and amortize the overhead of the setup of Span/BufferReader (which are ref structs and can't be saved as fields in global setup)?
cc @adamsitnik
Yes, I couldn't hold the reader as you mention.
@adamsitnik, do you plan to add support for ref struct fields in BDN?
@ahsonkhan no, at least as of today. It would require too many changes to our engine (today we derive a class from the type with benchmarks; with structs that's impossible), and stack-only types have too many limitations. For example, they can't be used for [MemberData] because it requires a cast to object:
public IEnumerator<object[]> GetEnumerator()
{
yield return new object[] { 1, 2, 3 };
yield return new object[] { -4, -6, -10 };
yield return new object[] { -2, 2, 0 };
yield return new object[] { int.MinValue, -1, int.MaxValue };
}
We also can't use Func<Span<T>> because Span<T> can't be a generic argument ;(
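The workaround discussed above (creating the ref struct inside the benchmark method and looping manually so the setup cost is amortized) can be sketched as follows. This is an illustration only: the class and method names are hypothetical, and the BenchmarkDotNet attributes are omitted so the snippet stays self-contained.

```csharp
using System;

// Hypothetical sketch, not the PR's benchmark code.
public class PeekLoopSketch
{
    private readonly byte[] _array;

    public PeekLoopSketch()
    {
        _array = new byte[1_000];
        for (int i = 0; i < _array.Length; i++)
        {
            _array[i] = 1;
        }
    }

    // ReadOnlySpan<byte> is a ref struct, so it cannot be stored in a field
    // of this class. It is created as a local and reused across iterations.
    public int PeekMany(int iterations)
    {
        ReadOnlySpan<byte> span = _array;
        int sum = 0;
        for (int i = 0; i < iterations; i++)
        {
            sum += span[i % span.Length];
        }
        return sum;
    }
}
```

Returning the accumulated value keeps the loop from being optimized away as dead code.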
| [GlobalSetup] | ||
| public void Setup() | ||
| { | ||
| s_array = new byte[100000]; |
nit: use underscores as digit separators
| { | ||
| public static ReadOnlySequence<byte> CreateSplitBuffer(byte[] buffer, int minSize, int maxSize) | ||
| { | ||
| if (buffer == null || buffer.Length == 0 || minSize <= 0 || maxSize <= 0) |
Also if minSize > maxSize?
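A sketch of the fuller argument check suggested here, including the `minSize > maxSize` case. The helper name is hypothetical, and how `CreateSplitBuffer` reacts to invalid arguments is not visible in the diff excerpt, so this only expresses the validity condition itself.

```csharp
using System;

// Hypothetical helper, not part of the PR: the condition under which
// the split arguments are usable.
public static class SplitBufferArguments
{
    public static bool AreValid(byte[] buffer, int minSize, int maxSize)
    {
        return buffer != null
            && buffer.Length > 0
            && minSize > 0
            && maxSize > 0
            && minSize <= maxSize; // the extra case raised in the review
    }
}
```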
| var unread = reader.UnreadSegment; | ||
| if (Utf8Parser.TryParse(unread, out value, out int consumed)) | ||
| var unread = UnreadSegment; | ||
| if (Utf8Parser.TryParse(unread, out value, out int consumed) && consumed < unread.Length) |
Why do we need to compare consumed with unread.Length here? Isn't parsing bool known to be of constant length (i.e. either 4 or 5)? Shouldn't we return true if TryParse returns true, regardless?
For bool I can see dropping out. The others don't know if there is more valid data beyond a segment. I'll change this one and add a comment.
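The reasoning behind the `consumed < unread.Length` check can be illustrated with a small sketch. The helper name is hypothetical; `Utf8Parser.TryParse` is the real `System.Buffers.Text` API. If a number parse consumes the whole segment, more digits may follow in the next segment, so the fast path cannot trust the result.

```csharp
using System;
using System.Buffers.Text;

// Hypothetical helper, not the PR's code: accept a single-segment parse only
// when the parser stopped before the end of the segment.
public static class SegmentParseSketch
{
    public static bool TryParseWithinSegment(ReadOnlySpan<byte> segment, out int value)
    {
        return Utf8Parser.TryParse(segment, out value, out int consumed)
            && consumed < segment.Length;
    }
}
```

For a segment holding the bytes of `123,` the parse stops at the comma and the value is safe to return. For a segment holding only `123`, the next segment might begin with `45`, so the caller would fall back to a slower path that looks across segments.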
| { | ||
| var unread = reader.UnreadSegment; | ||
| if (Utf8Parser.TryParse(unread, out value, out int consumed)) | ||
| var unread = UnreadSegment; |
nit: avoid using var here.
I'll clean up the existing vars.
Can you add more detail/examples to "Inlined Struct wrapper .ctors are non-zero cost" https://github.com/dotnet/coreclr/issues/18542
When using can always add a non-branching byteswap after (using
Custom protocols; IPC and file formats aren't necessarily big endian (e.g. why convert to big endian then back to little endian when 90% of the machines are little endian these days)
on endianness - yah, I remember having lots of convos about whether it made sense to worry about endianness at this level, when you don't know the type (individual fields vs entire struct, etc)
But: endianness is important, and despite what @davidfowl says; plenty use "little" too :) but that's OK; I can use
SomeStruct val; // TryRead etc
if (!BitConverter.IsLittleEndian) val.Foo = BinaryPrimitives.ReverseEndianness(val.Foo);
which should inline / JIT-hoist nicely enough. It would be nice if there was a:
var x = BinaryPrimitives.InterpretAsLittleEndian(val);
var y = BinaryPrimitives.InterpretAsBigEndian(val);
but... I'll settle for what I have :) example impl, to show what I mean:
public static int InterpretAsLittleEndian(int val)
    => BitConverter.IsLittleEndian ? val : ReverseEndianness(val);
public static int InterpretAsBigEndian(int val)
    => BitConverter.IsLittleEndian ? ReverseEndianness(val) : val;
I can always add that kind of thing in local utility methods, of course.
There has been a solid push in low latency finance recently to move away from big-endian formats, because it's just a waste of cycles to have to reverse everything. So yes, little endian is important.
Ah,
@benaadams I didn't look at the resulting IL, I just measured. Here are the numbers I saw (before breaking the slow path out of the main method).
I've added some endian sample methods @mgravell, @davidfowl, @Drawaes, @benaadams Thoughts?
I like it 👍 |
Don't think unsigned are needed? Would rather keep signature count lower.
Still taking feedback, of course; merging in the current state.
| Debug.Assert(UnreadSegment.Length < sizeof(T)); | ||
| // Not enough data in the current segment, try to peek for the data we need. | ||
| byte* buffer = stackalloc byte[sizeof(T)]; |
Why isn't this directly using stackalloc Span?
Because it won't let you pass it to Peek (for fear of being captured as the reader is a ref struct). Readonly methods should address some of this. I'm discussing with @jaredpar etc. about the possibilities of adding a feature that allows methods to take spans and specify that they won't stash the incoming span argument.
Move helpers to instance methods, split out slow paths and inline fast paths where needed.
Add some perf tests.
Overhead from implicit struct copies, method calls was significant. The best pattern I can come up with is to keep the fast (single span) reads in tight methods with callouts for aggregation. Changed the pattern of peek to return a direct span rather than a copy when it isn't necessary.
Before:
After:
Roughly 3x / 10x improvement for binary reads. Parsing reads improve by 20% (split input) to 37% in the benchmarks I added.
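The fast-path / slow-path split described in this PR can be sketched roughly as below. This is not the PR's `BufferReader`: `TwoSegmentReader` and its members are hypothetical, and the reader is simplified to two segments. The point is only the shape, with a small inlinable method for the single-span case and a separate non-inlined method for aggregation across segments.

```csharp
using System;
using System.Runtime.CompilerServices;
using System.Runtime.InteropServices;

// Hypothetical two-segment reader used only to illustrate the pattern.
public ref struct TwoSegmentReader
{
    private ReadOnlySpan<byte> _current;
    private ReadOnlySpan<byte> _next;

    public TwoSegmentReader(ReadOnlySpan<byte> first, ReadOnlySpan<byte> second)
    {
        _current = first;
        _next = second;
    }

    // Fast path: the value fits in the current span, so read it in place.
    [MethodImpl(MethodImplOptions.AggressiveInlining)]
    public bool TryRead<T>(out T value) where T : unmanaged
    {
        int size = Unsafe.SizeOf<T>();
        if (_current.Length >= size)
        {
            value = MemoryMarshal.Read<T>(_current);
            _current = _current.Slice(size);
            return true;
        }
        return TryReadSlow(out value);
    }

    // Slow path: the value straddles the segment boundary, so copy the
    // pieces into a stack buffer and read from there.
    [MethodImpl(MethodImplOptions.NoInlining)]
    private bool TryReadSlow<T>(out T value) where T : unmanaged
    {
        int size = Unsafe.SizeOf<T>();
        int fromCurrent = _current.Length;
        int fromNext = size - fromCurrent;
        if (fromNext > _next.Length)
        {
            value = default;
            return false; // not enough data overall
        }
        Span<byte> temp = stackalloc byte[size];
        _current.CopyTo(temp);
        _next.Slice(0, fromNext).CopyTo(temp.Slice(fromCurrent));
        value = MemoryMarshal.Read<T>(temp);
        _current = _next.Slice(fromNext);
        _next = default;
        return true;
    }
}
```

Values are read in the machine's native byte order, matching the decision in this PR to drop endianness handling from the binary read.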