Skip to content
This repository was archived by the owner on Jan 23, 2023. It is now read-only.

Optimize ReadOnlySequence.First for the common case#33000

Merged
ahsonkhan merged 5 commits intodotnet:masterfrom
ahsonkhan:OptimizeSequenceFirst
Dec 22, 2018
Merged

Optimize ReadOnlySequence.First for the common case#33000
ahsonkhan merged 5 commits intodotnet:masterfrom
ahsonkhan:OptimizeSequenceFirst

Conversation

@ahsonkhan
Copy link
Copy Markdown

@ahsonkhan ahsonkhan commented Oct 24, 2018

Inspired from dotnet/coreclr#20386

Does it make sense to go down this path?

The bulk of the savings is coming from moving the MemoryManager code path out to a separate method to make the method that is getting inlined smaller. The remainder is coming from using unsafe casts and using a MemoryManagerHolder class (which means more allocations for that particular case).

Results from this benchmark:

public class Perf_ReadOnlySequence_First

image

Out-dated results

image

ThrowHelper.ThrowInvalidOperationException_EndPositionNotReached();

if (endIndex >= 0) // SequenceType.MultiSegment
if (typeof(T) == typeof(char) && startObject.GetType() == typeof(string))
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The other thing I was wondering was whether or not it would measure to have an extension method of GetFirstBuffer() public and have an explicit overload for byte? Or possibly have one that also returns the next position as well? E.g. anything that could improve the construction of a reader.

Comment thread src/System.Memory/src/System/Buffers/ReadOnlySequence.cs Outdated
@JeremyKuhne
Copy link
Copy Markdown
Member

@ahsonkhan, I'm going to explore another approach today as well. I'm going to explore adding more straight-forward state to the struct. I think it may pan out, especially given that there are free gaps on x64 given alignment requirements.

@jkotas- can you correct me if I'm wrong? Given that this is made up of two SequencePosition structs (which are object, int) it should occupy a full 32 bytes on x64, right (with 8 bytes unused)? If I stick another int in the struct between the two it should fit in?

The cost of accessing Memory from ReadOnlySequence is a major part of the overhead in trying to make a reader around it. From what @ahsonkhan is seeing here and what I've observed working on the reader it looks like most of this cost is the logic to extract out what the ROS contains. It shows up as 20-30% of the cost in some benchmarks, like parsing the request line on an http request. (These are things that impact benchmarking site metrics.)

@jkotas
Copy link
Copy Markdown
Member

jkotas commented Oct 24, 2018

If I stick another int in the struct between the two it should fit in?

You can try it. I do not know on top of my head what happens here.

@JeremyKuhne
Copy link
Copy Markdown
Member

You can try it. I do not know on top of my head what happens here.

Unfortunately it appears that you can't take advantage of the empty space. SequencePosition does take 16 bytes on x64 (8 + 4 + 4 padding). That means ReadOnlySequence is 32 bytes, but only uses 24. It appears that [StructLayout] is always ignored for non-blittable structs.

The best starting option (for my investigations) appears to be adding another int to SequencePosition. That has zero impact on x64 sizes for SequencePosition and ReadOnlySequence. For x86 it grows the size of SequencePosition from 8 to 12 bytes and the SequencePosition from 16 to 24.

@ahsonkhan ahsonkhan changed the title [WIP] Optimize ReadOnlySequence.First for the common case Optimize ReadOnlySequence.First for the common case Oct 24, 2018
@stephentoub
Copy link
Copy Markdown
Member

@ahsonkhan, what's the plan here? Close? Review further? Thanks.

@karelz
Copy link
Copy Markdown
Member

karelz commented Dec 19, 2018

@ahsonkhan ping?

@ahsonkhan
Copy link
Copy Markdown
Author

I resolved the merge conflict and the PR can now be merged, unless there is other feedback.

if (startIndex >= 0)
// A == 0 && B == 0 means SequenceType.MultiSegment
// Equivalent to startIndex >= 0 && endIndex >= 0
if ((startIndex >> 31) + (endIndex >> 31) == 0)
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you want to play cryptic bit tricks, this can be written as (startIndex | endIndex) >= 0 to save a few more instructions.

@karelz karelz added this to the 3.0 milestone Dec 21, 2018
@ahsonkhan ahsonkhan merged commit a7b5f8c into dotnet:master Dec 22, 2018
@ahsonkhan ahsonkhan deleted the OptimizeSequenceFirst branch December 22, 2018 00:00
@knocte
Copy link
Copy Markdown

knocte commented Dec 28, 2018

Sorry to jump in, but I couldn't avoid seeing a commit in this PR that is called Revert back to type-safe casts, and I have to say: it might have fixed an InvalidCastException issue I'm experiencing maybe? See https://github.com/dotnet/corefx/issues/32711 . Any ETA for a release with this? (However I still see more unsafe casts in the last version of ReadOnlySequence.Helpers.cs, see #L109) Merry xmas btw!

picenka21 pushed a commit to picenka21/runtime that referenced this pull request Feb 18, 2022
)

* Optimize ReadOnlySequence.First for the common case

* Add back bit mask flags

* Revert back to type-safe casts

* Address PR feedback.


Commit migrated from dotnet/corefx@a7b5f8c
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

7 participants