Skip to content
This repository was archived by the owner on Jan 23, 2023. It is now read-only.

Add read ahead logic for streams.#38039

Merged
JeremyKuhne merged 4 commits intodotnet:masterfrom
JeremyKuhne:readahead
May 31, 2019
Merged

Add read ahead logic for streams.#38039
JeremyKuhne merged 4 commits intodotnet:masterfrom
JeremyKuhne:readahead

Conversation

@JeremyKuhne
Copy link
Copy Markdown
Member

@JeremyKuhne JeremyKuhne commented May 30, 2019

When reading a json object or array from a stream into an object we need to TrySkip to ensure that we have all the needed data for the JsonDocument to create a JsonElement. This is only necessary if we haven't already drained the stream.

When reading a jaon object or array from a stream into an object we need to TrySkip to ensure that we have all the needed data for the JsonDocument to create a JsonElement. This is only necessary  if we haven't already drained the stream.
We need to track consumed separately so we can requeue the reader properly after skipping. Add more stream tests.
// If we haven't read in all the data we need to read ahead when reading a json object
// or array into a single .NET object so the JsonDocument has all of the needed data
// to parse.
options.ReadAhead = !isFinalBlock;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ideally we only need to do this when we have a custom converter or JsonElement (including the overflow property bag). However, we can optimize that later.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The extra seeking is only needed when we don't have the final block and we're reading an object/array into a property value, which I think may not be particularly common? I think we'll have to be somewhat reactionary here when we see real user scenarios that hit this. Not sure how efficiently we can evaluate the serialized object for "simpleness".

/// <summary>
/// Bytes consumed in the current loop
/// </summary>
public long BytesConsumed;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Something more descriptive since it is only for the read-ahead. Like ReadAheadBytesConsumed

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We always use it, even when we aren't reading ahead. We had to track it separately because of the Skip logic. I've started a discussion with @ahsonkhan about whether or not we can move this back to the reader as there is currently no way to peek ahead without separately tracking consumed.

InlineData(10),
InlineData(20),
InlineData(1024)]
public void ReadJsonElementFromStream(int defaultBufferSize)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did you verify this actually causes the new code to be hit? The issue is that although the buffer size is specified, we grab from the arraypool which typically will have larger blocks (e.g. 4K).

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it does fail without this. The smallest ArrayPool bucket is 16 bytes, so what we see here is 16, 32, and 1024 in reality.

@JeremyKuhne
Copy link
Copy Markdown
Member Author

/azp run corefx-ci

@azure-pipelines
Copy link
Copy Markdown

Azure Pipelines successfully started running 1 pipeline(s).

@JeremyKuhne
Copy link
Copy Markdown
Member Author

MacOS run is failing with #38001.

@JeremyKuhne
Copy link
Copy Markdown
Member Author

#38077 fixes the NETFX failure

Copy link
Copy Markdown

@ahsonkhan ahsonkhan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Otherwise, LGTM. We should try to cover more edge case in tests.

Comment thread src/System.Text.Json/tests/Serialization/JsonElementTests.cs
Comment thread src/System.Text.Json/tests/Serialization/SpanTests.cs
Comment thread src/System.Text.Json/tests/Serialization/SpanTests.cs
@JeremyKuhne
Copy link
Copy Markdown
Member Author

/azp run corefx-ci

@azure-pipelines
Copy link
Copy Markdown

Azure Pipelines successfully started running 1 pipeline(s).

@JeremyKuhne JeremyKuhne merged commit f6b010d into dotnet:master May 31, 2019
@JeremyKuhne JeremyKuhne deleted the readahead branch May 31, 2019 05:00
picenka21 pushed a commit to picenka21/runtime that referenced this pull request Feb 18, 2022
* Add read ahead logic for streams.

When reading a jaon object or array from a stream into an object we need to TrySkip to ensure that we have all the needed data for the JsonDocument to create a JsonElement. This is only necessary  if we haven't already drained the stream.

* Track state correctly

We need to track consumed separately so we can requeue the reader properly after skipping. Add more stream tests.

* Clarify comments and other feedback.

* Fix comment


Commit migrated from dotnet/corefx@f6b010d
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants