Add read ahead logic for streams. by JeremyKuhne · Pull Request #38039 · dotnet/corefx

JeremyKuhne · 2019-05-30T00:00:16Z

When reading a JSON object or array from a stream into an object, we need to call TrySkip to ensure that we have all the data the JsonDocument needs to create a JsonElement. This is only necessary if we haven't already drained the stream.

We need to track consumed separately so we can requeue the reader properly after skipping. Add more stream tests.
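As an illustration only (this is not the code from the PR), the read-ahead check described above can be sketched with the public `Utf8JsonReader` API. The class and method names `ReadAheadSketch` and `HasCompleteValue` are hypothetical; the sketch assumes the buffer starts at the first token of the value and that a fresh reader state is acceptable.

```csharp
using System;
using System.Text;
using System.Text.Json;

public static class ReadAheadSketch
{
    // Hypothetical helper: returns true when `buffer` already contains the
    // whole JSON value that starts at its first token.
    public static bool HasCompleteValue(ReadOnlySpan<byte> buffer, bool isFinalBlock)
    {
        // With isFinalBlock == false the reader reports "need more data"
        // by returning false instead of throwing on truncated input.
        var reader = new Utf8JsonReader(buffer, isFinalBlock, state: default);

        // Read the first token (for example StartObject or StartArray).
        if (!reader.Read())
        {
            return false;
        }

        // TrySkip walks to the matching end token. If the data runs out
        // first, it returns false and the caller should read more bytes.
        return reader.TrySkip();
    }
}
```

A caller would refill its buffer and retry whenever this returns false, and only then hand the complete span to `JsonDocument`.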

steveharter · 2019-05-30T22:00:59Z

+            // If we haven't read in all the data we need to read ahead when reading a json object
+            // or array into a single .NET object so the JsonDocument has all of the needed data
+            // to parse.
+            options.ReadAhead = !isFinalBlock;


Ideally we only need to do this when we have a custom converter or JsonElement (including the overflow property bag). However, we can optimize that later.

The extra seeking is only needed when we don't have the final block and we're reading an object/array into a property value, which I think may not be particularly common. I think we'll have to be somewhat reactive here and respond when we see real user scenarios that hit this. I'm not sure how efficiently we can evaluate the serialized object for "simpleness".

steveharter · 2019-05-30T22:03:21Z

+        /// <summary>
+        /// Bytes consumed in the current loop
+        /// </summary>
+        public long BytesConsumed;


Use something more descriptive, since it is only for the read-ahead, e.g. ReadAheadBytesConsumed.

We always use it, even when we aren't reading ahead. We had to track it separately because of the Skip logic. I've started a discussion with @ahsonkhan about whether or not we can move this back to the reader as there is currently no way to peek ahead without separately tracking consumed.
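To illustrate why the consumed count matters across buffer refills, here is a minimal sketch, assuming a caller-managed buffer rather than the serializer's internal `ReadStack`. `ConsumedTrackingSketch` and `CountTokens` are hypothetical names; `BytesConsumed` here is the reader's own property, not the field added in this PR.

```csharp
using System;
using System.IO;
using System.Text;
using System.Text.Json;

public static class ConsumedTrackingSketch
{
    // Hypothetical helper: counts JSON tokens in a stream using a small buffer.
    // Assumes bufferSize > 0.
    public static int CountTokens(Stream stream, int bufferSize)
    {
        byte[] buffer = new byte[bufferSize];
        int bytesInBuffer = 0;
        int tokens = 0;
        bool isFinalBlock = false;
        JsonReaderState state = default;

        while (!isFinalBlock)
        {
            // If nothing could be consumed from a full buffer, grow it so
            // the next token has room to fit.
            if (bytesInBuffer == buffer.Length)
            {
                Array.Resize(ref buffer, buffer.Length * 2);
            }

            int read = stream.Read(buffer, bytesInBuffer, buffer.Length - bytesInBuffer);
            isFinalBlock = read == 0;
            bytesInBuffer += read;

            var reader = new Utf8JsonReader(buffer.AsSpan(0, bytesInBuffer), isFinalBlock, state);
            while (reader.Read())
            {
                tokens++;
            }

            // Only the bytes the reader actually consumed may be discarded;
            // the unread tail is moved to the front for the next pass.
            int consumed = (int)reader.BytesConsumed;
            state = reader.CurrentState;
            buffer.AsSpan(consumed, bytesInBuffer - consumed).CopyTo(buffer);
            bytesInBuffer -= consumed;
        }

        return tokens;
    }
}
```

A skip or read-ahead that advances past the current position needs the same bookkeeping, which is why the count has to be available outside the reader in the scenario discussed above.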

steveharter · 2019-05-30T22:10:11Z

+            InlineData(10),
+            InlineData(20),
+            InlineData(1024)]
+        public void ReadJsonElementFromStream(int defaultBufferSize)


Did you verify this actually causes the new code to be hit? The issue is that although the buffer size is specified, we grab from the ArrayPool, which typically will have larger blocks (e.g. 4K).

Yes, it does fail without this. The smallest ArrayPool bucket is 16 bytes, so what we see here is 16, 32, and 1024 in reality.
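The rounding mentioned above can be observed with a small sketch. `PoolSizeSketch` and `RentedLength` are hypothetical names; the exact sizes returned by `ArrayPool<byte>.Shared` are an implementation detail, so the only guarantee assumed here is that the rented array is at least as long as requested.

```csharp
using System;
using System.Buffers;

public static class PoolSizeSketch
{
    // Hypothetical helper: rents a buffer of at least `requested` bytes and
    // reports the length actually handed out by the shared pool.
    public static int RentedLength(int requested)
    {
        byte[] buffer = ArrayPool<byte>.Shared.Rent(requested);
        try
        {
            return buffer.Length;
        }
        finally
        {
            // Always return rented arrays so the pool can reuse them.
            ArrayPool<byte>.Shared.Return(buffer);
        }
    }

    public static void PrintSizes()
    {
        // Same requested sizes as the InlineData values in the test.
        foreach (int size in new[] { 10, 20, 1024 })
        {
            Console.WriteLine($"requested {size}, got {RentedLength(size)}");
        }
    }
}
```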

JeremyKuhne · 2019-05-30T22:42:03Z

/azp run corefx-ci

azure-pipelines · 2019-05-30T22:42:23Z

Azure Pipelines successfully started running 1 pipeline(s).

JeremyKuhne · 2019-05-30T23:13:38Z

The macOS run is failing with #38001.

JeremyKuhne · 2019-05-30T23:31:11Z

#38077 fixes the NETFX failure

ahsonkhan

Otherwise, LGTM. We should try to cover more edge cases in tests.

JeremyKuhne · 2019-05-31T02:54:31Z

/azp run corefx-ci

azure-pipelines · 2019-05-31T02:54:47Z

Azure Pipelines successfully started running 1 pipeline(s).

* Add read ahead logic for streams.
  When reading a JSON object or array from a stream into an object we need to TrySkip to ensure that we have all the needed data for the JsonDocument to create a JsonElement. This is only necessary if we haven't already drained the stream.
* Track state correctly
  We need to track consumed separately so we can requeue the reader properly after skipping. Add more stream tests.
* Clarify comments and other feedback.
* Fix comment

Commit migrated from dotnet/corefx@f6b010d

JeremyKuhne added the area-System.Text.Json label May 30, 2019

JeremyKuhne added this to the 3.0 milestone May 30, 2019

JeremyKuhne requested review from ahsonkhan and steveharter May 30, 2019 00:00

ahsonkhan assigned JeremyKuhne May 30, 2019

JeremyKuhne force-pushed the readahead branch from 51623fa to 536f056 Compare May 30, 2019 21:42

JeremyKuhne added 2 commits May 30, 2019 14:50

Add read ahead logic for streams.

4f763ec

When reading a JSON object or array from a stream into an object we need to TrySkip to ensure that we have all the needed data for the JsonDocument to create a JsonElement. This is only necessary if we haven't already drained the stream.

Track state correctly

982493d

We need to track consumed separately so we can requeue the reader properly after skipping. Add more stream tests.

JeremyKuhne force-pushed the readahead branch from 536f056 to 982493d Compare May 30, 2019 21:51

steveharter reviewed May 30, 2019

ahsonkhan approved these changes May 31, 2019

Clarify comments and other feedback.

4375aa3

ahsonkhan reviewed May 31, 2019

Comment thread src/System.Text.Json/src/System/Text/Json/Serialization/JsonSerializer.Read.Stream.cs Outdated

Fix comment

a975ad4

JeremyKuhne merged commit f6b010d into dotnet:master May 31, 2019

JeremyKuhne deleted the readahead branch May 31, 2019 05:00

3 participants
