Add read ahead logic for streams. #38039
When reading a JSON object or array from a stream into an object we need to TrySkip to ensure that we have all the needed data for the JsonDocument to create a JsonElement. This is only necessary if we haven't already drained the stream.
We need to track consumed separately so we can requeue the reader properly after skipping. Add more stream tests.
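As a rough illustration of the read-ahead check described above, the sketch below uses the public `Utf8JsonReader.TrySkip` API on a buffer that is not the final block. The helper name `ValueIsFullyBuffered` and the sample payloads are hypothetical and are not taken from this PR; the serializer's internal logic may differ.

```csharp
using System;
using System.Text;
using System.Text.Json;

// Hypothetical helper, not code from the PR. It moves to the value of the first
// property, then reports whether that whole value is already in the buffer.
static bool ValueIsFullyBuffered(ReadOnlySpan<byte> buffer, bool isFinalBlock)
{
    var reader = new Utf8JsonReader(buffer, isFinalBlock, state: default);

    // Advance past StartObject and PropertyName to the first token of the value.
    for (int i = 0; i < 3; i++)
    {
        if (!reader.Read()) return false;
    }

    // With the final block in hand there is no more data to wait for.
    if (isFinalBlock) return true;

    // On a non-final block TrySkip returns false (and leaves the reader where it
    // was) when the end of the current object or array has not been buffered yet.
    return reader.TrySkip();
}

byte[] truncated = Encoding.UTF8.GetBytes("{\"a\":{\"b\":[1,2");
byte[] complete = Encoding.UTF8.GetBytes("{\"a\":{\"b\":[1,2,3]}}");

Console.WriteLine(ValueIsFullyBuffered(truncated, isFinalBlock: false)); // False
Console.WriteLine(ValueIsFullyBuffered(complete, isFinalBlock: false));  // True
```

When the check fails, the caller would read more bytes from the stream and try again before handing the value to JsonDocument.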
// If we haven't read in all the data we need to read ahead when reading a json object
// or array into a single .NET object so the JsonDocument has all of the needed data
// to parse.
options.ReadAhead = !isFinalBlock;
Ideally we only need to do this when we have a custom converter or JsonElement (including the overflow property bag). However, we can optimize that later.
The extra seeking is only needed when we don't have the final block and we're reading an object/array into a property value, which I think may not be particularly common. I think we'll have to be somewhat reactive here when we see real user scenarios that hit this. Not sure how efficiently we can evaluate the serialized object for "simpleness".
/// <summary>
/// Bytes consumed in the current loop
/// </summary>
public long BytesConsumed;
Use a more descriptive name, since it is only for the read-ahead. Something like ReadAheadBytesConsumed.
We always use it, even when we aren't reading ahead. We had to track it separately because of the Skip logic. I've started a discussion with @ahsonkhan about whether or not we can move this back to the reader as there is currently no way to peek ahead without separately tracking consumed.
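For context on why the consumed count matters, here is a minimal sketch of the general pattern for resuming a `Utf8JsonReader` across two buffers using `BytesConsumed` and `CurrentState`. It is not the serializer's internal code; the helper `ReadInTwoChunks` and the sample chunks are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Text;
using System.Text.Json;

// Hypothetical helper, not code from the PR: reads a payload that arrives in
// two chunks and returns the token types seen.
static List<JsonTokenType> ReadInTwoChunks(byte[] first, byte[] second)
{
    var tokens = new List<JsonTokenType>();

    // Not the final block: Read returns false when a token is incomplete.
    var reader = new Utf8JsonReader(first, isFinalBlock: false, state: default);
    while (reader.Read()) tokens.Add(reader.TokenType);

    // Bytes past this offset were not consumed and must be presented again.
    int consumed = (int)reader.BytesConsumed;
    JsonReaderState state = reader.CurrentState;

    // Next buffer = unconsumed tail of the first chunk + the second chunk.
    int leftover = first.Length - consumed;
    byte[] next = new byte[leftover + second.Length];
    first.AsSpan(consumed).CopyTo(next);
    second.AsSpan().CopyTo(next.AsSpan(leftover));

    // Resume from the saved state, now with the final block.
    reader = new Utf8JsonReader(next, isFinalBlock: true, state);
    while (reader.Read()) tokens.Add(reader.TokenType);

    return tokens;
}

List<JsonTokenType> seen = ReadInTwoChunks(
    Encoding.UTF8.GetBytes("{\"name\":\"val"),
    Encoding.UTF8.GetBytes("ue\",\"n\":12}"));
Console.WriteLine(string.Join(", ", seen));
```

If the consumed offset is wrong, the leftover bytes are either dropped or replayed, which is presumably why a failed skip needs its own bookkeeping.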
InlineData(10),
InlineData(20),
InlineData(1024)]
public void ReadJsonElementFromStream(int defaultBufferSize)
Did you verify this actually causes the new code to be hit? The issue is that although the buffer size is specified, we rent from the ArrayPool, which typically will have larger blocks (e.g. 4K).
Yes, it does fail without this. The smallest ArrayPool bucket is 16 bytes, so what we see here is 16, 32, and 1024 in reality.
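One way to check the bucket sizes mentioned above is to rent from `ArrayPool<byte>.Shared` and print the actual array lengths. This is a small sketch, assuming the test's buffers come from the shared pool; the helper `RentedLength` is hypothetical, and the lengths returned are an implementation detail of the pool that may differ between runtimes.

```csharp
using System;
using System.Buffers;

// Hypothetical helper, not code from the PR: rents a buffer of at least the
// requested size and reports how large the rented array actually is.
static int RentedLength(int requested)
{
    byte[] rented = ArrayPool<byte>.Shared.Rent(requested);
    try
    {
        return rented.Length;
    }
    finally
    {
        // Always hand the array back to the pool.
        ArrayPool<byte>.Shared.Return(rented);
    }
}

foreach (int requested in new[] { 10, 20, 1024 })
{
    Console.WriteLine($"requested {requested}, rented {RentedLength(requested)}");
}
```

Rent only guarantees an array at least as large as the request, so a test that depends on small buffers should confirm the effective sizes rather than assume them.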
/azp run corefx-ci
Azure Pipelines successfully started running 1 pipeline(s).
MacOS run is failing with #38001.
#38077 fixes the NETFX failure
ahsonkhan left a comment
Otherwise, LGTM. We should try to cover more edge cases in tests.
/azp run corefx-ci
Azure Pipelines successfully started running 1 pipeline(s).
* Add read ahead logic for streams. When reading a JSON object or array from a stream into an object we need to TrySkip to ensure that we have all the needed data for the JsonDocument to create a JsonElement. This is only necessary if we haven't already drained the stream.
* Track state correctly. We need to track consumed separately so we can requeue the reader properly after skipping. Add more stream tests.
* Clarify comments and other feedback.
* Fix comment

Commit migrated from dotnet/corefx@f6b010d