Skip to content

Conversation

@devsko
Copy link
Contributor

@devsko devsko commented Sep 17, 2020

See #42158
comment and comment and comment

Test only

Test several scenarios in System.Text.Json where deserialization has to continue after the next chunk of data is available.

- Test continuation at every position inside the tested object
- Many member with primitive and nullable types
- One more level of nested object
- All combinations of class/struct for tested and nested object
- tested and nested object with parametrized ctor for some properties
Tweak the payload and expect `JsonException`
@devsko
Copy link
Contributor Author

devsko commented Sep 17, 2020

@layomia This is not yet ready for review. I hope to get done what I have in mind tomorrow.

@devsko devsko marked this pull request as ready for review September 17, 2020 21:39
@devsko
Copy link
Contributor Author

devsko commented Sep 17, 2020

Is there anything else about continuation / chunked buffer?

@ahsonkhan
Copy link
Contributor

ahsonkhan commented Sep 17, 2020

Is there anything else about continuation / chunked buffer?

Since you asked, how about continuation where the token being split isn't null but other types, like whitespace (\r\n), or true/false boolean, or some large string token, or a number?

In the Utf8JsonReader specific tests, a lot of those scenarios are covered, by building ROSequence with segments splits across a whole variety of locations within the JSON, and with partial data/state to test re-entrancy. But most of those are for relatively small payloads (due to test time), and the async Deserializer API for streams could benefit from that type of extensive coverage too :)

// TestCaseType is only used to give the json strings a descriptive name.
[Theory]
// Skipping large JSON since slicing them (O(n^2)) is too slow.
[MemberData(nameof(SmallTestCases))]
public static void TestJsonReaderUtf8SegmentSizeOne(bool compactData, TestCaseType type, string jsonString)
{
ReadPartialSegmentSizeOne(compactData, type, jsonString);
}
// TestCaseType is only used to give the json strings a descriptive name.
[Theory]
[MemberData(nameof(LargeTestCases))]
public static void TestJsonReaderLargeUtf8SegmentSizeOne(bool compactData, TestCaseType type, string jsonString)
{
// Skipping really large JSON on Browser to prevent OOM
if (PlatformDetection.IsBrowser && (type == TestCaseType.Json40KB || type == TestCaseType.Json400KB || type == TestCaseType.ProjectLockJson))
{
return;
}
ReadFullySegmentSizeOne(compactData, type, jsonString);
}
// TestCaseType is only used to give the json strings a descriptive name.
[Theory]
[OuterLoop]
[MemberData(nameof(LargeTestCases))]
public static void TestJsonReaderLargestUtf8SegmentSizeOne(bool compactData, TestCaseType type, string jsonString)
{
// Skipping really large JSON since slicing them (O(n^2)) is too slow.
if (type == TestCaseType.Json40KB || type == TestCaseType.Json400KB || type == TestCaseType.ProjectLockJson)
{
return;
}
ReadPartialSegmentSizeOne(compactData, type, jsonString);
}
private static void ReadPartialSegmentSizeOne(bool compactData, TestCaseType type, string jsonString)
{
// Remove all formatting/indendation
if (compactData)
{
jsonString = JsonTestHelper.GetCompactString(jsonString);
}
byte[] dataUtf8 = Encoding.UTF8.GetBytes(jsonString);
Stream stream = new MemoryStream(dataUtf8);
TextReader reader = new StreamReader(stream, Encoding.UTF8, false, 1024, true);
string expectedStr = JsonTestHelper.NewtonsoftReturnStringHelper(reader);
ReadOnlySequence<byte> sequence = JsonTestHelper.GetSequence(dataUtf8, 1);
for (int j = 0; j < dataUtf8.Length; j++)
{
var utf8JsonReader = new Utf8JsonReader(sequence.Slice(0, j), isFinalBlock: false, default);
byte[] resultSequence = JsonTestHelper.ReaderLoop(dataUtf8.Length, out int length, ref utf8JsonReader);
string actualStrSequence = Encoding.UTF8.GetString(resultSequence, 0, length);
long consumed = utf8JsonReader.BytesConsumed;
utf8JsonReader = new Utf8JsonReader(sequence.Slice(consumed), isFinalBlock: true, utf8JsonReader.CurrentState);
resultSequence = JsonTestHelper.ReaderLoop(dataUtf8.Length, out length, ref utf8JsonReader);
actualStrSequence += Encoding.UTF8.GetString(resultSequence, 0, length);
string message = $"Expected consumed: {dataUtf8.Length - consumed}, Actual consumed: {utf8JsonReader.BytesConsumed}, Index: {j}";
Assert.True(dataUtf8.Length - consumed == utf8JsonReader.BytesConsumed, message);
Assert.Equal(expectedStr, actualStrSequence);
}
}
private static void ReadFullySegmentSizeOne(bool compactData, TestCaseType type, string jsonString)
{
// Remove all formatting/indendation
if (compactData)
{
jsonString = JsonTestHelper.GetCompactString(jsonString);
}
byte[] dataUtf8 = Encoding.UTF8.GetBytes(jsonString);
Stream stream = new MemoryStream(dataUtf8);
TextReader reader = new StreamReader(stream, Encoding.UTF8, false, 1024, true);
string expectedStr = JsonTestHelper.NewtonsoftReturnStringHelper(reader);
ReadOnlySequence<byte> sequence = JsonTestHelper.GetSequence(dataUtf8, 1);
var utf8JsonReader = new Utf8JsonReader(sequence, isFinalBlock: true, default);
byte[] resultSequence = JsonTestHelper.ReaderLoop(dataUtf8.Length, out int length, ref utf8JsonReader);
string actualStrSequence = Encoding.UTF8.GetString(resultSequence, 0, length);
Assert.Equal(expectedStr, actualStrSequence);
}
[Theory]
[MemberData(nameof(SmallTestCases))]
public static void TestPartialJsonReaderMultiSegment(bool compactData, TestCaseType type, string jsonString)
{
_ = type;
// Remove all formatting/indendation
if (compactData)
{
jsonString = JsonTestHelper.GetCompactString(jsonString);
}
byte[] dataUtf8 = Encoding.UTF8.GetBytes(jsonString);
ReadOnlyMemory<byte> dataMemory = dataUtf8;
List<ReadOnlySequence<byte>> sequences = JsonTestHelper.GetSequences(dataMemory);
for (int i = 0; i < sequences.Count; i++)
{
ReadOnlySequence<byte> sequence = sequences[i];
var json = new Utf8JsonReader(sequence, isFinalBlock: true, default);
while (json.Read())
;
Assert.Equal(sequence.Length, json.BytesConsumed);
Assert.True(sequence.Slice(json.Position).IsEmpty);
}
for (int i = 0; i < sequences.Count; i++)
{
ReadOnlySequence<byte> sequence = sequences[i];
var json = new Utf8JsonReader(sequence);
while (json.Read())
;
Assert.Equal(sequence.Length, json.BytesConsumed);
Assert.True(sequence.Slice(json.Position).IsEmpty);
}
}
[Theory]
[OuterLoop]
[MemberData(nameof(SmallTestCases))]
public static void TestPartialJsonReaderSlicesMultiSegment(bool compactData, TestCaseType type, string jsonString)
{
_ = type;
// Remove all formatting/indendation
if (compactData)
{
jsonString = JsonTestHelper.GetCompactString(jsonString);
}
byte[] dataUtf8 = Encoding.UTF8.GetBytes(jsonString);
ReadOnlyMemory<byte> dataMemory = dataUtf8;
List<ReadOnlySequence<byte>> sequences = JsonTestHelper.GetSequences(dataMemory);
for (int i = 0; i < sequences.Count; i++)
{
ReadOnlySequence<byte> sequence = sequences[i];
for (int j = 0; j < dataUtf8.Length; j++)
{
var json = new Utf8JsonReader(sequence.Slice(0, j), isFinalBlock: false, default);
while (json.Read())
;
long consumed = json.BytesConsumed;
JsonReaderState jsonState = json.CurrentState;
byte[] consumedArray = sequence.Slice(0, consumed).ToArray();
Assert.Equal(consumedArray, sequence.Slice(0, json.Position).ToArray());
json = new Utf8JsonReader(sequence.Slice(consumed), isFinalBlock: true, jsonState);
while (json.Read())
;
Assert.Equal(dataUtf8.Length - consumed, json.BytesConsumed);
}
}
}

@devsko
Copy link
Contributor Author

devsko commented Sep 18, 2020

how about continuation where the token being split isn't null but other types, like whitespace (\r\n), or true/false boolean, or some large string token, or a number?

All tests here split the payload once on every single character. Thus all tokens are tested how they work when split into 2 chunks including all mentioned examples except whitespaces. I will add them by enabling WriteIndented

Added dictionary test
@devsko
Copy link
Contributor Author

devsko commented Sep 18, 2020

I'd say that's it. Thanks for your suggestions, help and reviews. Really appreciated. Feel free to change whatever you want - or wait 2 weeks. See you - peace

@stephentoub
Copy link
Member

@devsko, thanks for your efforts here. Are you still working on this?

Copy link
Contributor

@layomia layomia left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM - @devsko we can merge this once conflicts and #42393 (comment) are resolved.

@layomia layomia added the NO-MERGE The PR is not ready for merge yet (see discussion for detailed reasons) label Oct 29, 2020
@layomia layomia self-assigned this Nov 2, 2020
@layomia layomia removed the NO-MERGE The PR is not ready for merge yet (see discussion for detailed reasons) label Nov 2, 2020
@layomia
Copy link
Contributor

layomia commented Nov 2, 2020

I pushed a commit to finish this PR.

@layomia layomia merged commit e691753 into dotnet:master Nov 2, 2020
@ghost ghost locked as resolved and limited conversation to collaborators Dec 7, 2020
@devsko devsko deleted the streamtest branch March 5, 2021 18:11
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants