Implement GetComment API #35705
|
@dotnet-bot test corefx-ci (Windows NETFX_x86_Release)
|
@dotnet-bot test corefx-ci (Windows NETFX_x86_Release) please
|
@WinCPP we're now on Azure Pipelines; check here for the new commands: https://github.com/dotnet/corefx/blob/master/Documentation/project-docs/pullrequest-builds.md
|
@MarcoRossignoli thank you for the guidance. For the last several months I was under the ice, à la Captain America, when this change happened; I have only just resumed contributing. I also forgot that this PR depends on another one, #33393, for successful builds, which I had noted in the description as well 🤦‍♂️, so I have to wait. But it is good that I hit this issue and learnt the new process. Thanks again!
}

/// <summary>
/// Reads value of the next JSON token for a comment from the source transcoded as a <see cref="string"/>,
Comment could be clearer: Maybe something like:
Reads the next JSON token value from the source as a comment, transcoded as a <see cref="string"/>, preserving the comment delimiters.
Based on #35705 (comment), let's not preserve the comment delimiters.
Would it make sense to strip the comment delimiters (//, /* and */) and the line separators (\r, \r\n and \n) during parsing, where the ValueSpan or ValueSequence for the JsonTokenType.Comment is prepared? In that case, should I make the changes in PR #33393 or in this PR? If they are to be done in this PR, it would help to close out #33393 early so that this code can be rebased to include the \r\n work.
Edit 1
Hmm... I see that ConsumeString already excludes the quotes, so likewise it would be more logical to implement the functionality mentioned above...
Edit 2
Query: IIUC this line in JsonReaderHelper.CountNewLines needs to handle \r on its own as a line separator, and also ensure that \r\n are handled properly... ?
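The stripping proposed above could look roughly like the following. This is only a sketch under stated assumptions: `StripCommentDelimiters` is a hypothetical helper, not the code in Utf8JsonReader, and it assumes the input is the raw UTF-8 of exactly one well-formed comment with its delimiters still attached.

```csharp
using System;

static class CommentSketch
{
    // Hypothetical helper (not part of System.Text.Json): given the raw bytes of one
    // well-formed comment token, delimiters included, return only the comment body.
    public static ReadOnlySpan<byte> StripCommentDelimiters(ReadOnlySpan<byte> raw)
    {
        // Too short to carry any delimiter; return unchanged.
        if (raw.Length < 2)
            return raw;

        if (raw.Length >= 4 && raw[0] == (byte)'/' && raw[1] == (byte)'*')
        {
            // Multi-line comment: drop the leading "/*" and the trailing "*/".
            return raw.Slice(2, raw.Length - 4);
        }

        // Single-line comment: drop the leading "//" ...
        ReadOnlySpan<byte> body = raw.Slice(2);

        // ... and a trailing line separator, which may be \n, \r, or \r\n.
        if (body.Length > 0 && body[body.Length - 1] == (byte)'\n')
            body = body.Slice(0, body.Length - 1);
        if (body.Length > 0 && body[body.Length - 1] == (byte)'\r')
            body = body.Slice(0, body.Length - 1);

        return body;
    }
}
```

For instance, passing the UTF-8 bytes of `"// hello\r\n"` would leave only the bytes of `" hello"`. Returning a slice rather than a copy mirrors the way ValueSpan exposes a window over the original buffer.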
Would it make sense to strip the comment delimiters (//, /* and */) and the line separators (\r, \r\n and \n) during parsing where the ValueSpan or ValueSequence for the JsonTokenType.Comment is prepared?
Yes, it would.
In that case, should I do the changes in PR #33393 or should I do them in this PR?
Let's stage it in smaller, clearer chunks, where the change is isolated. So, in this order:
- #33393 (Utf8JsonReader.cs - Add support for single line comments ending on \r, \r\n)
- Open up a new PR - Change ValueSpan/ValueSequence to exclude the comment delimiters, effectively undoing parts of 4a2aa63 from #33216 (Add JsonUtf8Reader (for ReadOnlySpan) along with unit tests)
- This PR (Implement GetComment API #35705) to expose new GetComment API
What do you think of this approach?
Query: IIUC this line in JsonReaderHelper.CountNewLines needs to handle \r on its own as a line separator, and also ensure that \r\n are handled properly... ?
Unless this is blocking some of the work above, I would file a new issue for this and investigate/fix separately. Some questions we likely want to resolve first: What does Json.NET return for line number when comments end with just \r (for Windows vs Unix)? Do we double count line number \r\n or treat it as a single line increment?
cc @JamesNK
Let's stage it in smaller, clearer chunks, where the change is isolated. So, in this order:
- #33393 (Utf8JsonReader.cs - Add support for single line comments ending on \r, \r\n)
- Open up a new PR - Change ValueSpan/ValueSequence to exclude the comment delimiters, effectively undoing parts of 4a2aa63 from #33216 (Add JsonUtf8Reader (for ReadOnlySpan) along with unit tests)
- This PR (Implement GetComment API #35705) to expose new GetComment API
What do you think of this approach?
Sounds like a plan! Thanks! I am prioritizing #33393 to finish step 1; once it is merged into master, I will pull those changes into step 2. So I will put this PR (and the new one) on hold for a bit. Kindly see the updates on #33393.
Do we double count line number \r\n or treat it as a single line increment?
Json.NET treats it as a single line increment
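A minimal sketch of the single-increment behavior described above, for illustration only. `CountLineBreaks` is a hypothetical helper and not the existing JsonReaderHelper.CountNewLines; treating a lone \r as a line break is an assumption here, since that question is still open in this thread.

```csharp
using System;

static class LineCountSketch
{
    // Hypothetical helper: counts line breaks in UTF-8 data, treating "\r\n" as a
    // single break. Counting a lone '\r' as a break is an assumption of this sketch.
    public static int CountLineBreaks(ReadOnlySpan<byte> data)
    {
        int count = 0;
        for (int i = 0; i < data.Length; i++)
        {
            if (data[i] == (byte)'\n')
            {
                count++;
            }
            else if (data[i] == (byte)'\r')
            {
                count++;
                // Skip the '\n' of a "\r\n" pair so the pair is not counted twice.
                if (i + 1 < data.Length && data[i + 1] == (byte)'\n')
                    i++;
            }
        }
        return count;
    }
}
```

With this approach the bytes of `"a\r\nb\rc\nd"` contain three line breaks: one for the `\r\n` pair, one for the lone `\r`, and one for the lone `\n`.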
|
I think this PR is light on tests. This is a Json.NET unit test for single line comments: It tests reading them in every place they could appear in JSON.
|
@JamesNK, this PR adds a new API to fetch comments and is not really related to carriage return handling across JSON. I pointed it out only because I saw that we are not incrementing the line count. I agree we need more tests, but could that be a separate exercise? Or rather, part of #33393, which is specifically for single line comments. Please refer to the tests over there.
|
I'm concerned in general about how robust the reader's support of comments is. @ahsonkhan would the reader successfully parse https://github.com/JamesNK/Newtonsoft.Json/blob/7217c484e9705b5e76585c8b7fcd489c8e021c23/Src/Newtonsoft.Json.Tests/JsonTextReaderTests/MiscTests.cs#L784-L919 ?
We should try and see (and add a test like that to our test bed). It might uncover some bugs around corner cases. Let me take a look.
|
Excluding some of the things that Json.NET supports, the reader seems to parse the comments fine.
|
What are next steps for this PR? Are more changes needed?
|
Also as per discussion with @ahsonkhan in this comment, #35934 (step 2 in the comment) needs to go in so that I can rebase this PR and do some changes in comments related tests.
|
@karelz @ahsonkhan
Next steps:
// current usage
string commentText = Encoding.UTF8.GetString(json.HasValueSequence ? json.ValueSequence.ToArray() : json.ValueSpan.ToArray());
// instead do the following
string commentText = json.GetComment();
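To put the one-line usage above in context, here is a hedged, self-contained sketch of reading the first comment from a JSON payload. `ReadFirstComment` is a hypothetical wrapper written for this illustration. It uses the `Utf8JsonReader(ReadOnlySpan<byte>, JsonReaderOptions)` constructor from the shipped System.Text.Json package, which differs from the `isFinalBlock`/`JsonReaderState` constructor used in the tests of this PR.

```csharp
using System;
using System.Text;
using System.Text.Json;

static class GetCommentSketch
{
    // Hypothetical wrapper: returns the text of the first comment token in the JSON.
    public static string ReadFirstComment(string jsonText)
    {
        byte[] utf8 = Encoding.UTF8.GetBytes(jsonText);

        // Comments are rejected by default, so they must be allowed explicitly.
        var options = new JsonReaderOptions { CommentHandling = JsonCommentHandling.Allow };
        var reader = new Utf8JsonReader(utf8, options);

        while (reader.Read())
        {
            if (reader.TokenType == JsonTokenType.Comment)
            {
                // Transcodes the comment value from UTF-8 to a string.
                return reader.GetComment();
            }
        }

        throw new InvalidOperationException("The JSON payload contains no comment.");
    }
}
```

The loop is needed only because the comment may sit anywhere in the payload; when the input is known to be a single comment token, one `Read()` call followed by `GetComment()` is enough.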
|
Marking as blocked on #35934 then.
|
@ahsonkhan I rebased this PR on master in which #35934 is now available. There were very few places in existing test cases where The build is however failing. Not sure what to do. Thanks! |
Do you mean locally or the CI leg? That looks unrelated, likely something to do with the Ubuntu.1604.Arm64 machine/infra. For example:
// single line comments
foreach (var raw in rawComments)
{
    var str = string.Format(raw, "");
Here, each line in rawComments works as a format string. The {0} in each is replaced with "", which yields a single line comment since it is prefixed with // in the loop.
In the loop at 3378, by contrast, the {0} is replaced with \n, and the resulting string, when enclosed in /* and */, makes a multi-line comment...
Of course, I could have left the {0} as it is... but I thought this would be cleaner...
var state = new JsonReaderState(options: new JsonReaderOptions { CommentHandling = JsonCommentHandling.Allow });
var reader = new Utf8JsonReader(dataUtf8, isFinalBlock: true, state);
bool commentFound = false;
while (reader.Read())
This test could be simplified since it is only a single token within the json.
Assert.True(reader.Read());
Assert.Equal(JsonTokenType.Comment, reader.TokenType);
string comment = reader.GetComment();
Assert.Equal(expected, comment);
Assert.NotEqual(Regex.Unescape(expected), comment);
* CoreFx dotnet/corefx#33347: Implement GetComment API
* Generate References as per correct steps
* Exclude delims & line separators, new unescape test
* Revert consume comment changes, test case fixes
* Partial revert of previous incorrect online merge
* fixed another online merge conflict resolution slippage
* Intermediate rebase from master
* Reworked files
* Reworked files 2
* Fix coding style as per review comments
Commit migrated from dotnet/corefx@53aaaf3
Implement GetComment API with test cases.
Fixes #33347.
Please note that this depends on PR #33393 for successful handling of single line comment with different line terminators.
@ahsonkhan @terrajobst please review.