Remove unnecessary logic to determine E format when parsing floating... #42298
…point values on Utf8JsonReader and JsonDocument
This change makes sense.
There is a subtle difference. The only time we'd use the 'e'/'E' standard format is if we require the number to be in scientific notation form: Default, G, and e/E all allow exponents, but e/E is the only one that requires them. Looking back even to 2.1 (when these APIs were originally introduced), this behavior hasn't changed.
One of the benefits of keeping the number format field is that it acts as validation that the data being passed through the layers is correct (i.e. the token that the reader processed is what is being passed to the parser). If, while reading the JSON, we found the number was in scientific notation, then it is useful to pass 'e' to the parser.
However, given the extensive set of tests we already have, and since reducing the state that needs to be tracked simplifies the logic, it is worth it to remove that field.
// A regular number like 123 isn't valid, it needs to be in scientific notation, like 123e0
// This validation is not required when it comes to JSON.
Console.WriteLine(Utf8Parser.TryParse(Encoding.UTF8.GetBytes("123"), out float val, out _, 'e')); // false
Console.WriteLine(val); // 0
Console.WriteLine(Utf8Parser.TryParse(Encoding.UTF8.GetBytes("123"), out val, out _, 'E')); // false
Console.WriteLine(val); // 0
// Currently, casing doesn't matter, since 'e' and 'E' both allow 123e1 and 123E1.
Console.WriteLine(Utf8Parser.TryParse(Encoding.UTF8.GetBytes("123e1"), out val, out _, 'e')); // true
Console.WriteLine(val); // 1230
Console.WriteLine(Utf8Parser.TryParse(Encoding.UTF8.GetBytes("123E1"), out val, out _, 'e')); // true
Console.WriteLine(val); // 1230
Console.WriteLine(Utf8Parser.TryParse(Encoding.UTF8.GetBytes("123e1"), out val, out _, 'E')); // true
Console.WriteLine(val); // 1230
Console.WriteLine(Utf8Parser.TryParse(Encoding.UTF8.GetBytes("123E1"), out val, out _, 'E')); // true
Console.WriteLine(val); // 1230
// Without a standard format (the default), exponent inputs parse identically, and a plain 123 is also accepted.
Console.WriteLine(Utf8Parser.TryParse(Encoding.UTF8.GetBytes("123"), out val, out _)); // true
Console.WriteLine(val); // 123
Console.WriteLine(Utf8Parser.TryParse(Encoding.UTF8.GetBytes("123e1"), out val, out _)); // true
Console.WriteLine(val); // 1230
Console.WriteLine(Utf8Parser.TryParse(Encoding.UTF8.GetBytes("123E1"), out val, out _)); // true
Console.WriteLine(val); // 1230
For some context, the use of this field was introduced in the following PR, to avoid having to loop through and check if the number span contained an 'e': dotnet/corefx#34386, which, as you point out, wasn't necessary to begin with.
ahsonkhan
left a comment
Nice!
if (Utf8Parser.TryParse(span, out decimal tmp, out int bytesConsumed, _numberFormat)
[MethodImpl(MethodImplOptions.AggressiveInlining)]
internal bool TryGetDecimalCore(out decimal value, ReadOnlySpan<byte> span)
nit: the out parameter is normally last or towards the end.
All other TryGet*Core methods have this same parameter ordering, and since this is internal I think there is not much value in changing the ordering here; perhaps that can be done in a clean-up PR later.
... point values on Utf8JsonReader and JsonDocument.
The internal field Utf8JsonReader._numberFormat is set to 'e' only if the JSON number that was read was in scientific notation format, and it is then passed as the standardFormat argument to the Utf8Parser.TryParse method.
There is no difference from parsing using the default format, except for the following condition, which is guaranteed to always evaluate as false when called from Utf8JsonReader:
runtime/src/libraries/System.Private.CoreLib/src/System/Buffers/Text/Utf8Parser/Utf8Parser.Float.cs
Lines 104 to 108 in 33dba95
With that said, the _numberFormat field and all the related logic can be safely removed.
This change was motivated by #39363 (comment).