Improve Enum.TryParse performance. Fixes #70 #81
|
Thanks for the contribution. Since this code is shared with desktop we need to make sure we don't break compatibility with it. Cases where the behavior is different before/after the change are considered breaking changes and need to be mitigated. There are a couple of ways we mitigate breaking changes:
For this specific case, you can probably special case the situation where the enum backing type is smaller than the value that you need to assign to it and in that case have the old behavior. |
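The special case suggested above could be sketched roughly as follows. This is only an illustration of the idea, not the code from the PR: SmallEnum and FitsUnderlyingType are hypothetical names, and the sketch assumes the parsed number is already available as a long.

```csharp
using System;

// Hypothetical byte-backed enum, used only for illustration.
public enum SmallEnum : byte { A = 0, B = 1 }

public static class EnumRangeSketch
{
    // Hypothetical helper: true when 'value' fits the enum's underlying integral type.
    // A caller would take the fast path only when this returns true and
    // otherwise keep the old (slow) behavior.
    public static bool FitsUnderlyingType(Type enumType, long value)
    {
        switch (Type.GetTypeCode(Enum.GetUnderlyingType(enumType)))
        {
            case TypeCode.SByte:  return value >= sbyte.MinValue && value <= sbyte.MaxValue;
            case TypeCode.Byte:   return value >= byte.MinValue && value <= byte.MaxValue;
            case TypeCode.Int16:  return value >= short.MinValue && value <= short.MaxValue;
            case TypeCode.UInt16: return value >= ushort.MinValue && value <= ushort.MaxValue;
            case TypeCode.Int32:  return value >= int.MinValue && value <= int.MaxValue;
            case TypeCode.UInt32: return value >= uint.MinValue && value <= uint.MaxValue;
            case TypeCode.Int64:  return true;
            case TypeCode.UInt64: return value >= 0;
            default:              return false; // unusual underlying types: use the old path
        }
    }
}
```

With this shape, a value such as 300 for a byte-backed enum is reported as not fitting, so the caller can fall back to whatever the original code did for it.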
|
OK, I got it - we play safe here :) |
|
Implemented that change. It is now a bit slower with large numbers (still 25 times faster than original) and has roughly equal performance with small numbers. |
|
Thanks for the contribution. I am going to run some validation and if everything is ok I'll merge this in. |
|
Note that bool is a valid enum underlying type. How does this change interact with that? e.g. Given: Does the behavior of At a glance, I suspect it would have thrown FormatException before, but return PeculiarEnum.A now. |
|
@nguerrera yeah, you're right. I think this particular case can be worked around with additional check. PS: Turns out there is similar issue with char. This is harder than I'd expected... |
|
Updated test to check if char-based enum works correctly.
With Enum.Parse(typeof(CharEnum), "0"), "0" was treated like a char and not like a number, so the result must be Enum.ToObject(typeof(CharEnum), '0'), and it was equal to Enum.ToObject(typeof(CharEnum), 0).
Now the new code basically mimics the behavior of the Convert.ChangeType approach, but without throwing. |
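To make the char and bool concerns more concrete, here is a small sketch of how Convert.ChangeType treats the text "0" for different target types. It does not use a char-backed enum (C# cannot declare one); ConvertText is a hypothetical wrapper, and the link to the old Enum parsing path is taken from the discussion above rather than verified here.

```csharp
using System;
using System.Globalization;

public static class ChangeTypeSketch
{
    // Hypothetical wrapper around Convert.ChangeType with a fixed culture.
    public static object ConvertText(string text, Type underlyingType)
    {
        return Convert.ChangeType(text, underlyingType, CultureInfo.InvariantCulture);
    }
}
```

ConvertText("0", typeof(int)) gives the number 0, while ConvertText("0", typeof(char)) gives the character '0' (code point 48), and ConvertText("0", typeof(bool)) throws FormatException because "0" is not a valid Boolean string.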
Any particular reason for switching these statements?
|
Overall the change looks good. I left some comments in the PR. If nothing comes out of the test run I am doing I'll merge this in! |
666f370 to 340ee42
|
@AlexGhiondea OK, I cleaned up remaining things and squashed it to one commit. |
|
Thanks for all the work you did on this change. I have kicked off a different test run and should have results on Monday. The test run I did didn't show up any issues. However, I talked with @jkotas about this and, to be on the safe side, we should update the code so that in the default case of the switch statement we fall back to the slow behavior we had before. This would cover the bool/float/double case. Can you also check what happens when we have an enum backed by a float/double? I know you cannot create these in C#, but via IL or C++ they could still be generated and we need to make sure we can correctly handle them. Again, thank you for doing the due diligence around this fix. |
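The requested fallback could look something like the sketch below: only the plain integral type codes take the new fast path, and the default case of the switch routes everything else to the old behavior. HasFastPath is a hypothetical name and the exact set of cases is an assumption; the PR's actual switch is not shown in this conversation.

```csharp
using System;

public static class FallbackSketch
{
    // Hypothetical helper: decides whether an underlying type may use the fast path.
    public static bool HasFastPath(Type underlyingType)
    {
        switch (Type.GetTypeCode(underlyingType))
        {
            case TypeCode.SByte:
            case TypeCode.Byte:
            case TypeCode.Int16:
            case TypeCode.UInt16:
            case TypeCode.Int32:
            case TypeCode.UInt32:
            case TypeCode.Int64:
            case TypeCode.UInt64:
                return true;
            default:
                // bool, char, float, double, IntPtr, UIntPtr and anything else:
                // keep the slow behavior that existed before the change.
                return false;
        }
    }
}
```

Note that Type.GetTypeCode reports IntPtr and UIntPtr as TypeCode.Object, so they land in the default case without needing their own labels.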
|
@AlexGhiondea Indeed, @jkotas is right - there is a possibility of having an enum backed by float/double/IntPtr/UIntPtr:
coreclr/src/mscorlib/src/System/Enum.cs Line 749 in 9df15b9
But at the same time the ToObject() method doesn't support any of those types (as opposed to bool and char - they're actually supported), so parsing will fail anyway, see:
coreclr/src/mscorlib/src/System/Enum.cs Line 550 in 9df15b9
Actually, I was able to find a regression with some uber-tricky enums that look like:
enum UberTricky : double
{
1 = 0, 0 = 1
}
Enum.Parse(typeof(UberTricky), "0");
It throws an ArgumentException with the original code and will parse fine now (as a string and not by value). |
|
Okay, added one more fixup that will ensure correct behavior in all cases - it will throw ArgumentException with same message as before. |
|
@Alexx999 I would like to thank you for the work you put into this PR. We have updated our contribution guidelines to better call out the bar for accepting pull requests. You can find it here: https://github.com/dotnet/coreclr/wiki/Contribution-guidelines We are aware of the work you put into this PR and we truly appreciate it! But, unfortunately, since it is not a mainline scenario and has unknown risk, this PR does not meet the bar. We are sorry for not being able to accept it and hope you will continue making contributions! Thank you again for your contribution! |
|
It is quite sad to see a PR like this being discarded after all of the work put forward here. |
|
That's a bit of a surprise - it seemed like all possible cases were covered at last. But you are probably right - this case is not really common. |
|
@phrohdoh I agree it is unfortunate as well, and I also appreciate the effort that @Alexx999 put into this. While we were having the conversation on the PR we were also having a conversation internally to try and figure out the appropriate bar; hopefully that is a little clearer now. In this case, while it would be nice to take, the reward isn't worth the potential risk. Thanks again @Alexx999 and I look forward to your future contributions. |
The idea here is that an enum value is always an integer, so it will surely fit into a long if it's signed or into a ulong if it's unsigned. Furthermore, the string-to-enum code uses ulong to store both signed and unsigned results anyway, so there is no real need to distinguish between signed and unsigned other than when handling negative numbers or casting to intermediate results.
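A minimal sketch of that idea, assuming the caller already knows whether the underlying type is signed. TryParseToBits is a hypothetical name and this is not the code from the PR; it only shows parsing into long or ulong and keeping the result as a ulong bit pattern.

```csharp
using System;
using System.Globalization;

public static class WidestIntegerSketch
{
    // Hypothetical helper: parses 'text' as long (signed) or ulong (unsigned)
    // and stores the result as a ulong bit pattern. Never throws.
    public static bool TryParseToBits(string text, bool isSigned, out ulong bits)
    {
        if (isSigned)
        {
            long signedValue;
            if (long.TryParse(text, NumberStyles.AllowLeadingSign,
                              CultureInfo.InvariantCulture, out signedValue))
            {
                // Reinterpret the two's-complement bits; -1 becomes ulong.MaxValue.
                bits = unchecked((ulong)signedValue);
                return true;
            }
        }
        else if (ulong.TryParse(text, NumberStyles.None,
                                CultureInfo.InvariantCulture, out bits))
        {
            return true;
        }

        bits = 0;
        return false;
    }
}
```

The only place signedness matters here is whether a leading minus sign is accepted; a range check against the actual underlying type would still be needed afterwards.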
Performance figures:
Since I was unable to run MeasureIt on CoreCLR, nor was I able to find System.dll to get Stopwatch, these results were taken using DateTime.Now and are fairly approximate, but the difference is so big that it is hard to make a mistake here.
Measurements taken on i5-4670K CPU at 4.2GHz, Win 8.1, release build
Old:
New:
In the worst case the improvement is over 70 times, plus there is a noticeable speedup of number parsing as such (something near 25%, since we now do fewer operations).
Code I used to test this can be viewed as gist: https://gist.github.com/Alexx999/84640c2c31e4f5806cd1
Issues/Open questions:
I found one edge case with undefined behavior: more precisely, when you supply a value that overflows the enum's backing data type. This case is not covered by tests in CoreFx; it is present in my test as "Case 13".
For example, imagine the following code:
In this case you'll get false with old code (and new code will parse it fine).
I believe this is inconsistent with how enums work in general - you can do this:
and it will not fail. Even more than that - this behavior actually is covered by tests, as can be seen here:
https://github.com/dotnet/corefx/blob/master/src/System.Runtime/tests/System/Enum.cs#L290
So is it okay to have such a behavior change, or must it be changed to perfectly mimic the old behavior?
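As a rough illustration of the inconsistency in question (this is not the original snippet from the description, and TinyEnum is a hypothetical byte-backed enum), the sketch below contrasts parsing an out-of-range number with casting one. It assumes a runtime that keeps the old parsing behavior.

```csharp
using System;

// Hypothetical byte-backed enum, used only for illustration.
public enum TinyEnum : byte { A = 0, B = 1 }

public static class OverflowSketch
{
    // Parsing: a number that overflows the backing type is rejected
    // by runtimes that keep the old behavior.
    public static bool TryParseText(string text)
    {
        TinyEnum ignored;
        return Enum.TryParse<TinyEnum>(text, out ignored);
    }

    // Casting: an unchecked conversion silently truncates instead of failing.
    public static TinyEnum CastFromInt(int value)
    {
        return unchecked((TinyEnum)value);
    }
}
```

Under that assumption, TryParseText("300") reports failure, while CastFromInt(300) succeeds and yields the truncated value 44.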
Fixes #70