Align ROS.Split handling of zero-length target strings with String.Split#37742
Align ROS.Split handling of zero-length target strings with String.Split#37742layomia wants to merge 2 commits intodotnet:masterfrom
Conversation
| public static void DefaultSpanSplitEnumeratorBehavior() | ||
| { | ||
| var charSpanEnumerator = new SpanSplitEnumerator<string>(); | ||
| var charSpanEnumerator = new SpanSplitEnumerator<char>(); |
There was a problem hiding this comment.
| var charSpanEnumerator = new SpanSplitEnumerator<char>(); | |
| ReadOnlySpan<char> charSpanEnumerator = new SpanSplitEnumerator<char>(); |
To align with the other changes from var to ReadOnlySpan<T>.
Same below, but for some reason cannot add a suggestion, as it is not part of the diff.
| _startCurrent = _startNext; | ||
|
|
||
| int separatorIndex = _splitOnSingleToken ? slice.IndexOf(_separator) : slice.IndexOf(_separators); | ||
| int separatorIndex; |
There was a problem hiding this comment.
Whats the motivation for this change? To improve runtime perf of Split(null) to make it not O(n^2)?
Oh, nevermind, i did not realize string.Split("") just returns the original string.
|
To be honest we should also consider having these methods return a type other than the generic
All of these are likely to have different implementations. But if we have a single enumerator type responsible for handling every one of these cases, then we need to complicate the enumerator type itself, including increasing the size of the struct to handle all possible cases and including a gigantic switch statement as part of the |
|
I fully agree, many of these intricacies solely pertain to a char version of SpanSplitEnumerator, so having a separate one would probably make things easier going forward. The initial #295 PR was based on a multi Enumerator implementation (though for splitting on sequence/single element, not for string/non-string). I feel like it's a bit dangerous that is currently merged into master given there are still many questions to be answered and potential amendments to the implementation. I'm not sure what guarantees dotnet provides for breaking changes between previews. |
There aren't many guarantees made between previews, but I agree there are too many unanswered questions regarding the current implementation. I'll ping some folks offline to get their thoughts. |
|
I feel bad for @davidfowl though, he seemed quite excited about this finally being merged 😅 |
|
@GrabYourPitchforks I finished up a fix for #37746, should I still submit a PR or wait for a decision to be made regarding fundamental changes the the Enumerator? |
|
Hi @bbartels, that's awesome that you've made such quick progress on it! I spoke with some others on the team and we think it's best that we run this through review again, backing out the original PR in the meantime. This should not be seen as a negative reflection on the code. Sometimes the unfortunate reality is that we don't notice these things until they're checked in and we actually try to use them. :( And now that we have your code to work with, we'll be more informed during the review process. I suspect we'll be able to use the majority of this as a solid foundation for creating the final API surface and behavior. Please remain involved with this API if you're still interested! |
|
I completely understand, I'm purely motivated by making .NET better and learning in the process! I am honestly happy that it's not being rushed and will be sent through review again. I feel like it's much more important to have a well thought out type in the runtime, than one where considerations have not been properly discussed! Definitely looking forward to staying involved; I'm off of Uni at the moment, so have more time than I know what to do with! Looking forward to the review! |
|
Closing based on above discussions. |
| } | ||
|
|
||
| [Theory] | ||
| [InlineData("a,b,c", null, new[] { "a,b,c" })] |
|
😢 |
Follow up of #295.
First commit is clean-up. Review can start from second commit.
cc @bbartels