It's extremely common to split a string on a single char or string separator, yet String.Split only offers overloads that accept an array of separators. The params char[] separator overload is particularly insidious: every call such as Split('.') makes the compiler allocate a new char[] on the heap, so frequently executed code can produce a large number of unnecessary allocations, unbeknownst to the developer.
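To make the hidden allocation concrete, here is a minimal sketch. `ParamsDemo`, `Capture` and `AllocatesPerCall` are hypothetical names used only for illustration; `Capture` simply hands back the array that the compiler builds for a `params char[]` argument, which is the same thing that happens on each `Split('.')` call.

```csharp
using System;

public static class ParamsDemo
{
    // Hypothetical helper: returns the array the compiler created for the params argument.
    public static char[] Capture(params char[] separator)
    {
        return separator;
    }

    // Two calls with the same single char still produce two distinct arrays.
    public static bool AllocatesPerCall()
    {
        char[] first = Capture('.');   // compiled as Capture(new char[] { '.' })
        char[] second = Capture('.');  // a second, separate heap allocation
        return !ReferenceEquals(first, second);
    }
}
```

`ParamsDemo.AllocatesPerCall()` returns true, because each call site gets its own freshly allocated array.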
Rationale and Usage
Stack Exchange ran into this (a significant number of separator arrays in memory) and submitted a pull request to ASP.NET MVC to cache String.Split separator arrays inside the ASP.NET MVC codebase.
They ended up with the following:
namespace System.Web.Mvc
{
    internal static class StringSplits
    {
        // note array contents not technically read-only; just... don't edit them!
        internal static readonly char[]
            Period = new[] { '.' },
            Comma = new[] { ',' };
    }
}
Uses of string.Split('.') and string.Split(',') were then replaced with string.Split(StringSplits.Period) and string.Split(StringSplits.Comma) to avoid the char[] allocations.
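As a rough illustration of that replacement, the sketch below repeats a `StringSplits` class like the one above and calls `Split` with the cached arrays. `CachedSeparatorDemo`, `SplitHost` and `SplitCsv` are hypothetical names, not part of the ASP.NET MVC change.

```csharp
using System;

internal static class StringSplits
{
    // Shared separator arrays; callers must not modify the contents.
    internal static readonly char[]
        Period = new[] { '.' },
        Comma = new[] { ',' };
}

public static class CachedSeparatorDemo
{
    // Passing an existing char[] binds to Split(params char[]) without creating a new array.
    public static string[] SplitHost(string host)
    {
        return host.Split(StringSplits.Period);
    }

    public static string[] SplitCsv(string line)
    {
        return line.Split(StringSplits.Comma);
    }
}
```

The result is the same as with `Split('.')`; only the per-call separator array goes away. The returned string[] and its substrings are still allocated.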
It'd be awesome if String.Split offered overloads that accepted a single separator to avoid this.
Proposed API
public partial class String
{
    // Proposed methods
    public string[] Split(char separator, StringSplitOptions options = StringSplitOptions.None);
    public string[] Split(char separator, int count, StringSplitOptions options = StringSplitOptions.None);
    public string[] Split(string separator, StringSplitOptions options = StringSplitOptions.None);
    public string[] Split(string separator, int count, StringSplitOptions options = StringSplitOptions.None);

    // Existing methods
    public string[] Split(params char[] separator);
    public string[] Split(char[] separator, int count);
    public string[] Split(char[] separator, StringSplitOptions options);
    public string[] Split(char[] separator, int count, StringSplitOptions options);
    public string[] Split(string[] separator, StringSplitOptions options);
    public string[] Split(string[] separator, int count, StringSplitOptions options);
}
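To give a feel for what a single-char overload could do, here is a minimal sketch written as an extension method. `SingleSeparatorSplit` and `SplitSingle` are hypothetical names chosen so they do not clash with any real `Split` overload, and this is not how the runtime would necessarily implement it. It scans with `IndexOf(char, int)` so no separator array is ever created, though it does use a temporary list.

```csharp
using System;
using System.Collections.Generic;

public static class SingleSeparatorSplit
{
    // Hypothetical stand-in for the proposed Split(char, StringSplitOptions) overload.
    public static string[] SplitSingle(this string source, char separator,
        StringSplitOptions options = StringSplitOptions.None)
    {
        if (source == null) throw new ArgumentNullException("source");

        bool omitEmpty = (options & StringSplitOptions.RemoveEmptyEntries) != 0;
        var parts = new List<string>();
        int start = 0;

        while (true)
        {
            // Find the next separator; -1 means the rest of the string is the last part.
            int index = source.IndexOf(separator, start);
            int end = index < 0 ? source.Length : index;

            if (!omitEmpty || end > start)
            {
                parts.Add(source.Substring(start, end - start));
            }

            if (index < 0) break;
            start = index + 1;
        }

        return parts.ToArray();
    }
}
```

With this sketch, `"a,,b".SplitSingle(',')` yields three entries (the middle one empty), and passing `StringSplitOptions.RemoveEmptyEntries` drops the empty one. The `count` variants from the proposal are left out to keep the example short.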
Notes
We had wanted to add public string[] Split(string separator), but this would break source compatibility with existing uses of Split(null), which is documented to split on white space, because it would make the call ambiguous between Split(char[]) and Split(string).
I won't go into it here as it's really a separate feature request, but it would also be worth considering new Split methods that return a struct collection of StringSpan/StringSegment structs (in lieu of internal runtime span/substring magic) to avoid the resulting string[] allocation (and substring allocations, unless needed).
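Purely as an illustration of that idea, the sketch below yields lightweight segments instead of building a string[]. `SplitSegment`, `SegmentSplitSketch` and `SplitSegments` are hypothetical names, unrelated to any existing library type, and the iterator itself still allocates an enumerator object, so this is a simplification rather than a fully allocation-free design.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical segment type: a view over part of the source string.
public struct SplitSegment
{
    public SplitSegment(string source, int offset, int length)
    {
        Source = source;
        Offset = offset;
        Length = length;
    }

    public string Source { get; }
    public int Offset { get; }
    public int Length { get; }

    // A substring is only allocated when the caller actually asks for one.
    public override string ToString()
    {
        return Source.Substring(Offset, Length);
    }
}

public static class SegmentSplitSketch
{
    // Lazily yields one segment per part; no string[] and no substrings are created.
    public static IEnumerable<SplitSegment> SplitSegments(this string source, char separator)
    {
        int start = 0;
        while (true)
        {
            int index = source.IndexOf(separator, start);
            int end = index < 0 ? source.Length : index;
            yield return new SplitSegment(source, start, end - start);
            if (index < 0) yield break;
            start = index + 1;
        }
    }
}
```

A caller can inspect `Offset` and `Length` to skip uninteresting parts and call `ToString()` only on the segments it needs.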
I'd be happy to contribute an implementation and tests.