
Add String.Split overloads that take a single char and string separator #14483

@justinvp

Description

It's extremely common to split a string on a single char or string separator, yet String.Split only offers overloads that accept an array of separators. The params char[] separator overload is particularly insidious: every call that passes separators as individual arguments allocates a new char[] on the heap, usually without the developer realizing it, and in hot paths this adds up to a large number of unnecessary allocations.
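To illustrate the hidden allocation, here is a minimal sketch. `Capture` is a hypothetical stand-in with the same parameter shape as `Split(params char[] separator)`; it is not part of any real API. It only shows that each call passing a separator as a bare argument hands the callee a freshly allocated array.

```csharp
public static class ParamsAllocationSketch
{
    // Hypothetical stand-in with the same parameter shape as
    // String.Split(params char[] separator). It returns the array it was given.
    public static char[] Capture(params char[] separator)
    {
        return separator;
    }

    // Calls Capture twice with the same literal and reports whether both calls
    // received the same array instance.
    public static bool CallsShareArray()
    {
        char[] first = Capture('.');   // compiler emits: Capture(new char[] { '.' })
        char[] second = Capture('.');  // compiler emits another new char[] { '.' }
        return object.ReferenceEquals(first, second);
    }
}
```

`CallsShareArray()` returns false, because the compiler builds a separate array at each call site invocation.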

Rationale and Usage

Stack Exchange ran into this problem (a significant number of separator arrays in memory) and submitted a pull request to ASP.NET MVC to cache String.Split separator arrays inside the ASP.NET MVC codebase.

They ended up with the following:

namespace System.Web.Mvc
{
    internal static class StringSplits
    {
        // note array contents not technically read-only; just... don't edit them!
        internal static readonly char[]
            Period = new[] { '.' },
            Comma = new[] { ',' };
    }
}

Uses of string.Split('.') and string.Split(',') were then replaced with string.Split(StringSplits.Period) and string.Split(StringSplits.Comma) to avoid the char[] allocations.

It'd be awesome if String.Split offered overloads that accepted a single separator to avoid this.

Proposed API

public partial class String
{
    // Proposed methods
    public string[] Split(char separator, StringSplitOptions options = StringSplitOptions.None);
    public string[] Split(char separator, int count, StringSplitOptions options = StringSplitOptions.None);
    public string[] Split(string separator, StringSplitOptions options = StringSplitOptions.None);
    public string[] Split(string separator, int count, StringSplitOptions options = StringSplitOptions.None);

    // Existing methods
    public string[] Split(params char[] separator);
    public string[] Split(char[] separator, int count);
    public string[] Split(char[] separator, StringSplitOptions options);
    public string[] Split(char[] separator, int count, StringSplitOptions options);
    public string[] Split(string[] separator, StringSplitOptions options);
    public string[] Split(string[] separator, int count, StringSplitOptions options);
}
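As a rough sketch of the intended behavior, the proposed overloads can be approximated today with extension methods that delegate to the existing array-based overloads. The `SplitOn` name and the `StringSplitSketch` class are hypothetical and used only for illustration. Note that this version still allocates the separator array on every call; it shows the call shape and results, not the allocation savings a real implementation inside String would provide.

```csharp
using System;

public static class StringSplitSketch
{
    // Hypothetical approximation of Split(char separator, StringSplitOptions options).
    // Still allocates a one-element char[]; a real implementation would avoid that.
    public static string[] SplitOn(this string value, char separator,
        StringSplitOptions options = StringSplitOptions.None)
    {
        return value.Split(new[] { separator }, options);
    }

    // Hypothetical approximation of Split(string separator, int count, StringSplitOptions options).
    public static string[] SplitOn(this string value, string separator, int count,
        StringSplitOptions options = StringSplitOptions.None)
    {
        return value.Split(new[] { separator }, count, options);
    }
}
```

With these helpers, `"a,b,,c".SplitOn(',', StringSplitOptions.RemoveEmptyEntries)` yields the three entries a, b, and c, and `"a--b--c".SplitOn("--", 2)` yields a and b--c.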

Notes

We had wanted to add public string[] Split(string separator), but this breaks source compatibility with uses of Split(null), which is documented to split based on white space, because it makes the call ambiguous between Split(char[]) and Split(string).
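The ambiguity can be reproduced outside of String with a small sketch. `Describe` is a hypothetical method standing in for Split; the second overload is left commented out because uncommenting it makes the `Describe(null)` call fail to compile with an ambiguous-call error (CS0121).

```csharp
public static class NullAmbiguitySketch
{
    // Stand-in for the existing Split(params char[] separator).
    public static string Describe(params char[] separator)
    {
        return separator == null ? "char[] (null)" : "char[]";
    }

    // Stand-in for the rejected Split(string separator).
    // Uncommenting this makes Describe(null) below ambiguous, because the
    // null literal converts equally well to char[] and to string.
    // public static string Describe(string separator) { return "string"; }

    // Compiles today: null binds to the char[] parameter in its normal form.
    public static string CallWithNull()
    {
        return Describe(null);
    }

    // An explicit cast would keep compiling even if the string overload existed.
    public static string CallWithCast()
    {
        return Describe((char[])null);
    }
}
```

Existing callers that write Split(null) would have to add the cast, which is why the overload without a second parameter was left out of the proposal.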

I won't go into it here as it's really a separate feature request, but it would also be worth considering new Split methods that return a struct collection of StringSpan/StringSegment structs (in lieu of internal runtime span/substring magic) to avoid the resulting string[] allocation (and substring allocations, unless needed).

I'd be happy to contribute an implementation and tests.
