Skip to content

Support Pattern Matching ReadOnlySpan<char> on constant string#44388

Merged
cston merged 19 commits into
dotnet:features/patterns-span-charfrom
YairHalberstadt:feature/switch-on-span-char
Feb 8, 2022
Merged

Support Pattern Matching ReadOnlySpan<char> on constant string#44388
cston merged 19 commits into
dotnet:features/patterns-span-charfrom
YairHalberstadt:feature/switch-on-span-char

Conversation

@YairHalberstadt
Copy link
Copy Markdown
Contributor

@YairHalberstadt YairHalberstadt commented May 19, 2020

Switching ReadOnlySpan on a constant string is championed at dotnet/csharplang#1881 and marked as AnyTime.

It seems inconsistent to be able to switch on ReadOnlySpan but not use it as a pattern, so I've added support for that too.

The approach is to generate a call to MemoryExtensions.SequenceEqual(span, pattern.AsSpan()).

This delegates to the same underlying method as string.Equals does on .Net Core so should be very efficient.

For large switches, we use the same optimization as string, namely generate a hash function to switch on before doing the equality check. If required members for the hash function aren't there we fall back to a series of if else equality checks.

I've also added support for warning on e.g. span is "foo" and "bar".

Open Questions

  1. Should this also support pattern matching Span<char> on a constant string?
    A) Yes

Relates to test plan: #59191

@YairHalberstadt YairHalberstadt requested a review from a team as a code owner May 19, 2020 18:54
@jaredpar
Copy link
Copy Markdown
Member

We're going to need to create a feature branch for this, or possible just create features/compiler-next. This can't go into master as we're fairly locked down on the set of features we're delivering for C# 9 right now (not just compiler implementation that is our constraint, it's constraints on other teams as well).

Will discuss with team today if we want to use a feature branch or compiler-next for this.

@YairHalberstadt

This comment was marked as resolved.

@Happypig375
Copy link
Copy Markdown
Member

So excited to see progress on my own suggestion! My computer is too old and slow to handle a repository as large as Roslyn, taking at least half an hour to even build, so I am excited to see someone else helping out!

@jinujoseph jinujoseph added Area-Compilers Community The pull request was submitted by a contributor who is not a Microsoft employee. labels Jun 18, 2020
@YairHalberstadt

This comment was marked as resolved.

@jaredpar

This comment was marked as resolved.

@jaredpar jaredpar added this to the Compiler.Next milestone Aug 12, 2020
@jaredpar jaredpar self-assigned this Aug 12, 2020
@Happypig375
Copy link
Copy Markdown
Member

@YairHalberstadt Span<char> won't be supported?

@YairHalberstadt
Copy link
Copy Markdown
Contributor Author

@Happypig375
That's up to the language team to decide.

@RikkiGibson
Copy link
Copy Markdown
Member

Is there a proposal or speclet for the feature implemented by this PR?

@Happypig375
Copy link
Copy Markdown
Member

@RikkiGibson The proposal is dotnet/csharplang#1881.

@YairHalberstadt
Copy link
Copy Markdown
Contributor Author

YairHalberstadt commented Oct 9, 2020

I've now added support for Span<char>

@RikkiGibson RikkiGibson self-assigned this Oct 23, 2020
WellKnownType.System_Runtime_CompilerServices_NativeIntegerAttribute,
WellKnownType.System_Runtime_CompilerServices_IsExternalInit
WellKnownType.System_Runtime_CompilerServices_IsExternalInit,
WellKnownType.System_MemoryExtensions
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm a little wary of using MemoryExtensions here. This is adding a new type dependency to the compiler and one where we have some dependency on how the members are implemented. Why not just use the operator== on the Span<T> and ReadOnlySpan<T> types instead. That maintains our existing type dependency and we already depend on the implementation of several of the type members.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the runtime exposes these APIs to us from a class we already take a type dependency on, that might be preferable, but I'm not sure operator == on Span would do what we want. For patterns I think we want to compare the values within the span to the characters of the string constant, but this operator just checks equality of the length and underlying reference.

https://github.com/dotnet/runtime/blob/d603356dfbf09616e55bd99ed840aa20468a9e57/src/libraries/System.Private.CoreLib/src/System/Span.cs#L346-L347

Comment thread src/Compilers/CSharp/Portable/Binder/Binder_Symbols.cs
/// This is a layer on top of the Compilation version that generates a diagnostic if the special
/// member isn't found.
/// </summary>
internal Symbol GetWellKnownTypeMember(WellKnownMember member, DiagnosticBag diagnostics, SyntaxNode syntax)
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tempted to call this TryGetWellKnownTypeMember too.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is for consistency with the already existing GetSpecialTypeMember and TryGetSpecialTypeMember

where TSymbol : Symbol
{
symbol = (TSymbol)compilation.GetWellKnownTypeMember(wellKnownMember);
if ((object)symbol == null)
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
if ((object)symbol == null)
if (symbol is null)

@jaredpar
Copy link
Copy Markdown
Member

I think we need to get a spec together for this change. It's pretty small but we need to spec the behavior. I think the LDM notes from 10/7 are a good starting point. https://github.com/dotnet/csharplang/blob/master/meetings/2020/LDM-2020-10-07.md#readonlyspanchar-patterns

Particularly I think the interesting points we need to spec out and agree on are:

  1. How is null is handled. The PR handles it as an error which I tend to agree with but need to spec that out. The LDM notes say "constant string" and null is a valid constant string so need to get crisp on that.
  2. How the comparison is achieved. The PR takes the approach of using MemoryExtensions. Think should weigh that against just using the equality operators on the Span<T> types themselves. Again though this is somethnig we should spec out in detail.

@RikkiGibson

This comment was marked as resolved.

@azure-pipelines

This comment was marked as resolved.

@jaredpar

This comment was marked as resolved.

@RikkiGibson

This comment was marked as resolved.

/// This is a layer on top of the Compilation version that generates a diagnostic if the well known
/// member isn't found.
/// </summary>
internal Symbol? GetWellKnownTypeMember(WellKnownMember member, BindingDiagnosticBag diagnostics, SyntaxNode syntax)
Copy link
Copy Markdown
Contributor

@cston cston Jan 28, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

GetWellKnownTypeMember

Is this different in behavior from the GetWellKnownTypeMember() overload at line 1645? #Pending

: null;
}

internal static bool TryGetWellKnownTypeMember<TSymbol>(CSharpCompilation compilation, WellKnownMember wellKnownMember, SyntaxNode syntax, BindingDiagnosticBag diagnostics, [MaybeNullWhen(false)] out TSymbol symbol)
Copy link
Copy Markdown
Contributor

@cston cston Jan 28, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TryGetWellKnownTypeMember

Could this method be replaced with one of the existing GetWellKnownTypeMember() overloads, for instance the overload at line 1664? #Pending

/// </summary>
internal Symbol GetSpecialTypeMember(SpecialMember member, BindingDiagnosticBag diagnostics, SyntaxNode syntax)
{
Symbol memberSymbol;
Copy link
Copy Markdown
Contributor

@cston cston Jan 28, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Symbol memberSymbol

Consider reverting the changes to this method if not needed. #Pending

? WellKnownMember.System_ReadOnlySpan_T__get_Length
: WellKnownMember.System_Span_T__get_Length) as MethodSymbol;
var spanCharLengthMethod = spanTLengthMethod?.AsMember((NamedTypeSymbol)keyType);
if (spanCharLengthMethod != null && !spanCharLengthMethod.HasUseSiteError)
Copy link
Copy Markdown
Contributor

@cston cston Jan 28, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

spanCharLengthMethod != null && !spanCharLengthMethod.HasUseSiteError

It seems surprising that we allow Span<T>.Length to be missing. (It looks like we allow missing String.Length for the existing string switch cases, and emit different code when comparing against "", although I'm not sure why.) I think it would be better to require Span<T>.Length and ReadOnlySpan<T>.Length in Binder where we check for other well-known members needed for the span cases, and remove the conditional code here. #Pending

LabelSymbol start = F.GenerateLabel("start");


// This method should be kept consistent with SynthesizedStringSwitchHashMethod.ConstructStringHash
Copy link
Copy Markdown
Contributor

@cston cston Jan 28, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ConstructStringHash

ComputeStringHash? #Pending

if (MessageID.IDS_FeatureSpanCharConstantPattern.CheckFeatureAvailability(diagnostics, Compilation, node.Location))
{
// report missing member and use site diagnostics
if (inputType.IsReadOnlySpanChar())
Copy link
Copy Markdown
Contributor

@cston cston Jan 28, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

inputType.IsReadOnlySpanChar()

Consider using conditional expression within the GetWellKnownTypeMember() call rather than if { } else { }, for simplicity and consistency with other similar uses of these members. #Pending

if (wasExpression)
{
var convertedType = convertedExpression.Type ?? inputType;
if (convertedType.SpecialType == SpecialType.System_String && inputType.IsSpanOrReadOnlySpanChar())
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It wasn't obvious to me why this was the right change to make, versus doing a conversion from string to ROS<char> in ConvertPatternExpression().


Cci.IReference stringEqualityMethodRef = _module.Translate(equalityMethod, syntaxNode, _diagnostics);
Cci.IMethodReference stringEqualityMethodRef = null;
Cci.IMethodReference lengthRef = null;
Copy link
Copy Markdown
Member

@RikkiGibson RikkiGibson Jan 28, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

consider using 'MethodRef' instead of 'Ref' for this and below locals for clarity/consistency with 'stringEqualityMethodRef' #Pending

var sequenceEqualsTMethod = _module.Compilation.GetWellKnownTypeMember(isReadOnlySpan
? WellKnownMember.System_MemoryExtensions__SequenceEqual_ReadOnlySpan_T
: WellKnownMember.System_MemoryExtensions__SequenceEqual_Span_T) as MethodSymbol;
var sequenceEqualsCharMethod = sequenceEqualsTMethod?.Construct(_module.Compilation.GetSpecialType(SpecialType.System_Char));
Copy link
Copy Markdown
Member

@RikkiGibson RikkiGibson Jan 28, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't understand why we defend against 'null' here and for other 'GetWellKnownTypeMember' methods when we are supposed to issue error diagnostics during binding when these helpers are not present. #Pending

if (isSpanOrReadOnlySpan)
{
// The caller ensures that the key is not byref, and is not a stack local
if (keyArg.Local is { } local)
Copy link
Copy Markdown
Member

@RikkiGibson RikkiGibson Jan 28, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this do the right thing in scenarios where the input span is a field of a ref struct? #Pending

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like LocalRewriter.PatternLocalRewriter.ShareTempsAndEvaluateInput() ensures the value for the switch is either a parameter or a local. Added SwitchSpanChar_RefStructField() test for this case.

}
}

internal sealed partial class SynthesizedSpanSwitchHashMethod : SynthesizedGlobalMethodSymbol
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't like that we made 'SynthesizedSpanSwitchHashMethod' and 'SynthesizedStringSwitchHashMethod' partial. It looks like that's just so the constructor can go over here. However, I don't think it's your responsibility to fix it.

}";
CreateCompilationWithSpanAndMemoryExtensions(source, parseOptions: TestOptions.RegularPreview)
.VerifyDiagnostics(
// (5,49): error CS0150: A constant value is expected
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems like it might be better to have an error here about the unconvertibility of null to type Span<char>.

using System;
class C
{
static bool M(Span<char> chars) => chars is null;
Copy link
Copy Markdown
Member

@RikkiGibson RikkiGibson Jan 29, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should ensure we test a scenario where a named const string with a null value is used in a pattern. #Pending

1,
(byte)SignatureTypeCode.GenericMethodParameter, (byte)0,

// System_MemoryExtensions__AsSpanString
Copy link
Copy Markdown
Member

@RikkiGibson RikkiGibson Jan 29, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be 'AsSpan_String'? #Pending

WellKnownType.System_Runtime_CompilerServices_NativeIntegerAttribute,
WellKnownType.System_Runtime_CompilerServices_IsExternalInit
WellKnownType.System_Runtime_CompilerServices_IsExternalInit,
WellKnownType.System_MemoryExtensions
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the runtime exposes these APIs to us from a class we already take a type dependency on, that might be preferable, but I'm not sure operator == on Span would do what we want. For patterns I think we want to compare the values within the span to the characters of the string constant, but this operator just checks equality of the length and underlying reference.

https://github.com/dotnet/runtime/blob/d603356dfbf09616e55bd99ed840aa20468a9e57/src/libraries/System.Private.CoreLib/src/System/Span.cs#L346-L347

}

[Fact]
public void PatternMatchReadonlySpanCharOnConstantString()
Copy link
Copy Markdown
Contributor

@cston cston Jan 31, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
public void PatternMatchReadonlySpanCharOnConstantString()
public void PatternMatchReadOnlySpanCharOnConstantString()
``` #Pending

class C
{
static void Main()
{
Copy link
Copy Markdown
Contributor

@cston cston Jan 31, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

{

Consider adding Test("");. #Pending

class C
{
static void Main()
{
Copy link
Copy Markdown
Contributor

@cston cston Jan 31, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

{

Consider adding Test("");. #Pending

// (5,49): error CS8652: The feature 'pattern matching ReadOnly/Span<char> on constant string' is currently in Preview and *unsupported*. To use Preview features, use the 'preview' language version.
// static bool M(Span<char> chars) => chars is "";
Diagnostic(ErrorCode.ERR_FeatureInPreview, @"""""").WithArguments("pattern matching ReadOnly/Span<char> on constant string").WithLocation(5, 49));
}
Copy link
Copy Markdown
Contributor

@cston cston Jan 31, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

}

Consider testing with TestOptions.Regular10 instead (and rename method). #Pending

// (5,57): error CS8652: The feature 'pattern matching ReadOnly/Span<char> on constant string' is currently in Preview and *unsupported*. To use Preview features, use the 'preview' language version.
// static bool M(ReadOnlySpan<char> chars) => chars is "";
Diagnostic(ErrorCode.ERR_FeatureInPreview, @"""""").WithArguments("pattern matching ReadOnly/Span<char> on constant string").WithLocation(5, 57));
}
Copy link
Copy Markdown
Contributor

@cston cston Jan 31, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

}

Consider testing with TestOptions.Regular10 instead (and rename method). #Pending

class C
{
static void Main()
{
Copy link
Copy Markdown
Contributor

@cston cston Jan 31, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

{

Consider adding Test("".ToArray()); #Pending

/// This has a different diagnostic to <see cref="SwitchReadOnlySpanCharOnNull"/> because there is a valid implicit conversion from null to Span via Array,
/// but not to the ReadOnlySpan defined in <see cref="TestSources.Span"/> which has two implicit operators from reference types to ReadOnlySpan (for string and array).
/// When using the BCL, the diagnostic will be this one for both.
/// </summary>
Copy link
Copy Markdown
Contributor

@cston cston Feb 1, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should test with the BCL types rather than TestSources, to avoid surprises (due to different implicit conversions, etc.) when compiling projects that are targeting the BCL.

Please consider changing the new tests in this file that depend on definitions of System.Span<T> and System.MemoryExtensions to compile against CoreCLR, with [ConditionalFact].

[ConditionalFact(typeof(CoreClrOnly))]
public void SwitchSpanCharOnNull()
{
    var source = "...";
    CreateCompilation(
        source,
        targetFramework: TargetFramework.NetCoreApp,
        parseOptions: TestOptions.RegularPreview);
    // ...
}
``` #Pending

}
}";

public const string MemoryExtensions = @"
Copy link
Copy Markdown
Contributor

@cston cston Feb 1, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

MemoryExtensions

Consider removing if tests compile against BCL types instead. #Pending

@cston
Copy link
Copy Markdown
Contributor

cston commented Feb 2, 2022

@YairHalberstadt, sorry for the delay reviewing this PR. We've completed a couple of passes through the latest commit and expect to be able to review any subsequent commits promptly.

For planning purposes, do you have a timeframe in mind when you expect to address the remaining feedback? If it might be more than a few days, we could merge this PR into a feature branch and assign someone else to address remaining items in separate PRs to that branch. Let us know how you'd like to proceed, thanks.

@YairHalberstadt
Copy link
Copy Markdown
Contributor Author

@cston thanks for the review

I don't think I'll be able to address the feedback in the next couple of weeks unfortunately.

@cston cston changed the base branch from main to features/patterns-span-char February 2, 2022 20:07
@cston
Copy link
Copy Markdown
Contributor

cston commented Feb 8, 2022

We'll address the remaining comments in a separate PR to the feature branch.

Copy link
Copy Markdown
Contributor

@cston cston left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, assuming remaining issues are addressed in a follow-up PR.

Copy link
Copy Markdown
Member

@RikkiGibson RikkiGibson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM with the assumption that pending comments will be addressed in a follow-up PR

@cston cston merged commit d36edf5 into dotnet:features/patterns-span-char Feb 8, 2022
@cston
Copy link
Copy Markdown
Contributor

cston commented Feb 8, 2022

Thanks @YairHalberstadt!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Area-Compilers Community The pull request was submitted by a contributor who is not a Microsoft employee.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

7 participants