Support Pattern Matching ReadOnlySpan<char> on constant string#44388
Conversation
|
We're going to need to create a feature branch for this, or possible just create Will discuss with team today if we want to use a feature branch or compiler-next for this. |
… are impossible to match ,or match all values
This comment was marked as resolved.
This comment was marked as resolved.
|
So excited to see progress on my own suggestion! My computer is too old and slow to handle a repository as large as Roslyn, taking at least half an hour to even build, so I am excited to see someone else helping out! |
This comment was marked as resolved.
This comment was marked as resolved.
This comment was marked as resolved.
This comment was marked as resolved.
|
@YairHalberstadt |
|
@Happypig375 |
|
Is there a proposal or speclet for the feature implemented by this PR? |
|
@RikkiGibson The proposal is dotnet/csharplang#1881. |
|
I've now added support for |
| WellKnownType.System_Runtime_CompilerServices_NativeIntegerAttribute, | ||
| WellKnownType.System_Runtime_CompilerServices_IsExternalInit | ||
| WellKnownType.System_Runtime_CompilerServices_IsExternalInit, | ||
| WellKnownType.System_MemoryExtensions |
There was a problem hiding this comment.
I'm a little wary of using MemoryExtensions here. This is adding a new type dependency to the compiler and one where we have some dependency on how the members are implemented. Why not just use the operator== on the Span<T> and ReadOnlySpan<T> types instead. That maintains our existing type dependency and we already depend on the implementation of several of the type members.
There was a problem hiding this comment.
If the runtime exposes these APIs to us from a class we already take a type dependency on, that might be preferable, but I'm not sure operator == on Span would do what we want. For patterns I think we want to compare the values within the span to the characters of the string constant, but this operator just checks equality of the length and underlying reference.
| /// This is a layer on top of the Compilation version that generates a diagnostic if the special | ||
| /// member isn't found. | ||
| /// </summary> | ||
| internal Symbol GetWellKnownTypeMember(WellKnownMember member, DiagnosticBag diagnostics, SyntaxNode syntax) |
There was a problem hiding this comment.
Tempted to call this TryGetWellKnownTypeMember too.
There was a problem hiding this comment.
This is for consistency with the already existing GetSpecialTypeMember and TryGetSpecialTypeMember
| where TSymbol : Symbol | ||
| { | ||
| symbol = (TSymbol)compilation.GetWellKnownTypeMember(wellKnownMember); | ||
| if ((object)symbol == null) |
There was a problem hiding this comment.
| if ((object)symbol == null) | |
| if (symbol is null) |
|
I think we need to get a spec together for this change. It's pretty small but we need to spec the behavior. I think the LDM notes from 10/7 are a good starting point. https://github.com/dotnet/csharplang/blob/master/meetings/2020/LDM-2020-10-07.md#readonlyspanchar-patterns Particularly I think the interesting points we need to spec out and agree on are:
|
This comment was marked as resolved.
This comment was marked as resolved.
This comment was marked as resolved.
This comment was marked as resolved.
This comment was marked as resolved.
This comment was marked as resolved.
This comment was marked as resolved.
This comment was marked as resolved.
| /// This is a layer on top of the Compilation version that generates a diagnostic if the well known | ||
| /// member isn't found. | ||
| /// </summary> | ||
| internal Symbol? GetWellKnownTypeMember(WellKnownMember member, BindingDiagnosticBag diagnostics, SyntaxNode syntax) |
| : null; | ||
| } | ||
|
|
||
| internal static bool TryGetWellKnownTypeMember<TSymbol>(CSharpCompilation compilation, WellKnownMember wellKnownMember, SyntaxNode syntax, BindingDiagnosticBag diagnostics, [MaybeNullWhen(false)] out TSymbol symbol) |
| /// </summary> | ||
| internal Symbol GetSpecialTypeMember(SpecialMember member, BindingDiagnosticBag diagnostics, SyntaxNode syntax) | ||
| { | ||
| Symbol memberSymbol; |
| ? WellKnownMember.System_ReadOnlySpan_T__get_Length | ||
| : WellKnownMember.System_Span_T__get_Length) as MethodSymbol; | ||
| var spanCharLengthMethod = spanTLengthMethod?.AsMember((NamedTypeSymbol)keyType); | ||
| if (spanCharLengthMethod != null && !spanCharLengthMethod.HasUseSiteError) |
There was a problem hiding this comment.
spanCharLengthMethod != null && !spanCharLengthMethod.HasUseSiteError
It seems surprising that we allow Span<T>.Length to be missing. (It looks like we allow missing String.Length for the existing string switch cases, and emit different code when comparing against "", although I'm not sure why.) I think it would be better to require Span<T>.Length and ReadOnlySpan<T>.Length in Binder where we check for other well-known members needed for the span cases, and remove the conditional code here. #Pending
| LabelSymbol start = F.GenerateLabel("start"); | ||
|
|
||
|
|
||
| // This method should be kept consistent with SynthesizedStringSwitchHashMethod.ConstructStringHash |
| if (MessageID.IDS_FeatureSpanCharConstantPattern.CheckFeatureAvailability(diagnostics, Compilation, node.Location)) | ||
| { | ||
| // report missing member and use site diagnostics | ||
| if (inputType.IsReadOnlySpanChar()) |
| if (wasExpression) | ||
| { | ||
| var convertedType = convertedExpression.Type ?? inputType; | ||
| if (convertedType.SpecialType == SpecialType.System_String && inputType.IsSpanOrReadOnlySpanChar()) |
There was a problem hiding this comment.
It wasn't obvious to me why this was the right change to make, versus doing a conversion from string to ROS<char> in ConvertPatternExpression().
|
|
||
| Cci.IReference stringEqualityMethodRef = _module.Translate(equalityMethod, syntaxNode, _diagnostics); | ||
| Cci.IMethodReference stringEqualityMethodRef = null; | ||
| Cci.IMethodReference lengthRef = null; |
There was a problem hiding this comment.
consider using 'MethodRef' instead of 'Ref' for this and below locals for clarity/consistency with 'stringEqualityMethodRef' #Pending
| var sequenceEqualsTMethod = _module.Compilation.GetWellKnownTypeMember(isReadOnlySpan | ||
| ? WellKnownMember.System_MemoryExtensions__SequenceEqual_ReadOnlySpan_T | ||
| : WellKnownMember.System_MemoryExtensions__SequenceEqual_Span_T) as MethodSymbol; | ||
| var sequenceEqualsCharMethod = sequenceEqualsTMethod?.Construct(_module.Compilation.GetSpecialType(SpecialType.System_Char)); |
There was a problem hiding this comment.
I didn't understand why we defend against 'null' here and for other 'GetWellKnownTypeMember' methods when we are supposed to issue error diagnostics during binding when these helpers are not present. #Pending
| if (isSpanOrReadOnlySpan) | ||
| { | ||
| // The caller ensures that the key is not byref, and is not a stack local | ||
| if (keyArg.Local is { } local) |
There was a problem hiding this comment.
Does this do the right thing in scenarios where the input span is a field of a ref struct? #Pending
There was a problem hiding this comment.
It looks like LocalRewriter.PatternLocalRewriter.ShareTempsAndEvaluateInput() ensures the value for the switch is either a parameter or a local. Added SwitchSpanChar_RefStructField() test for this case.
| } | ||
| } | ||
|
|
||
| internal sealed partial class SynthesizedSpanSwitchHashMethod : SynthesizedGlobalMethodSymbol |
There was a problem hiding this comment.
I don't like that we made 'SynthesizedSpanSwitchHashMethod' and 'SynthesizedStringSwitchHashMethod' partial. It looks like that's just so the constructor can go over here. However, I don't think it's your responsibility to fix it.
| }"; | ||
| CreateCompilationWithSpanAndMemoryExtensions(source, parseOptions: TestOptions.RegularPreview) | ||
| .VerifyDiagnostics( | ||
| // (5,49): error CS0150: A constant value is expected |
There was a problem hiding this comment.
It seems like it might be better to have an error here about the unconvertibility of null to type Span<char>.
| using System; | ||
| class C | ||
| { | ||
| static bool M(Span<char> chars) => chars is null; |
There was a problem hiding this comment.
We should ensure we test a scenario where a named const string with a null value is used in a pattern. #Pending
| 1, | ||
| (byte)SignatureTypeCode.GenericMethodParameter, (byte)0, | ||
|
|
||
| // System_MemoryExtensions__AsSpanString |
There was a problem hiding this comment.
Should this be 'AsSpan_String'? #Pending
| WellKnownType.System_Runtime_CompilerServices_NativeIntegerAttribute, | ||
| WellKnownType.System_Runtime_CompilerServices_IsExternalInit | ||
| WellKnownType.System_Runtime_CompilerServices_IsExternalInit, | ||
| WellKnownType.System_MemoryExtensions |
There was a problem hiding this comment.
If the runtime exposes these APIs to us from a class we already take a type dependency on, that might be preferable, but I'm not sure operator == on Span would do what we want. For patterns I think we want to compare the values within the span to the characters of the string constant, but this operator just checks equality of the length and underlying reference.
| } | ||
|
|
||
| [Fact] | ||
| public void PatternMatchReadonlySpanCharOnConstantString() |
There was a problem hiding this comment.
| public void PatternMatchReadonlySpanCharOnConstantString() | |
| public void PatternMatchReadOnlySpanCharOnConstantString() | |
| ``` #Pending |
| class C | ||
| { | ||
| static void Main() | ||
| { |
| class C | ||
| { | ||
| static void Main() | ||
| { |
| // (5,49): error CS8652: The feature 'pattern matching ReadOnly/Span<char> on constant string' is currently in Preview and *unsupported*. To use Preview features, use the 'preview' language version. | ||
| // static bool M(Span<char> chars) => chars is ""; | ||
| Diagnostic(ErrorCode.ERR_FeatureInPreview, @"""""").WithArguments("pattern matching ReadOnly/Span<char> on constant string").WithLocation(5, 49)); | ||
| } |
| // (5,57): error CS8652: The feature 'pattern matching ReadOnly/Span<char> on constant string' is currently in Preview and *unsupported*. To use Preview features, use the 'preview' language version. | ||
| // static bool M(ReadOnlySpan<char> chars) => chars is ""; | ||
| Diagnostic(ErrorCode.ERR_FeatureInPreview, @"""""").WithArguments("pattern matching ReadOnly/Span<char> on constant string").WithLocation(5, 57)); | ||
| } |
| class C | ||
| { | ||
| static void Main() | ||
| { |
| /// This has a different diagnostic to <see cref="SwitchReadOnlySpanCharOnNull"/> because there is a valid implicit conversion from null to Span via Array, | ||
| /// but not to the ReadOnlySpan defined in <see cref="TestSources.Span"/> which has two implicit operators from reference types to ReadOnlySpan (for string and array). | ||
| /// When using the BCL, the diagnostic will be this one for both. | ||
| /// </summary> |
There was a problem hiding this comment.
We should test with the BCL types rather than TestSources, to avoid surprises (due to different implicit conversions, etc.) when compiling projects that are targeting the BCL.
Please consider changing the new tests in this file that depend on definitions of System.Span<T> and System.MemoryExtensions to compile against CoreCLR, with [ConditionalFact].
[ConditionalFact(typeof(CoreClrOnly))]
public void SwitchSpanCharOnNull()
{
var source = "...";
CreateCompilation(
source,
targetFramework: TargetFramework.NetCoreApp,
parseOptions: TestOptions.RegularPreview);
// ...
}
``` #Pending| } | ||
| }"; | ||
|
|
||
| public const string MemoryExtensions = @" |
|
@YairHalberstadt, sorry for the delay reviewing this PR. We've completed a couple of passes through the latest commit and expect to be able to review any subsequent commits promptly. For planning purposes, do you have a timeframe in mind when you expect to address the remaining feedback? If it might be more than a few days, we could merge this PR into a feature branch and assign someone else to address remaining items in separate PRs to that branch. Let us know how you'd like to proceed, thanks. |
|
@cston thanks for the review I don't think I'll be able to address the feedback in the next couple of weeks unfortunately. |
|
We'll address the remaining comments in a separate PR to the feature branch. |
cston
left a comment
There was a problem hiding this comment.
LGTM, assuming remaining issues are addressed in a follow-up PR.
|
Thanks @YairHalberstadt! |
Switching ReadOnlySpan on a constant string is championed at dotnet/csharplang#1881 and marked as AnyTime.
It seems inconsistent to be able to switch on ReadOnlySpan but not use it as a pattern, so I've added support for that too.
The approach is to generate a call to
MemoryExtensions.SequenceEqual(span, pattern.AsSpan()).This delegates to the same underlying method as
string.Equalsdoes on .Net Core so should be very efficient.For large switches, we use the same optimization as string, namely generate a hash function to switch on before doing the equality check. If required members for the hash function aren't there we fall back to a series of if else equality checks.
I've also added support for warning on e.g.
span is "foo" and "bar".Open Questions
Span<char>on a constant string?A) Yes
Relates to test plan: #59191