Regex auto-atomicity too aggressive about \w and \b #74686

@mathy-plutoflume

Description

System.Text.RegularExpressions.Regex.Matches does not return the expected matches; it returns no matches at all.

Reproduction Steps

var stringToMatch = "sydney bogota berlin tokyo nairobi denver rio";
Console.WriteLine($"Input: {stringToMatch}");

GetMatches(@"(\b(?!bogo|nai)\w*\b)\w+", stringToMatch);
GetMatches(@"\w*\b\w+", stringToMatch);

void GetMatches(string expression, string stringToMatch)
{
    var reg = new System.Text.RegularExpressions.Regex(expression);

    Console.WriteLine("\nOutput:");
    foreach (var match in reg.Matches(stringToMatch))
    {
        Console.WriteLine(match);
    }
}

<Project Sdk="Microsoft.NET.Sdk">
  <PropertyGroup>
    <OutputType>Exe</OutputType>
    <TargetFrameworks>net6.0</TargetFrameworks>
    <ImplicitUsings>enable</ImplicitUsings>
    <LangVersion>latest</LangVersion>
  </PropertyGroup>
</Project>

Expected behavior

Input: sydney bogota berlin tokyo nairobi denver rio

Output:
sydney
berlin
tokyo
denver
rio

Output:
sydney
bogota
berlin
tokyo
nairobi
denver
rio

Example 1: https://regex101.com/r/Xn7X3w/2
Example 2: https://regex101.com/r/Q10441/1

Actual behavior

Input: sydney bogota berlin tokyo nairobi denver rio

Output:

Output:

Regression?

When the same code targets netstandard2.0 or net472, it returns the correct matches, whereas .NET 5 and .NET 6.0 return no matches at all.
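One way to compare the two runtimes side by side is to multi-target the repro project. The fragment below is only a sketch: the choice of net472 plus net6.0 and the removal of the System.Net.Http implicit using for net472 are assumptions and were not part of the original repro. Building for net472 is also assumed to require the .NET Framework targeting pack.

```xml
<Project Sdk="Microsoft.NET.Sdk">
  <PropertyGroup>
    <OutputType>Exe</OutputType>
    <!-- Assumption: build the same source for both runtimes being compared -->
    <TargetFrameworks>net472;net6.0</TargetFrameworks>
    <ImplicitUsings>enable</ImplicitUsings>
    <LangVersion>latest</LangVersion>
  </PropertyGroup>
  <!-- Assumption: the implicit System.Net.Http using does not resolve on net472 by default -->
  <ItemGroup Condition="'$(TargetFramework)' == 'net472'">
    <Using Remove="System.Net.Http" />
  </ItemGroup>
</Project>
```

With a project like this, running `dotnet run --framework net472` and then `dotnet run --framework net6.0` should show the two outputs for the same input.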

Known Workarounds

None

Configuration

  • .NET 6.0
  • Windows 11
  • x64

Other information

Other community members on Stack Overflow were also able to reproduce this issue: https://stackoverflow.com/questions/73498919/regex-match-differs-between-net-versions
