<regex>: Back-references to unmatched capture groups should not match in POSIX basic regular expressions #5374

@muellerj2

Describe the bug

According to Section 9.3.6 of the POSIX standard, a back-reference to an unmatched capture group ("subexpression") should not match. However, MSVC STL's implementation currently lets such a back-reference match the empty string.

Test case

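#include <iostream>
#include <regex>
#include <string>

using namespace std;

int main() {
    // POSIX basic syntax: \(.\) is capture group 1, and "*" may repeat it zero times.
    string pattern = R"(\(.\)*\1)";
    regex re{pattern, regex_constants::basic};
    // On the empty string, group 1 never participates, so \1 refers to an unmatched group.
    cout << "regex '" << pattern
         << "' matches the empty string: "
         << regex_match("", re) << '\n';
}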

https://godbolt.org/z/83nxo39qj

Expected behavior

The regular expression should not match the empty string, i.e. regex_match should return false.

Additional context

Note that the equivalent ECMAScript regex (.)*\1 does match the empty string, since the ECMAScript standard says that back-references to unmatched capture groups match the empty string.

Metadata

Labels: bug, fixed, regex