`<regex>`: Some escape sequences are mishandled #5244

The `<regex>` parser mistakenly accepts or miscompiles several escape sequences.

## ECMAScript
* Backreferences with leading zero digits (e.g., `\01` for capture group 1) should be rejected. [ECMA-262 3rd ed., Section 15.10.2.11 "DecimalEscape"]
* `\00` and longer runs of zero digits should be rejected rather than interpreted as an escape for NUL. Only `\0` is a valid escape sequence for NUL. [ECMA-262 3rd ed., Section 15.10.2.11 "DecimalEscape"]
* When a custom traits implementation defines a new character class "z", `[\z]` matches the characters in this class and not the character z. (Meanwhile, `\z` without brackets matches the character z and not the characters in the class "z".) [ECMA-262 3rd ed., Sections 15.10.1 "Patterns" and 15.10.2.12 "CharacterClassEscape"]
* `[\b]` should match U+0008 BACKSPACE, not b. [ECMA-262 3rd ed., Section 15.10.2.19 "ClassEscape"]
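
A minimal sketch for probing these cases against whatever standard library is being tested. The names `Outcome`, `probe`, `describe`, `check_baseline`, and `report_ecmascript_cases` are hypothetical helpers, not part of `<regex>`. The trailing comments record the outcome requested above, not what any particular implementation prints. The custom-traits case is left out because it needs a user-defined traits class.

```cpp
#include <cassert>
#include <iostream>
#include <regex>
#include <string>

// Outcome of compiling a pattern and matching it against an entire subject.
enum class Outcome { Rejected, Match, NoMatch };

// Compile `pattern` with the ECMAScript grammar and match it against all of
// `subject`. Any std::regex_error is reported as Outcome::Rejected.
Outcome probe(const std::string& pattern, const std::string& subject) {
    try {
        const std::regex re(pattern, std::regex_constants::ECMAScript);
        return std::regex_match(subject, re) ? Outcome::Match : Outcome::NoMatch;
    } catch (const std::regex_error&) {
        return Outcome::Rejected;
    }
}

const char* describe(Outcome o) {
    switch (o) {
    case Outcome::Rejected: return "rejected";
    case Outcome::Match:    return "match";
    case Outcome::NoMatch:  return "no match";
    }
    return "unknown";
}

// Baseline checks that do not depend on any of the behaviors in question.
void check_baseline() {
    assert(probe("(a)\\1", "aa") == Outcome::Match);  // ordinary backreference
    assert(probe("[a", "a") == Outcome::Rejected);    // unterminated bracket expression
}

// Print what the implementation under test does for the cases listed above.
void report_ecmascript_cases(std::ostream& out) {
    const std::string nul(1, '\0');
    out << "(a)\\01 vs \"aa\": " << describe(probe("(a)\\01", "aa")) << '\n';  // wanted: rejected
    out << "\\00 vs NUL: " << describe(probe("\\00", nul)) << '\n';            // wanted: rejected
    out << "\\0 vs NUL: " << describe(probe("\\0", nul)) << '\n';              // wanted: match
    out << "[\\b] vs U+0008: " << describe(probe("[\\b]", "\b")) << '\n';      // wanted: match
    out << "[\\b] vs \"b\": " << describe(probe("[\\b]", "b")) << '\n';        // wanted: no match
}
```

Calling `check_baseline()` and then `report_ecmascript_cases(std::cout)` from `main` prints one line per case, so deviations from the requested behavior can be read off directly.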

## awk

See [Section "Regular expressions"](https://pubs.opengroup.org/onlinepubs/9699919799/utilities/awk.html#tag_20_06_13_04) in the awk specification.

* Octal escape sequences are not parsed correctly in square-bracket character class definitions. (E.g., `[\040]` should match U+0020 SPACE.)
* Similarly, `[\"]` and `[\/]` also match a backslash, even though they should match only `"` and `/`, respectively.
* While the awk specification says that using unspecified escape sequences results in undefined behavior, I think we should reject them. (I believe we should handle this differently from ECMAScript mode, where unrecognized escape sequences just yield the escaped character.)
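
The awk expectations above can be written down as a small table and checked mechanically. This is only a sketch: `AwkCase`, `is_rejected`, `behaves_as_wanted`, `issue_cases`, `check_baseline`, and `report_awk_cases` are hypothetical names, and `\q` is an arbitrary stand-in for an escape sequence that the awk specification leaves unspecified.

```cpp
#include <cassert>
#include <iostream>
#include <regex>
#include <string>
#include <vector>

// One expectation: should `pattern` (awk grammar) match all of `subject`?
struct AwkCase {
    std::string pattern;
    std::string subject;
    bool should_match;
};

// True if the pattern fails to compile with the awk grammar.
bool is_rejected(const std::string& pattern) {
    try {
        (void)std::regex(pattern, std::regex_constants::awk);
        return false;
    } catch (const std::regex_error&) {
        return true;
    }
}

// True if the pattern compiles and its match result equals the expectation.
bool behaves_as_wanted(const AwkCase& c) {
    try {
        const std::regex re(c.pattern, std::regex_constants::awk);
        return std::regex_match(c.subject, re) == c.should_match;
    } catch (const std::regex_error&) {
        return false;  // none of the tabulated patterns should be rejected
    }
}

// The bracket-expression cases from the list above.
std::vector<AwkCase> issue_cases() {
    return {
        {"[\\040]", " ", true},    // octal escape for U+0020 SPACE
        {"[\\\"]", "\"", true},    // [\"] matches a double quote ...
        {"[\\\"]", "\\", false},   // ... but not a backslash
        {"[\\/]", "/", true},      // [\/] matches a slash ...
        {"[\\/]", "\\", false},    // ... but not a backslash
    };
}

// Baseline checks that do not depend on any of the behaviors in question.
void check_baseline() {
    assert(!is_rejected("a+"));
    assert(is_rejected("[a"));  // unterminated bracket expression
}

void report_awk_cases(std::ostream& out) {
    for (const AwkCase& c : issue_cases()) {
        out << c.pattern << ": " << (behaves_as_wanted(c) ? "ok" : "deviates") << '\n';
    }
    // Proposed above: unspecified escapes should be rejected in awk mode.
    out << "\\q rejected: " << (is_rejected("\\q") ? "yes" : "no") << '\n';
}
```

Keeping the expectations in a table makes it straightforward to add further escape sequences from the awk specification without touching the checking logic.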
