@muellerj2
Contributor

This removes the capturing group validity vectors in stack frames, which were used to snapshot the validity/matched status of capturing groups whenever a stack frame was pushed. This status is now restored by processing new or modified opcodes during unwinding, plus some changed logic for the case where the pattern of a lookahead assertion matched:

  • The opcode _Capture_restore_end is split into _Capture_restore_matched_end and _Capture_restore_unmatched_end. Which of the two is pushed depends on whether the capturing group is already matched when the _N_end_capture node is processed. _Capture_restore_matched_end keeps the capturing group matched (so it leaves the matched status untouched), while _Capture_restore_unmatched_end resets it to unmatched.
  • The reset of the status to unmatched at the start of a loop in ECMAScript is now performed by _Matcher3::_Reset_capture_groups(), replacing the prior calls to std::fill(). This function pushes a new stack frame with opcode _Capture_restore_matched for every capture group whose status is changed to unmatched.
    • The new test cases fail if the status is reset to unmatched but the reset is not undone on backtracking.
  • For successful positive lookahead assertions, we keep the stack frames with opcodes _Capture_restore_unmatched_end and any with opcode _Capture_restore_begin before them on the stack. We don't have to keep those with opcode _Capture_restore_matched_end because ECMAScript rules guarantee that the capturing groups inside the lookahead assertion are unmatched when processing the lookahead assertion starts, so the stack frame pushed for the first modification of a capturing group's end pointer in this lookahead assertion must have opcode _Capture_restore_unmatched_end.
  • For a failed negative lookahead assertion (i.e., one whose asserted pattern matched), we have to reset all the capturing groups inside it to unmatched status. When going through the stack frames pushed by such an assertion, the code now tracks the first and last matched capturing group inside this assertion and then calls std::fill() to reset their status to unmatched. (We don't have to worry about restoring the begin and end pointers of the capturing groups because the capturing groups are always unmatched when leaving a negative lookahead assertion, so the pointers are meaningless.)
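
The bullets above amount to an undo log: each change to a capturing group pushes a small frame, and unwinding replays those frames in reverse. The following is only a minimal sketch of that idea under simplifying assumptions, not the STL's implementation. All names (`Op`, `Frame`, `Matcher`, `discard_negative_lookahead`, and so on) are hypothetical, positions are plain `int`s instead of iterators, and the way the negative-lookahead helper identifies matched groups (by looking at `restore_unmatched_end` frames) is an assumption.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical, simplified model. None of these names are the STL's internals.
enum class Op { restore_begin, restore_matched_end, restore_unmatched_end, restore_matched };

struct Frame { // small and trivially copyable: no per-frame validity vector
    Op code;
    std::size_t capture_idx;
    int saved_pos; // previous begin/end position (unused for restore_matched)
};

struct Matcher {
    std::vector<int> begin_pos, end_pos;
    std::vector<char> matched; // one flag per capturing group
    std::vector<Frame> stack;

    explicit Matcher(std::size_t n) : begin_pos(n), end_pos(n), matched(n, 0) {}

    void begin_capture(std::size_t idx, int pos) {
        stack.push_back({Op::restore_begin, idx, begin_pos[idx]});
        begin_pos[idx] = pos;
    }

    // The opcode records whether the group was matched before this modification.
    void end_capture(std::size_t idx, int pos) {
        const Op code = matched[idx] ? Op::restore_matched_end : Op::restore_unmatched_end;
        stack.push_back({code, idx, end_pos[idx]});
        end_pos[idx] = pos;
        matched[idx] = 1;
    }

    // Loop start (ECMAScript): reset groups [first, last) and log each change.
    void reset_capture_groups(std::size_t first, std::size_t last) {
        for (std::size_t idx = first; idx < last; ++idx) {
            if (matched[idx]) {
                stack.push_back({Op::restore_matched, idx, 0});
                matched[idx] = 0;
            }
        }
    }

    // Backtracking: pop frames down to `depth`, undoing each recorded change.
    void unwind_to(std::size_t depth) {
        assert(depth <= stack.size());
        while (stack.size() > depth) {
            const Frame f = stack.back();
            stack.pop_back();
            switch (f.code) {
            case Op::restore_begin: begin_pos[f.capture_idx] = f.saved_pos; break;
            case Op::restore_matched_end: end_pos[f.capture_idx] = f.saved_pos; break;
            case Op::restore_unmatched_end:
                end_pos[f.capture_idx] = f.saved_pos;
                matched[f.capture_idx] = 0;
                break;
            case Op::restore_matched: matched[f.capture_idx] = 1; break;
            }
        }
    }

    // Failed negative lookahead: drop its frames, then reset the index range of
    // groups that became matched inside it. Pointers are left as they are.
    void discard_negative_lookahead(std::size_t depth) {
        std::size_t first = matched.size();
        std::size_t last = 0;
        while (stack.size() > depth) {
            const Frame f = stack.back();
            stack.pop_back();
            if (f.code == Op::restore_unmatched_end) {
                first = std::min(first, f.capture_idx);
                last = std::max(last, f.capture_idx + 1);
            }
        }
        if (first < last) {
            std::fill(matched.begin() + static_cast<std::ptrdiff_t>(first),
                      matched.begin() + static_cast<std::ptrdiff_t>(last), char{0});
        }
    }
};

// Returns true if backtracking restores the state seen before a loop iteration.
bool backtracking_demo() {
    Matcher m(1);
    m.begin_capture(0, 0);
    m.end_capture(0, 1); // group 0 matched [0, 1)
    const std::size_t mark = m.stack.size();
    m.reset_capture_groups(0, 1); // next loop iteration starts: group is unmatched
    m.begin_capture(0, 1);
    m.end_capture(0, 2); // group 0 rematched as [1, 2)
    m.unwind_to(mark);   // the iteration fails, so backtrack
    return m.matched[0] && m.begin_pos[0] == 0 && m.end_pos[0] == 1;
}
```

In `backtracking_demo()`, leaving out the `restore_matched` frame in `reset_capture_groups()` would leave group 0 unmatched after unwinding, which is the kind of failure the new test cases are described as catching.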

With this PR, the worst-case number of allocations is logarithmic in the size of the input (pattern + searched string) and no longer linear. But even for patterns like "a*", where the capture extent and validity vectors in the stack frames did not actually allocate, we still see a major performance improvement, probably because the overhead of managing these vectors is gone.
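
One plausible reading of the logarithmic bound (an assumption, not something stated above) is that, once frames own no heap memory, the only remaining allocations come from the geometric growth of the vector that holds the frames. The sketch below counts such reallocations for a hypothetical `SmallFrame` type. The growth factor is implementation-defined, so the exact count varies between standard libraries.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a stack frame that owns no heap memory.
struct SmallFrame {
    unsigned code;
    unsigned capture_idx;
    const char* pos;
};

// Counts how often the frame vector changes capacity while `n` frames are pushed.
std::size_t count_reallocations(std::size_t n) {
    std::vector<SmallFrame> stack;
    std::size_t reallocations = 0;
    std::size_t capacity = stack.capacity();
    for (std::size_t i = 0; i < n; ++i) {
        stack.push_back(SmallFrame{0u, 0u, nullptr});
        if (stack.capacity() != capacity) {
            capacity = stack.capacity();
            ++reallocations;
        }
    }
    return reallocations;
}
```

With the growth factors used by common implementations (1.5 or 2), pushing 100000 frames triggers a few dozen reallocations at most, whereas a frame that owned its own vector could allocate once per push.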

This change also makes the structure _Rx_state_frame trivially copyable and trivially destructible iff the unwrapped iterator type is trivially copyable and trivially destructible, respectively, which usually simplifies destruction of the stack frame vector.
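
The effect on triviality can be checked with type traits. This is a sketch with hypothetical frame layouts (`FrameWithoutVectors`, `FrameWithValidityVector`). The real _Rx_state_frame has different members, and `const char*` merely stands in for an unwrapped iterator type.

```cpp
#include <type_traits>
#include <vector>

// Hypothetical frame layouts; the real _Rx_state_frame has different members.
template <class It>
struct FrameWithoutVectors {
    It pos;
    unsigned code;
    unsigned capture_idx;
};

template <class It>
struct FrameWithValidityVector {
    It pos;
    std::vector<bool> capture_valid; // owning member forces a non-trivial destructor
};

static_assert(std::is_trivially_copyable<FrameWithoutVectors<const char*>>::value,
              "frame is trivially copyable when the iterator is");
static_assert(std::is_trivially_destructible<FrameWithoutVectors<const char*>>::value,
              "frame is trivially destructible when the iterator is");
static_assert(!std::is_trivially_destructible<FrameWithValidityVector<const char*>>::value,
              "a frame that owns a vector needs a real destructor call");
```

For a vector of trivially destructible frames, the container does not need to run a destructor per element when it is cleared or destroyed.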

Drive-by change: Since the stack frames have a new member _Capture_idx, this member is now used to store the relevant index of the capturing group for all _Capture opcodes, so they no longer have to access the contents of the _Node_capture NFA node.

Benchmark

benchmark                                   before [ns]  after [ns]  speedup
bm_match_sequence_of_as/"a*"/100                4324.78     2148.44     2.01
bm_match_sequence_of_as/"a*"/200                6975.45     3379.61     2.06
bm_match_sequence_of_as/"a*"/400                21972.4     5580.36     3.94
bm_match_sequence_of_as/"a*?"/100               2887.83     1967.08     1.47
bm_match_sequence_of_as/"a*?"/200                5312.5     3717.91     1.43
bm_match_sequence_of_as/"a*?"/400               10009.8     6835.94     1.46
bm_match_sequence_of_as/"(?:a)*"/100             4973.5     2622.77     1.90
bm_match_sequence_of_as/"(?:a)*"/200             7498.6     4237.58     1.77
bm_match_sequence_of_as/"(?:a)*"/400            23123.3     7952.01     2.91
bm_match_sequence_of_as/"(a)*"/100              42481.2     3989.95    10.65
bm_match_sequence_of_as/"(a)*"/200              69754.5     6835.94    10.20
bm_match_sequence_of_as/"(a)*"/400               125552     32994.1     3.81
bm_match_sequence_of_as/"(?:b|a)*"/100          6975.45     3923.69     1.78
bm_match_sequence_of_as/"(?:b|a)*"/200          12276.8     7149.83     1.72
bm_match_sequence_of_as/"(?:b|a)*"/400          25111.3     13183.5     1.90
bm_match_sequence_of_as/"(b|a)*"/100            35156.8     6417.41     5.48
bm_match_sequence_of_as/"(b|a)*"/200            71498.3     16043.5     4.46
bm_match_sequence_of_as/"(b|a)*"/400             141246     53013.4     2.66
bm_match_sequence_of_as/"(a)(?:b|a)*"/100       13671.9     4464.29     3.06
bm_match_sequence_of_as/"(a)(?:b|a)*"/200       23437.5     7672.99     3.05
bm_match_sequence_of_as/"(a)(?:b|a)*"/400       53013.4     14125.2     3.75
bm_match_sequence_of_as/"(a)(b|a)*"/100         30482.6     6406.25     4.76
bm_match_sequence_of_as/"(a)(b|a)*"/200           62779     14648.4     4.29
bm_match_sequence_of_as/"(a)(b|a)*"/400          128348     53013.4     2.42
bm_match_sequence_of_as/"(a)(?:b|a)*c"/100        14753     5161.83     2.86
bm_match_sequence_of_as/"(a)(?:b|a)*c"/200      26785.7     10253.9     2.61
bm_match_sequence_of_as/"(a)(?:b|a)*c"/400      54687.5     18415.3     2.97

Improvement beginning with #5865 (capture extent vector removal):

benchmark                                   before [ns]  after [ns]  speedup
bm_match_sequence_of_as/"a*"/100                5859.38     2148.44     2.73
bm_match_sequence_of_as/"a*"/200                10009.8     3379.61     2.96
bm_match_sequence_of_as/"a*"/400                  34379     5580.36     6.16
bm_match_sequence_of_as/"a*?"/100               3609.79     1967.08     1.84
bm_match_sequence_of_as/"a*?"/200               6696.43     3717.91     1.80
bm_match_sequence_of_as/"a*?"/400               13183.5     6835.94     1.93
bm_match_sequence_of_as/"(?:a)*"/100             6277.9     2622.77     2.39
bm_match_sequence_of_as/"(?:a)*"/200              10498     4237.58     2.48
bm_match_sequence_of_as/"(?:a)*"/400            36272.3     7952.01     4.56
bm_match_sequence_of_as/"(a)*"/100              20089.5     3989.95     5.04
bm_match_sequence_of_as/"(a)*"/200                38505     6835.94     5.63
bm_match_sequence_of_as/"(a)*"/400               102539     32994.1     3.11
bm_match_sequence_of_as/"(?:b|a)*"/100          9277.34     3923.69     2.36
bm_match_sequence_of_as/"(?:b|a)*"/200          17159.8     7149.83     2.40
bm_match_sequence_of_as/"(?:b|a)*"/400          39236.9     13183.5     2.98
bm_match_sequence_of_as/"(b|a)*"/100            24588.2     6417.41     3.83
bm_match_sequence_of_as/"(b|a)*"/200            43945.3     16043.5     2.74
bm_match_sequence_of_as/"(b|a)*"/400            97656.2     53013.4     1.84
bm_match_sequence_of_as/"(a)(?:b|a)*"/100       22949.2     4464.29     5.14
bm_match_sequence_of_as/"(a)(?:b|a)*"/200       41712.6     7672.99     5.44
bm_match_sequence_of_as/"(a)(?:b|a)*"/400       89979.2     14125.2     6.37
bm_match_sequence_of_as/"(a)(b|a)*"/100         22460.9     6406.25     3.51
bm_match_sequence_of_as/"(a)(b|a)*"/200         42968.8     14648.4     2.93
bm_match_sequence_of_as/"(a)(b|a)*"/400         96256.9     53013.4     1.82
bm_match_sequence_of_as/"(a)(?:b|a)*c"/100      22495.6     5161.83     4.36
bm_match_sequence_of_as/"(a)(?:b|a)*c"/200      45515.6     10253.9     4.44
bm_match_sequence_of_as/"(a)(?:b|a)*c"/400       100442     18415.3     5.45

@muellerj2 requested a review from a team as a code owner on November 29, 2025 18:45
@github-project-automation (bot) moved this to Initial Review in STL Code Reviews on Nov 29, 2025
@StephanTLavavej added the "performance" (Must go faster) and "regex" (meow is a substring of homeowner) labels on Nov 29, 2025
@StephanTLavavej self-assigned this on Nov 29, 2025
@StephanTLavavej removed their assignment on Dec 1, 2025
@StephanTLavavej moved this from Initial Review to Ready To Merge in STL Code Reviews on Dec 1, 2025
@StephanTLavavej
Member

I'm mirroring this to the MSVC-internal repo. Please notify me if any further changes are pushed.

@StephanTLavavej moved this from Ready To Merge to Merging in STL Code Reviews on Dec 1, 2025
@StephanTLavavej merged commit b09d18a into microsoft:main on Dec 1, 2025
45 checks passed
@github-project-automation (bot) moved this from Merging to Done in STL Code Reviews on Dec 1, 2025
@StephanTLavavej
Member

Thanks again, so excited to see the correctness and performance payoffs! 📈 ✅ 😻
