Use SearchValues for Setrep/Setloop regex interpreter opcodes by danmoseley · Pull Request #124630 · dotnet/runtime

danmoseley · 2026-02-20T07:56:17Z

Summary

Precompute SearchValues<char> for character class strings at regex construction time, and use them in the Setrep and Setloop/Setloopatomic interpreter opcode handlers to replace per-character CharInClass loops with vectorized SIMD-accelerated span operations.

Character classes that use Unicode categories, subtraction+negation, or have more than 128 characters fall back to the existing per-character path. A SetSearchValues wrapper struct encapsulates SearchValues<char> and the set's negation flag so the interpreter doesn't need to know whether the class was defined as negated.
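To make the wrapper idea concrete, here is a minimal sketch of how a struct could pair a `SearchValues<char>` with a negation flag so that callers only ask "where is the first non-member?" or "are all of these members?". The type name `SetSearchValuesSketch` and both method names are hypothetical; the PR's actual `SetSearchValues` struct may be shaped differently.

```csharp
using System;
using System.Buffers;

// Hypothetical sketch, not the PR's actual SetSearchValues type.
public readonly struct SetSearchValuesSketch
{
    private readonly SearchValues<char> _values; // the chars listed in the class
    private readonly bool _negated;              // true for classes like [^:=]

    public SetSearchValuesSketch(ReadOnlySpan<char> setChars, bool negated)
    {
        _values = SearchValues.Create(setChars);
        _negated = negated;
    }

    // Index of the first char that is NOT matched by the class, or -1 if all chars match.
    // For a negated class, a non-member is any char that IS in the listed values.
    public int IndexOfFirstNonMember(ReadOnlySpan<char> span) =>
        _negated ? span.IndexOfAny(_values) : span.IndexOfAnyExcept(_values);

    // True when every char in the span is matched by the class.
    public bool AllMembers(ReadOnlySpan<char> span) =>
        _negated ? !span.ContainsAny(_values) : !span.ContainsAnyExcept(_values);
}
```

With this shape, `new SetSearchValuesSketch(":=", negated: true).IndexOfFirstNonMember("key=value")` finds the `=` at index 3, and the caller never branches on the negation flag itself.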

This is a follow-up to #124628 which vectorized the Oneloop, Onerep, Notonerep, and MatchString opcodes.

Benchmark Results

Benchmark	Before (ns)	After (ns)	Ratio	Speedup
Setloop_AZ_64	115.12	47.04	0.41	2.4x
Setloop_AZ_256	290.71	47.36	0.16	6.1x
Setloop_Digit_256	285.67	51.36	0.18	5.6x
Setrep_AZ_64	91.59	30.67	0.33	3.0x
Setrep_AZ_256	247.85	32.99	0.13	7.5x

Benchmark code

using BenchmarkDotNet.Attributes;
using MicroBenchmarks;

namespace System.Text.RegularExpressions.Tests
{
    [BenchmarkCategory(Categories.Libraries, Categories.Regex)]
    public class Perf_Regex_Interpreter_Vectorize
    {
        // === Setloop: greedy character class loops like [a-z]+, \w+, [A-Za-z0-9]+ ===
        // These use SearchValues + IndexOfAnyExcept in the optimized path

        private Regex _setloopAZ64, _setloopAZ256, _setloopDigit256;

        [GlobalSetup(Target = nameof(Setloop_AZ_64))]
        public void Setup_Setloop_AZ_64() => _setloopAZ64 = new Regex("[a-z]+", RegexOptions.None);

        [GlobalSetup(Target = nameof(Setloop_AZ_256))]
        public void Setup_Setloop_AZ_256() => _setloopAZ256 = new Regex("[a-z]+", RegexOptions.None);

        [GlobalSetup(Target = nameof(Setloop_Digit_256))]
        public void Setup_Setloop_Digit_256() => _setloopDigit256 = new Regex("[0-9]+", RegexOptions.None);

        private const string LowerAZ64 = "abcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyzabcdefghijkl"; // 64 chars
        private const string LowerAZ256 = LowerAZ64 + LowerAZ64 + LowerAZ64 + LowerAZ64;
        private const string Digits256 = "1234567890123456789012345678901234567890123456789012345678901234" +
                                         "1234567890123456789012345678901234567890123456789012345678901234" +
                                         "1234567890123456789012345678901234567890123456789012345678901234" +
                                         "1234567890123456789012345678901234567890123456789012345678901234";

        [Benchmark]
        public Match Setloop_AZ_64() => _setloopAZ64.Match(LowerAZ64);

        [Benchmark]
        public Match Setloop_AZ_256() => _setloopAZ256.Match(LowerAZ256);

        [Benchmark]
        public Match Setloop_Digit_256() => _setloopDigit256.Match(Digits256);

        // === Setrep: fixed-count character class like [a-z]{64}, [a-z]{256} ===
        // These use SearchValues + ContainsAnyExcept in the optimized path

        private Regex _setrepAZ64, _setrepAZ256;

        [GlobalSetup(Target = nameof(Setrep_AZ_64))]
        public void Setup_Setrep_AZ_64() => _setrepAZ64 = new Regex("[a-z]{64}", RegexOptions.None);

        [GlobalSetup(Target = nameof(Setrep_AZ_256))]
        public void Setup_Setrep_AZ_256() => _setrepAZ256 = new Regex("[a-z]{256}", RegexOptions.None);

        [Benchmark]
        public bool Setrep_AZ_64() => _setrepAZ64.IsMatch(LowerAZ64);

        [Benchmark]
        public bool Setrep_AZ_256() => _setrepAZ256.IsMatch(LowerAZ256);
    }
}

Precompute SearchValues<char> for character class strings at regex construction time. Use them in the Setrep and Setloop/Setloopatomic opcode handlers to replace per-character CharInClass loops with vectorized SIMD-accelerated span operations. Character classes that use Unicode categories, subtraction+negation, or have more than 128 characters fall back to the existing per-character path. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

dotnet-policy-service · 2026-02-20T07:57:28Z

Tagging subscribers to this area: @dotnet/area-system-text-regularexpressions
See info in area-owners.md if you want to be subscribed.

Copilot

Pull request overview

This PR extends the regex interpreter’s SIMD/vectorization work by precomputing SearchValues<char> for eligible character-class strings at construction time, then using those precomputed matchers in the Setrep and Setloop/Setloopatomic opcode handlers to replace per-character CharInClass loops with span-based vectorized operations.

Changes:

Add RegexInterpreterCode.StringsSetSearchValues to precompute SearchValues<char> (plus negation) for small/enumerable character classes.
Introduce SetSearchValues helper struct to encapsulate the SearchValues<char> and negation semantics.
Update RegexInterpreter Setrep and Setloop handlers to use ContainsAnyExcept / IndexOfAnyExcept (or inverted forms for negated sets) when running left-to-right.
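The handler change described in the list above can be illustrated with a small sketch for a non-negated class. `SetOpcodeSketch` and its methods are hypothetical stand-ins, and the `Func<char, bool>` predicate only imitates a per-character `CharInClass` loop; none of this is the interpreter's real code.

```csharp
using System;
using System.Buffers;

// Hypothetical sketch of the before/after shape of the Setloop and Setrep handlers.
public static class SetOpcodeSketch
{
    // Baseline: per-character loop, standing in for repeated CharInClass calls.
    public static int CountLeadingScalar(ReadOnlySpan<char> input, Func<char, bool> inClass)
    {
        int i = 0;
        while (i < input.Length && inClass(input[i]))
        {
            i++;
        }
        return i;
    }

    // Setloop-style: length of the longest prefix made of class members,
    // found with a single vectorized IndexOfAnyExcept call.
    public static int CountLeadingVectorized(ReadOnlySpan<char> input, SearchValues<char> set)
    {
        int firstNonMember = input.IndexOfAnyExcept(set);
        return firstNonMember < 0 ? input.Length : firstNonMember;
    }

    // Setrep-style: exactly `count` chars must all be class members.
    public static bool MatchesFixedCount(ReadOnlySpan<char> input, int count, SearchValues<char> set) =>
        input.Length >= count && !input.Slice(0, count).ContainsAnyExcept(set);
}
```

Both counting methods return the same value for the same input; the difference is that the vectorized form inspects many chars per instruction, which is why the benefit grows with the length of the run.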

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.

File	Description
src/libraries/System.Text.RegularExpressions/src/System/Text/RegularExpressions/RegexInterpreterCode.cs	Precomputes `SearchValues<char>`-based matchers for eligible set strings and exposes them to the interpreter.
src/libraries/System.Text.RegularExpressions/src/System/Text/RegularExpressions/RegexInterpreter.cs	Uses precomputed set matchers to vectorize `Setrep` and `Setloop` opcode execution for left-to-right matching.

...es/System.Text.RegularExpressions/src/System/Text/RegularExpressions/RegexInterpreterCode.cs

danmoseley · 2026-02-20T08:21:54Z

Real-world impact estimate: Analyzing the 15,817 unique patterns in the regex test corpus (assuming interpreter engine):

Setloop ([a-z]+, [^:=]+, etc.): ~35% of patterns have explicit character class loops eligible for SearchValues vectorization. Actual speedup is input-length-dependent — the 2.4x–7.5x benchmark wins above reflect 64–256 char matches.
Setrep ([a-f0-9]{32}, etc.): ~2% of patterns have fixed-count character classes at 8+ chars.
Another ~35% of patterns use shorthand classes (\w+, \d+, \s+) which use Unicode categories — GetSetChars returns 0 for these so they fall back to per-character CharInClass. These could be a future optimization opportunity.

The Strings table contains both character class strings and Multi literal strings. Add validation that the string has a well-formed char-class encoding (valid flags byte, consistent lengths) before calling GetSetChars, which assumes well-formed input. Also clarify the comment about GetSetChars behavior for negated sets. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

A Multi literal starting with \0 and having an even-valued second byte and \0 at index 2 satisfies all CanEasilyEnumerateSetContents checks, causing GetSetChars to enumerate past the end of the string. This test verifies CreateSetSearchValues validates the encoding first. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Copilot

Pull request overview

Copilot reviewed 3 out of 3 changed files in this pull request and generated no new comments.

stephentoub · 2026-02-20T12:12:24Z

What is the impact on Regex construction time?

We can make the Regex interpreter much faster by not using the regex interpreter, e.g. implicitly adding .Compiled, so the main performance benefit of the interpreter is faster construction for cases where it will be rarely used.

Also, can you share the output of benchmarks for shorter runs in the input, like where a set+ will match only 1, 2, 3 characters?

stephentoub · 2026-02-20T12:23:57Z

@MihuBot benchmark Regex

MihuBot · 2026-02-20T14:01:53Z

See benchmark results at https://gist.github.com/MihuBot/e83bbd17cae6db6cc0670f22e9d19b38

danmoseley · 2026-02-20T18:47:17Z

@stephentoub Here are benchmarks addressing both concerns:

Construction cost:

Benchmark	Before (ns)	After (ns)	Ratio	Extra alloc
Ctor_NoCharClass	224	225	1.00	+56 B (5%)
Ctor_1CharClass	326	573	1.76	+88 B (6%)
Ctor_3CharClasses	768	1,457	1.90	+200 B (9%)
Ctor_5CharClasses	1,249	2,174	1.74	+304 B (10%)
Ctor_10CharClasses	2,025	3,578	1.77	+576 B (11%)
Ctor_RealWorld_Email	2,707	4,470	1.65	+240 B (4%)
Ctor_UnicodeCategories (`\w+\s+\d+`)	581	866	1.49	+104 B (6%)

SearchValues.Create adds ~200-250ns per character class. For a pattern with 10 classes, that's ~1.5µs extra construction cost.

Short match lengths ([a-z]+ matching N chars):

Match length	Before (ns)	After (ns)	Ratio
1	42.4	42.1	0.99
2	43.0	43.7	1.02
3	45.2	43.1	0.95
8	50.0	44.0	0.88
16	56.6	44.4	0.79

Short matches (1-3 chars) are neutral — no regression. The SearchValues dispatch overhead is negligible because CharInClass per iteration is already relatively expensive (ASCII table lookup + bit ops). Wins start at ~8 chars.

So the tradeoff is ~1.7x slower construction vs 2-7x faster matching for longer character class runs. For a rarely-used regex where construction dominates, this is a net cost.

Benchmark code

using BenchmarkDotNet.Attributes;
using MicroBenchmarks;
using System.Text.RegularExpressions;

[BenchmarkCategory(Categories.Libraries, Categories.Regex)]
public class Perf_Regex_Interpreter_SearchValues_Impact
{
    [Benchmark]
    public Regex Ctor_NoCharClass() => new Regex("hello world");

    [Benchmark]
    public Regex Ctor_1CharClass() => new Regex("[a-z]+");

    [Benchmark]
    public Regex Ctor_3CharClasses() => new Regex("[a-z]+[0-9]+[A-Z]+");

    [Benchmark]
    public Regex Ctor_5CharClasses() => new Regex("[a-z]+[0-9]+[A-Z]+[a-f]+[!@#]+");

    [Benchmark]
    public Regex Ctor_10CharClasses() => new Regex("[a-z]+[0-9]+[A-Z]+[a-f]+[!@#]+[g-m]+[4-8]+[N-T]+[x-z]+[,;:]+");

    [Benchmark]
    public Regex Ctor_RealWorld_Email() => new Regex(@"[a-z0-9]+(?:\.[a-z0-9]+)*@(?:[a-z0-9](?:[a-z0-9-]*[a-z0-9])?\.)+[a-z0-9](?:[a-z0-9-]*[a-z0-9])?");

    [Benchmark]
    public Regex Ctor_UnicodeCategories() => new Regex(@"\w+\s+\d+");

    private Regex _setloop;

    [GlobalSetup(Targets = new[] {
        nameof(Setloop_Match1), nameof(Setloop_Match2), nameof(Setloop_Match3),
        nameof(Setloop_Match8), nameof(Setloop_Match16) })]
    public void Setup_Setloop() => _setloop = new Regex("[a-z]+", RegexOptions.None);

    [Benchmark]
    public Match Setloop_Match1() => _setloop.Match("a.");

    [Benchmark]
    public Match Setloop_Match2() => _setloop.Match("ab.");

    [Benchmark]
    public Match Setloop_Match3() => _setloop.Match("abc.");

    [Benchmark]
    public Match Setloop_Match8() => _setloop.Match("abcdefgh.");

    [Benchmark]
    public Match Setloop_Match16() => _setloop.Match("abcdefghijklmnop.");
}

danmoseley · 2026-02-20T19:00:18Z

So maybe this comes down to whether we prefer to optimize for

quick, throwaway interpreted Regexes newed up many times (not reused) that have several char classes and run on small inputs only so construction time is significant. these will regress, though absolute size of regression is around a microsecond each.
or
interpreted Regexes that spend significant time matching char classes. these will improve with no upward limit on the improvement. pathological cases can get dramatically better.

I'm inclined to say it's a worthwhile tradeoff because it can improve performance dramatically, without limit in cases where significant time is being spent, against a very small construction impact.

In general changing from interpreted to generated (if pattern is static which is generally the case) is a no brainer improvement over interpreted. So the relevant scenario here is long standing libraries that are not being actively maintained. (I'm ignoring throwaway code where they're using interpreted just to type slightly less). Not sure how that impacts the calculus here. It is a good reason for caring about optimizing interpreter even though it's rarely the right choice to use for perf (not suggesting you're arguing otherwise)

thoughts?

danmoseley · 2026-02-20T19:01:40Z

Analysis of the MihuBot benchmark results, focusing on interpreter (Options=None):

Matching wins (Sherlock corpus, interpreter only):

Pattern	Main (ns)	PR (ns)	Ratio
`[a-q][^u-z]{13}x`	35,355	32,073	0.91
`(?s).*`	977,442	894,681	0.92
`[a-zA-Z]+ing`	9,478,381	8,917,699	0.94
`\s[a-zA-Z]{0,12}ing\s`	10,070,462	9,533,484	0.95
`\w+\s+Holmes`	7,916,515	7,599,869	0.96
`Sher[a-z]+\|Hol[a-z]+`	123,404	121,523	0.98
SliceSlice IgnoreCase	678.2 ms	628.7 ms	0.93

Neutral / no benefit (as expected — \w, \d, \s use Unicode categories, no SearchValues):

Pattern	Ratio
`\w+`	1.02
`\b\w+n\b`	0.99
`\p{L}`	1.01
`Huck[a-zA-Z]+\|Saw[a-zA-Z]+`	1.00

Construction overhead (Mariomkas, real-world patterns):

Pattern	Ctor Main (µs)	Ctor PR (µs)	Ratio
IP validator (87 chars, many classes)	5.47	5.71	1.04
URL pattern (51 chars)	2.58	2.63	1.02
Email pattern	1.45	1.46	1.01

The real-world construction overhead is only 1-4%, much smaller than the ~1.7x my micro-benchmarks showed for minimal patterns. This is because SearchValues.Create is a small fraction of total construction cost once you include parsing, tree optimization, and opcode generation.

Summary: 5-9% matching wins for explicit character class patterns on realistic inputs, 1-4% construction overhead for real-world patterns, neutral for \w/\d/\s patterns.

MihaZupan · 2026-02-20T23:15:36Z

If this goes beyond just the process startup overhead and you're creating SearchValues often, we could look into optimizing the Create more.
I did a simple test of vectorizing the min/max computation we do as the first step in #124667, and that makes Create ~2x cheaper (for a set with 64 chars). EgorBot/runtime-utils#654 (comment)
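For readers unfamiliar with the idea, a vectorized min/max scan over a span of chars might look roughly like the sketch below. This is an illustration using the portable `Vector<T>` API; it is an assumption about the general technique, not the code from #124667, and `MinMaxSketch` is a hypothetical name.

```csharp
using System;
using System.Numerics;
using System.Runtime.InteropServices;

// Hypothetical sketch of a vectorized min/max scan; not the code from the linked PR.
public static class MinMaxSketch
{
    public static (char Min, char Max) MinMax(ReadOnlySpan<char> values)
    {
        if (values.IsEmpty)
        {
            throw new ArgumentException("Span must not be empty.", nameof(values));
        }

        // char and ushort have the same size, so the span can be reinterpreted.
        ReadOnlySpan<ushort> data = MemoryMarshal.Cast<char, ushort>(values);
        ushort min = ushort.MaxValue;
        ushort max = ushort.MinValue;
        int i = 0;
        int width = Vector<ushort>.Count;

        if (Vector.IsHardwareAccelerated && data.Length >= width)
        {
            Vector<ushort> vmin = new Vector<ushort>(ushort.MaxValue);
            Vector<ushort> vmax = Vector<ushort>.Zero;

            // Process one full vector of chars per iteration.
            for (; i <= data.Length - width; i += width)
            {
                Vector<ushort> chunk = new Vector<ushort>(data.Slice(i, width));
                vmin = Vector.Min(vmin, chunk);
                vmax = Vector.Max(vmax, chunk);
            }

            // Reduce the per-lane results to scalars.
            for (int lane = 0; lane < width; lane++)
            {
                min = Math.Min(min, vmin[lane]);
                max = Math.Max(max, vmax[lane]);
            }
        }

        // Scalar tail for the remaining elements (or the whole span if it is short).
        for (; i < data.Length; i++)
        {
            min = Math.Min(min, data[i]);
            max = Math.Max(max, data[i]);
        }

        return ((char)min, (char)max);
    }
}
```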

danmoseley · 2026-02-20T23:28:11Z

@MihaZupan interesting, any reason we shouldn't take that change? seems pretty localized complexity.

otherwise, this is ready for review I think. (optionally, we could wait on the proposed SearchValues change)

MihaZupan · 2026-02-23T13:48:41Z

Just that it's more (unsafe) code. I wouldn't block any changes on that though.

danmoseley · 2026-02-23T20:43:17Z

> Just that it's more (unsafe) code. I wouldn't block any changes on that though.

true. agreed

danmoseley · 2026-03-12T19:51:09Z

I will run an experiment on the worst case: one-shot new Regex(pattern).IsMatch(input) (where the interpreter is likely common, and any construction perf regression would be most impactful), using various real-world patterns with both short and long inputs, matching and non-matching.

Then measure with BDN the overall cost of new Regex(pattern).IsMatch(input) in each case. This combines construction+search. If generally faster - this change is a win as search dominates the construction cost. If generally slower - we should not take it.
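A one-shot measurement of this kind could be set up roughly as follows. This is a sketch only: the class name, the pattern `^[0-9]*$`, and the all-digit inputs are hypothetical placeholders rather than the harness actually used, and the `MatchOnly` and `ConstructOnly` methods are additions included purely so the combined number can be split into its two parts.

```csharp
using System.Text.RegularExpressions;
using BenchmarkDotNet.Attributes;

// Hypothetical sketch of a one-shot (construct + match) benchmark.
public class OneShotSketch
{
    // Placeholder pattern with one explicit character class.
    private const string Pattern = "^[0-9]*$";

    // Input lengths loosely mirroring short, medium, and long tiers.
    [Params(5, 100, 2000)]
    public int Length { get; set; }

    private string _input = "";
    private Regex _prebuilt = null!;

    [GlobalSetup]
    public void Setup()
    {
        _input = new string('7', Length);
        _prebuilt = new Regex(Pattern, RegexOptions.None);
    }

    // Construction and matching together: the scenario under test.
    [Benchmark]
    public bool OneShot() => new Regex(Pattern, RegexOptions.None).IsMatch(_input);

    // Matching only, with construction paid once in setup.
    [Benchmark]
    public bool MatchOnly() => _prebuilt.IsMatch(_input);

    // Construction only.
    [Benchmark]
    public Regex ConstructOnly() => new Regex(Pattern, RegexOptions.None);
}
```

Running the same class against a baseline and a PR build (for example with BDN's `--coreRun` option) gives one ratio per method and input length.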

danmoseley · 2026-03-13T12:25:05Z

SearchValues One-Shot Regex Performance Report

PR: #124630 — Add SearchValues.Create() precomputation to regex interpreter
Date: 2026-03-12/13
Machine: Windows 11, Intel Core i9-14900K, X64 RyuJIT AVX2

Executive Summary

The PR adds SearchValues.Create() precomputation during regex construction for character classes
with explicit ranges. This costs extra construction time (~10-30%) but can dramatically speed up
matching (up to 95% faster) for patterns with eligible char classes on long inputs.

Key question from Stephen Toub: Does the construction overhead pay for itself in one-shot
(construct + match) scenarios?

Answer: It depends on input length and pattern type:

Long inputs (2000+ chars): YES for most eligible patterns — OneShot is 28-95% faster
Short/medium inputs (<100 chars): NO — OneShot is 5-50% slower due to construction overhead
Some patterns regress regardless — negated char classes and patterns where char-class search
isn't the bottleneck show pure overhead

Methodology

20 real-world patterns from Regex_RealWorldPatterns.json, selected by NuGet download count
Grouped by eligible char class count: A (0, control), B (1), C (2-3), D (4+)
6 input tiers per pattern: VeryShortNonMatch(5), ShortMatch(30), MediumNonMatch(100),
MediumMatch(100), LongNonMatch(2000), LongMatch(2000)
A/B via BDN --coreRun with baseline at merge-base ab11a456596 and PR at 195e844920e
DLL hashes verified before and after: baseline DF945E46... vs PR EC83F5F6... (confirmed different)
18/20 patterns completed successfully (C4, D4 failed due to input validation bugs)

Group Average OneShot Ratios (PR / Baseline)

< 1.0 = PR faster, > 1.0 = PR slower

Group	VShort NM	Short M	Med NM	Med M	Long NM	Long M	Count Long	Construct
A: 0 eligible (control)	1.02	1.04	0.99	1.03	1.01	1.05	1.02	1.03
B: 1 eligible	1.11	1.09	1.07	1.08	0.94	0.90	0.87	1.15
C: 2-3 eligible	1.07	1.04	1.07	1.02	0.96	0.88	0.90	1.10
D: 4+ eligible	1.12	1.05	0.96	1.05	0.60	0.72	0.58	1.17

Group A (control) shows ~1-4% noise, confirming no regression for patterns with 0 eligible classes.

Group D (4+ eligible) shows the clearest trend: Long-NonMatch averages 0.60 (40% faster),
Count-LongMatch averages 0.58 (42% faster).

Construction Overhead

Group	Avg Construct Ratio	Interpretation
A (control)	1.026	~2.6% noise — no meaningful overhead
B (1 eligible)	1.154	~15% overhead
C (2-3 eligible)	1.102	~10% overhead
D (4+ eligible)	1.170	~17% overhead

Absolute overhead ranges from ~1ns (A1) to +1.8us (D3 email validation).
Typical: +48-214ns for most patterns.

MatchOnly Ratios Reveal the Truth

MatchOnly benchmarks use a pre-constructed Regex, isolating the matching speedup from construction overhead.
This is the clearest signal of whether SearchValues helps a pattern.

Patterns where matching is dramatically faster (MatchOnly LongMatch ratio):

Pattern	Regex	MatchOnly Long	OneShot Long	Why
B4	`^(?<LINE>[0-9]*)$`	0.037 (27x)	0.320 (3x)	Simple `[0-9]*`; SearchValues dominates
C2	`^...[0-9]-...[0-9]$`	0.042 (24x)	0.432 (2.3x)	Two `[0-9]*` groups
D5	`^[A-Za-z0-9-_]+\....`	0.046 (22x)	0.448 (2.2x)	Large char class with `+` quantifier
D2	`^...[0-9],...[0-9],...`	0.053 (19x)	0.534 (1.9x)	Four `[0-9]*` groups
D1	complex CSV-like	0.054 (18x)	0.052 (19x)*	Multiple char classes, long scan

*D1 LongNonMatch: 1.8ms down to 94.1us — the biggest absolute win
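The tables in this report mix two ways of reading a PR/Baseline ratio: a speedup factor ("Nx") and a percentage ("N% faster"). The two are related but not interchangeable, as this small helper shows. The class and method names are hypothetical.

```csharp
using System;

// Hypothetical helper for reading PR/Baseline time ratios.
public static class RatioSketch
{
    // How many times faster the PR is: baseline time divided by PR time.
    public static double SpeedupFactor(double ratio) => 1.0 / ratio;

    // How much of the baseline time was removed, as a percentage.
    public static double PercentFaster(double ratio) => (1.0 - ratio) * 100.0;
}
```

For example, a ratio of 0.042 is a speedup of about 24x (1 / 0.042) and also about 96% faster ((1 - 0.042) * 100), while a ratio of 0.052 is about 19x and about 95% faster.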

Patterns where matching is NOT faster (MatchOnly ~1.0):

Pattern	Regex	MatchOnly Long	OneShot Long	Why
B1	`[^a-zA-Z0-9_.]`	0.995	1.166	Negated class — SearchValues doesn't help
B3	`^[.]`	1.005	0.973	Single char — trivial class
C3	`(?<lang>[a-z]{2,8})...`	0.985	1.082	Bounded quantifier `{2,8}` — few iterations
D3	email validation	0.955	1.171	Complex pattern — backtracking dominates

Key Individual Results

Biggest Wins (OneShot)

Pattern	Input Tier	Baseline	PR	Ratio	Saved
D1	LongNonMatch	1.8 ms	94.1 us	0.052	1.72 ms
D1	Count_Long	376.9 us	55.6 us	0.148	321 us
B4	LongMatch	2.6 us	833 ns	0.320	1.77 us
C2	Count_Long	3.0 us	1.3 us	0.427	1.7 us
D5	LongMatch	2.8 us	1.3 us	0.448	1.5 us
B4	LongNonMatch	1.7 us	811 ns	0.477	889 ns
D1	MediumNonMatch	9.5 us	4.7 us	0.499	4.8 us

Biggest Regressions (OneShot)

Pattern	Input Tier	Baseline	PR	Ratio	Added
B1	VeryShortNM	463.9 ns	695.4 ns	1.499	+232 ns
B1	MediumNonMatch	531.7 ns	775.7 ns	1.459	+244 ns
D3	VeryShortNM	6.5 us	8.1 us	1.244	+1.6 us
C5	MediumNonMatch	602.3 ns	718.4 ns	1.193	+116 ns

B1 stands out: [^a-zA-Z0-9_.] (negated class) has 56.5% construction overhead with zero
matching benefit. This is the worst pattern — pure regression on all input sizes.

Conclusions

1. The optimization works spectacularly for the right patterns

Patterns with unbounded quantifiers (*, +) on non-negated char classes see 90-96% faster
matching on long inputs. Even in one-shot mode (including construction overhead), these are
50-95% faster on long inputs.

2. Construction overhead is real but modest

~10-17% construction overhead for eligible patterns. In absolute terms, typically 50-200ns.
Group A (control) confirms this is specific to eligible patterns, not a general regression.

3. Short-input regression is consistent

For inputs under ~100 chars, construction overhead dominates and OneShot is ~5-15% slower.
This is the unavoidable cost of precomputation.

4. Negated classes are a concern

B1 ([^a-zA-Z0-9_.]) shows 50% regression with zero benefit. The PR should consider
excluding negated char classes from SearchValues optimization, or the overhead should be
investigated.

5. Breakeven analysis

The one-shot breakeven point depends on the pattern, but roughly:

Patterns with large char class + unbounded quantifier: breakeven at ~100-500 chars
Patterns with small/few eligible classes: breakeven at ~1000-2000 chars
Patterns with negated classes or bounded quantifiers: may never break even
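As a back-of-the-envelope check on the first bucket, the breakeven length can be estimated as the construction overhead divided by the per-character matching saving. The sketch below assumes a simple linear model and borrows numbers from the earlier micro-benchmarks in this thread (Setloop_AZ_256 and Ctor_1CharClass), which were separate runs, so the result is only indicative. All names are hypothetical.

```csharp
using System;

// Hypothetical breakeven estimate under a linear cost model.
public static class BreakevenSketch
{
    // Average matching time saved per input char, in nanoseconds.
    public static double PerCharSavingNs(double beforeNs, double afterNs, int chars) =>
        (beforeNs - afterNs) / chars;

    // Input length at which the matching saving equals the extra construction cost.
    public static double BreakevenChars(double constructionOverheadNs, double perCharSavingNs) =>
        constructionOverheadNs / perCharSavingNs;
}
```

Plugging in Setloop_AZ_256 (290.71 ns before, 47.36 ns after, 256 chars) gives a saving of roughly 0.95 ns per char. With the Ctor_1CharClass overhead of 573 - 326 = 247 ns, the breakeven lands at roughly 260 chars, which is inside the ~100-500 range quoted for a single large class with an unbounded quantifier.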

6. Overall verdict

For the interpreter one-shot path, the optimization is a net positive for real-world workloads
where inputs tend to be medium-to-long. The dramatic wins on long inputs (19x for D1!) far
outweigh the modest short-input overhead. However, negated char classes should be investigated
as a potential exclusion.

Failed Patterns

C4 (Content-Disposition=...): LongMatch input didn't actually match the regex — input data bug
D4 (IP address validation): VeryShortNonMatch input accidentally matched — input data bug

These are test harness issues, not PR issues. 18/20 patterns provide solid data.

Appendix: All OneShot Ratios

Pat	VShort NM	Short M	Med NM	Med M	Long NM	Long M	Count	Construct	Pattern
A1	1.053	1.035	1.035	1.044	1.001	1.006	1.008	1.003	`\s+`
A2	1.016	1.065	1.049	1.069	1.069	1.089	0.937	1.079	`($.*?$)`
A3	1.030	1.043	1.017	1.006	1.001	1.101	1.102	0.984	`<.*>`
A4	1.002	1.057	0.979	1.036	1.009	1.022	1.048	1.041	`\%(\d+)!.*?!`
A5	1.001	1.006	0.886	0.979	0.987	1.018	0.995	1.024	`^[^ ]*$`
B1	1.499	1.416	1.459	1.452	1.125	1.166	1.093	1.565	`[^a-zA-Z0-9_.]`
B2	1.015	1.065	1.015	1.040	1.016	1.023	0.979	1.027	`^-?([^-+/*\^\s]+)`
B3	0.943	0.979	0.980	0.962	1.010	0.973	0.930	0.992	`^[.]`
B4	1.062	0.993	0.930	0.939	0.477	0.320	0.325	1.122	`^(?<LINE>[0-9]*)$`
B5	1.035	1.007	0.956	1.025	1.054	1.006	1.011	1.066	`^\{([^\} ]\|\}\})*\}$`
C1	0.996	1.032	0.997	1.015	0.988	0.986	1.069	0.993	CLI flag parser
C2	1.031	1.006	0.971	0.907	0.596	0.432	0.427	1.039	Line range `[0-9]-[0-9]`
C3	1.096	1.078	1.115	1.099	1.091	1.082	1.063	1.155	Language tag
C5	1.155	1.051	1.193	1.046	1.145	1.027	1.054	1.223	`trackId[^0-9]([0-9])`
D1	1.011	0.983	0.499	0.999	0.052	0.724	0.148	1.041	CSV-like parser
D2	1.028	0.997	1.016	0.985	0.708	0.534	0.540	1.061	Line,col range
D3	1.244	1.177	1.180	1.159	0.984	1.171	1.192	1.284	Email validation
D5	1.186	1.054	1.142	1.049	0.642	0.448	0.449	1.292	JWT/dotted name

danmoseley · 2026-03-13T13:55:43Z

Follow-up on the one-shot benchmark results: negated class overhead

While investigating why [^a-zA-Z0-9_.] showed 56% construction overhead with zero matching benefit, I looked at the code paths.

CreateSetSearchValues() builds a SearchValues for every eligible char class string unconditionally. But on the matching side, only the bulk-scanning opcodes use it (Setloop, Setloopatomic, Setrep). The single-char Set opcode and lazy Setlazy skip it entirely — reasonably, since vectorized search over 1 char isn't useful.

So for patterns like [^a-zA-Z0-9_.] (no quantifier → Set opcode), the SearchValues.Create() cost is pure overhead. This isn't specific to negation — it affects any unquantified or lazy-quantified char class.

A possible fix: only create SearchValues for class strings that are actually referenced by Setloop/Setrep/Setloopatomic opcodes.
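One way that fix could look is sketched below, using a deliberately simplified model of the interpreter's code: the `Op` enum, the `Instr` record, and a strings table that already holds the enumerated set characters are all hypothetical stand-ins for the real opcode array and encoded class strings.

```csharp
#nullable enable
using System;
using System.Buffers;
using System.Collections.Generic;

// Hypothetical, simplified opcode model; the real interpreter encodes opcodes as ints.
public enum Op { Set, Setlazy, Setloop, Setloopatomic, Setrep, Other }

// One instruction referencing an entry in the strings table.
public readonly record struct Instr(Op Code, int StringIndex);

public static class SelectiveSetSearchValuesSketch
{
    // Creates SearchValues only for strings referenced by bulk-scanning opcodes.
    // Entries used solely by Set or Setlazy stay null and cost nothing to construct.
    public static SearchValues<char>?[] Build(IReadOnlyList<Instr> code, IReadOnlyList<string> setChars)
    {
        var result = new SearchValues<char>?[setChars.Count];

        foreach (Instr instr in code)
        {
            bool isBulkOpcode = instr.Code is Op.Setloop or Op.Setloopatomic or Op.Setrep;
            if (isBulkOpcode && result[instr.StringIndex] is null)
            {
                result[instr.StringIndex] = SearchValues.Create(setChars[instr.StringIndex]);
            }
        }

        return result;
    }
}
```

Under this model a pattern like `[^a-zA-Z0-9_.]` with no quantifier would produce only a `Set` instruction, so no `SearchValues` would be built for it.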

danmoseley · 2026-03-13T13:56:25Z

Overall construction overhead is a net loss for short inputs across the board, and even for eligible patterns the benefit only kicks in on longer inputs.

Closing

Copilot AI review requested due to automatic review settings February 20, 2026 07:56

github-actions bot added the area-System.Text.RegularExpressions label Feb 20, 2026

dotnet-policy-service bot assigned danmoseley Feb 20, 2026

Merge branch 'main' into interpreter-searchvalues

47f7e74


danmoseley mentioned this pull request Feb 20, 2026

Vectorize RegexInterpreter opcode loops for Oneloop, Onerep, Notonerep, and MatchString #124628

Open

danmoseley and others added 2 commits February 20, 2026 01:28

Copilot AI review requested due to automatic review settings February 20, 2026 08:41


build-analysis bot mentioned this pull request Feb 20, 2026

[android][clr] No peer certificates when executing System.Net.Http.Functional.Tests on Android emulator #124526

Open

MihuBot mentioned this pull request Feb 20, 2026

[Benchmark X64] [danmoseley] Use SearchValues for Setrep/Setloop regex inter ... MihuBot/runtime-utils#1774

Open

danmoseley closed this Mar 13, 2026
