Performance optimizations for EscapingUtilities#13426
Conversation
- Convert Fact-based tests to Theory with InlineData - Replace xUnit assertions with Shouldly - Remove empty XML doc comments and #nullable disable - Use expression-bodied test methods
Cover UnescapeAll, Escape, EscapeWithCaching, ContainsEscapedWildcards, and round-trip scenarios with MemoryDiagnoser to establish a performance baseline before optimization.
Many thanks to @JeremyKuhne for providing this BufferScope<T> implementation.
- Make public-facing methods `public` (was `internal`) - Enable nullable reference types; add `[return: NotNullIfNotNull]` to `UnescapeAll` and `Escape` - Use file-scoped namespace - Merge `Escape` + `EscapeWithCaching` into `Escape(string? value, bool cache = false)` - Inline `AppendEscapedChar` and `AppendEscapedString` helpers into `Escape` - Rename field `s_unescapedToEscapedStrings` → `s_escapedStringCache` - Rename parameters to `value` consistently across all public methods - Rename local variables for clarity (`percentIndex`, `startIndex`, `endIndex`, `hi`/`lo`, `specialCharIndex`, etc.) - Use collection expression for `s_charsToEscape`; target-typed `new` for cache dictionary - Use expression bodies for simple members (`HexDigitChar`, `Escape`, etc.) - Use relational pattern matching in `TryDecodeHexDigit`; list patterns in `ContainsEscapedWildcards` - Use `do...while` in `UnescapeAll` since `percentIndex` is already known non-negative on entry - Replace `ch / 0x10` with `ch >> 4` in nibble extraction - Clean up XML doc comments throughout - Remove stale and redundant inline comments
- Address unnecessary allocation when a string contains a '%' but no valid escape sequence. - Add test for edge case when a string contains a '%' but no valid escape sequence and 'trim' is set to true. In that case, the trimmed string should still be returned.
- Add System.Buffers.SearchValues<char> field initialized from s_charsToEscape (#if NET) - Extract IndexOfAnyEscapeChar helper to abstract the platform difference, keeping Escape itself free of #if guards - Fall back to char[] / IndexOfAny on .NET Framework
- All chars in s_charsToEscape fall within the ASCII range ['$' (0x24) .. '@' (0x40)] - Encode membership as a 29-bit uint bitmask indexed by (c - '$') - Replace O(n×k) IndexOfAny array scan with an O(n) range check + bit test per char - Eliminates managed-to-native transition overhead on each IndexOfAnyEscapeChar call - No change on .NET (SearchValues path unchanged)
- Capture the result of the initial IndexOfAnyEscapeChar(value) fast-path check rather than discarding it and repeating the same scan at the start of the loop - Switch from while(true)/break to do...while since specialCharIndex >= 0 is established before entering the loop, eliminating a redundant branch per iteration
Collect all special char positions in a first pass using RefArrayBuilder<int>, compute the exact output length, then write directly into a freshly-allocated string with no intermediate buffer:
- On .NET: string.Create writes directly into the output string without an intermediate buffer
- On .NET Framework: new string('\0', length) + Buffer.MemoryCopy via unsafe fixed pointers for fast, native-speed chunk copies
- Extract cache operations into TryGetFromCache and AddToCache helpers
- Remove StringBuilderCache dependency from Escape entirely
- Compute trim bounds (startIndex/endIndex) before searching for '%' so the initial scan and inner-loop scan skip leading/trailing whitespace - Scope the inner-loop IndexOf to [percentIndex+1, endIndex) to match - Extract GetDefaultResult static local to deduplicate the no-escape- sequences return path (used in both the early-exit and sb-is-null cases)
- Add HexConverter with a 256-entry hex digit table (ReadOnlySpan<byte> on .NET; static byte[] on Framework — the Framework JIT can't elide bounds checks on ReadOnlySpan<byte> like the .NET Core JIT can) - TryDecodeHexDigit delegates to HexConverter.FromChar
e06a889 to
2ff888d
Compare
There was a problem hiding this comment.
Pull request overview
This pull request refactors Microsoft.Build.Shared.EscapingUtilities hot paths (Escape, UnescapeAll) and adds supporting low-allocation infrastructure (pooled buffer + ref-struct array builder) plus benchmarks and expanded unit tests to validate the new behavior and performance characteristics.
Changes:
- Rewrite
EscapingUtilities.EscapeandEscapingUtilities.UnescapeAllto reduce allocations and improve throughput (including new fast paths and cache-aware escape). - Add new low-allocation helper types (
BufferScope<T>,RefArrayBuilder<T>,HexConverter,TypeInfo<T>) plus polyfills needed for newer language/runtime features across TFMs. - Add/expand benchmarks and unit tests for escaping/unescaping and the new infrastructure.
Reviewed changes
Copilot reviewed 16 out of 16 changed files in this pull request and generated 1 comment.
Show a summary per file
| File | Description |
|---|---|
| src/Framework/EscapingUtilities.cs | Rewrites escape/unescape implementations for fewer allocations and faster scanning/encoding. |
| src/Framework/Utilities/BufferScope.cs | Adds pooled-buffer ref struct to support temporary storage with minimal allocations. |
| src/Framework/Collections/RefArrayBuilder.cs | Adds ref-struct builder over BufferScope<T> for efficient index collection / array building. |
| src/Framework/Utilities/HexConverter.cs | Adds lookup-table hex decoding helper used by unescape fast path. |
| src/Framework/Utilities/TypeInfo.cs | Adds type inspection helper used to decide whether to clear pooled arrays on return. |
| src/Framework/Polyfills/UnscopedRefAttribute.cs | Adds polyfill / forwarder for UnscopedRefAttribute across target frameworks. |
| src/Framework/Polyfills/StringExtensions.cs | Adds string null/empty/whitespace extension helpers used by updated code paths. |
| src/MSBuild.Benchmarks/EscapingUtilitiesBenchmark.cs | Adds BenchmarkDotNet coverage for Escape, UnescapeAll, and related helpers. |
| src/Framework.UnitTests/EscapingUtilities_Tests.cs | Expands and modernizes escaping/unescaping tests (Shouldly + theories). |
| src/Framework.UnitTests/BufferScopeTests.cs | Adds test coverage for pooled buffer behavior, growth, pinning, and disposal. |
| src/Framework.UnitTests/RefArrayBuilder_Tests.cs | Adds extensive coverage for builder operations (add/insert/remove/range/etc.). |
| src/Framework.UnitTests/TypeInfoTests.cs | Adds coverage for TypeInfo<T>.IsReferenceOrContainsReferences() behavior and caching. |
| src/Utilities/TaskItem.cs | Updates call site to use Escape(..., cache: true) instead of EscapeWithCaching. |
| src/Shared/TaskParameter.cs | Updates call site to use Escape(..., cache: true) instead of EscapeWithCaching. |
| src/Build/BackEnd/TaskExecutionHost/TaskExecutionHost.cs | Updates call site to use Escape(..., cache: true); removes trailing whitespace. |
| src/Utilities.UnitTests/StringExtensions_Tests.cs | Disambiguates StringExtensions.Replace call site due to new Microsoft.Build.StringExtensions. |
|
Looks great. Trying to better understand the intent here. Did this path show up hot on a perf trace somewhere? |
I added some quick-and-dirty and found that these methods are called a lot. In my tests, I built a significant part of Roslyn and found 3,000,000+ invocations of these methods. I'm definitely intending to reduce the number of times these are called, but I decided to improve the performance of the methods themselves as a first step. |
Ensure that Insert(...) only takes the fast path if "_count < _scope.Length" to potentially shifting past the end of the scope. Added tests to cover this issue, which is more likely to appear with a stack-allocated buffer is used, since a rented array might actually be larger than the requested minimum length.
|
Sweet. Are these changes already noticeable end-to-end in the roslyn build (the eval part of it)? |
What metric did you mean? Obviously, this change is reducing milliseconds and lowering allocations. Many such changes in aggregate would likely show wall clock improvements. |
I missed adding string resources to SR.resx when cherry-picking RefArrayBuilder from another local branch.
There was a problem hiding this comment.
Performance & Allocation Awareness
All other patterns are clean:
- Two-pass Escape (
string.Create/unsafe fixed) — exact-size allocation, no waste. ✅ - Lazy
StringBuilderinUnescapeAll— acquire/release pattern is correct;sbis only acquired on the first valid%XXhit, so purely-invalid or escape-free strings allocate nothing. ✅ IndexOfAnyEscapeChar—SearchValues<char>on .NET (SIMD), bitmask scan on .NET Framework. Both are tighter than the oldIndexOfAny(char[]). ✅HexConverter—ReadOnlySpan<byte>from PE data section on .NET (no allocation);static readonly byte[]on .NET Framework (single allocation, cached). ✅BufferScope.Dispose— clears only whenTypeInfo<T>.IsReferenceOrContainsReferences(), avoiding unnecessary zeroing for value types likeint. ✅
One issue raised inline: RefArrayBuilder(initialCapacity: 64) always rents from ArrayPool<int>.Shared even for the common case of 1–2 special chars. A stackalloc int[8] scratch buffer via the existing Span<T> constructor eliminates that overhead for the common case.
Generated by Expert Code Review (command) for issue #13426 · ● 26.6M
- Change TypeInfo<T> from public to internal - Remove unused "type" variable - Prefer null-coalescing assignment operator
Thanks for catching this! It's now fixed.
This is intentional. Because it is tricky to implement,
Fixed.
The initial capacity has been reduced from 64 to 16. However, passing |
|
@JanKrivanek: The expert code review feedback has been addressed. |
|
@JeremyKuhne: Here's the PR that includes |
Replace hand-written [DllImport] declarations and COM [ComImport]
interfaces with CsWin32 source-generated interop. This enables
Native AOT scenarios where built-in COM marshalling and
System.Runtime.InteropServices marshalling infrastructure do not
exist. The approach follows patterns established in WinForms, WPF,
and the dotnet/sdk repository.
CsWin32 is configured in Microsoft.Build.Framework with
allowMarshaling: false and useSafeHandles: false, producing raw
pointer signatures that are AOT-compatible. Other projects consume
generated types via InternalsVisibleTo.
Key changes:
Infrastructure:
- Add FEATURE_WINDOWSINTEROP compile-time gate in
Directory.BeforeCommon.targets, disabled for source-only builds
- Add CsWin32 (Microsoft.Windows.CsWin32) and PolySharp package
references to Framework.csproj
- Add NativeMethods.txt and NativeMethods.json for CsWin32
configuration
- Add .editorconfig suppression for CS3019 in Windows/ folders
- Add cswin32-interop skill document for agent guidance
New utility types:
- BufferScope<T>: ref struct for stack-allocated buffers with
ArrayPool fallback, replacing manual ArrayPool rent/return and
stackalloc patterns throughout interop code
- TypeInfo<T>: cached RuntimeHelpers.IsReferenceOrContainsReferences
polyfill for net472
- ComScope<T>: COM pointer lifetime management (using-disposable)
- ComClassFactory: AOT-compatible COM activation without
Activator.CreateInstance
- IID: generic IID lookup via IComIID interface
- VARIANT, BSTR, HRESULT, FILETIME extensions: CsWin32 partial
type augmentations for safe usage patterns
NativeMethods.cs cleanup:
- Delete ~130 hand-written constants, enums, structs, and
[DllImport] declarations replaced by CsWin32 typed equivalents
- Replace MemoryStatus class with MEMORYSTATUSEX struct via
TryGetMemoryStatus
- Replace GetCurrentDirectory/GetFullPath/GetShortPathName/
GetLongPathName with BufferScope-based implementations
- Replace StreamHandleType enum with bool useStandardError parameter
- Delete DirectoryExists/FileExists/FileOrDirectoryExists wrappers;
callers use PInvoke.GetFileAttributes directly
- Delete HResultSucceeded/HResultFailed; callers use HRESULT.Failed
- Add [UnsupportedOSPlatformGuard("windows")] to IsUnixLike
- Version all [SupportedOSPlatform] attributes to "windows6.1"
COM interop modernization:
- Replace [ComImport] WMI interfaces (IWbemLocator, IWbemServices,
IEnumWbemClassObject, IWbemClassObject) with struct-based COM
using delegate* unmanaged[Stdcall] vtables
- Add IDebugClient4-based command line retrieval as alternative to
WMI (CommandLineSource.DebugEngine)
- Move IFixedTypeInfo from NativeMethods.cs to its own file in Tasks
Platform guard pattern:
- Apply dual-guard pattern: #if FEATURE_WINDOWSINTEROP wraps runtime
IsWindows checks, eliminating dead code in source builds
- Use IsUnixLike (positive guard) instead of !IsWindows for CA1416
platform compatibility analysis
- Files excluded at project level via <Compile Remove> when
FeatureWindowsInterop is off (WindowsFileSystem, Windows/ folder,
WMI structs)
Behavioral changes:
- Unzip.cs: !IsWindows → IsUnixLike for correct CA1416 analysis
- ProcessExtensions: ArrayPool<T> replaced with BufferScope<T>
- AssemblyInformation/GlobalAssemblyCache: remove dead non-Windows
code paths in net472-only (#if !FEATURE_ASSEMBLYLOADCONTEXT)
blocks, which only run on Windows
Some related changes are also in dotnet#13426.
Replace hand-written [DllImport] declarations and COM [ComImport]
interfaces with CsWin32 source-generated interop. This enables
Native AOT scenarios where built-in COM marshalling and
System.Runtime.InteropServices marshalling infrastructure do not
exist. The approach follows patterns established in WinForms, WPF,
and the dotnet/sdk repository.
CsWin32 is configured in Microsoft.Build.Framework with
allowMarshaling: false and useSafeHandles: false, producing raw
pointer signatures that are AOT-compatible. Other projects consume
generated types via InternalsVisibleTo.
Key changes:
Infrastructure:
- Add FEATURE_WINDOWSINTEROP compile-time gate in
Directory.BeforeCommon.targets, disabled for source-only builds
- Add CsWin32 (Microsoft.Windows.CsWin32) and PolySharp package
references to Framework.csproj
- Add NativeMethods.txt and NativeMethods.json for CsWin32
configuration
- Add .editorconfig suppression for CS3019 in Windows/ folders
- Add cswin32-interop skill document for agent guidance
New utility types:
- BufferScope<T>: ref struct for stack-allocated buffers with
ArrayPool fallback, replacing manual ArrayPool rent/return and
stackalloc patterns throughout interop code
- TypeInfo<T>: cached RuntimeHelpers.IsReferenceOrContainsReferences
polyfill for net472
- ComScope<T>: COM pointer lifetime management (using-disposable)
- ComClassFactory: AOT-compatible COM activation without
Activator.CreateInstance
- IID: generic IID lookup via IComIID interface
- VARIANT, BSTR, HRESULT, FILETIME extensions: CsWin32 partial
type augmentations for safe usage patterns
NativeMethods.cs cleanup:
- Delete ~130 hand-written constants, enums, structs, and
[DllImport] declarations replaced by CsWin32 typed equivalents
- Replace MemoryStatus class with MEMORYSTATUSEX struct via
TryGetMemoryStatus
- Replace GetCurrentDirectory/GetFullPath/GetShortPathName/
GetLongPathName with BufferScope-based implementations
- Replace StreamHandleType enum with bool useStandardError parameter
- Delete DirectoryExists/FileExists/FileOrDirectoryExists wrappers;
callers use PInvoke.GetFileAttributes directly
- Delete HResultSucceeded/HResultFailed; callers use HRESULT.Failed
- Add [UnsupportedOSPlatformGuard("windows")] to IsUnixLike
- Version all [SupportedOSPlatform] attributes to "windows6.1"
COM interop modernization:
- Replace [ComImport] WMI interfaces (IWbemLocator, IWbemServices,
IEnumWbemClassObject, IWbemClassObject) with struct-based COM
using delegate* unmanaged[Stdcall] vtables
- Add IDebugClient4-based command line retrieval as alternative to
WMI (CommandLineSource.DebugEngine)
- Move IFixedTypeInfo from NativeMethods.cs to its own file in Tasks
Platform guard pattern:
- Apply dual-guard pattern: #if FEATURE_WINDOWSINTEROP wraps runtime
IsWindows checks, eliminating dead code in source builds
- Use IsUnixLike (positive guard) instead of !IsWindows for CA1416
platform compatibility analysis
- Files excluded at project level via <Compile Remove> when
FeatureWindowsInterop is off (WindowsFileSystem, Windows/ folder,
WMI structs)
Behavioral changes:
- Unzip.cs: !IsWindows → IsUnixLike for correct CA1416 analysis
- ProcessExtensions: ArrayPool<T> replaced with BufferScope<T>
- AssemblyInformation/GlobalAssemblyCache: remove dead non-Windows
code paths in net472-only (#if !FEATURE_ASSEMBLYLOADCONTEXT)
blocks, which only run on Windows
Some related changes are also in dotnet#13426.
Replace hand-written [DllImport] declarations and COM [ComImport]
interfaces with CsWin32 source-generated interop. This enables
Native AOT scenarios where built-in COM marshalling and
System.Runtime.InteropServices marshalling infrastructure do not
exist. The approach follows patterns established in WinForms, WPF,
and the dotnet/sdk repository.
CsWin32 is configured in Microsoft.Build.Framework with
allowMarshaling: false and useSafeHandles: false, producing raw
pointer signatures that are AOT-compatible. Other projects consume
generated types via InternalsVisibleTo.
Key changes:
Infrastructure:
- Add FEATURE_WINDOWSINTEROP compile-time gate in
Directory.BeforeCommon.targets, disabled for source-only builds
- Add CsWin32 (Microsoft.Windows.CsWin32) and PolySharp package
references to Framework.csproj
- Add NativeMethods.txt and NativeMethods.json for CsWin32
configuration
- Add .editorconfig suppression for CS3019 in Windows/ folders
- Add cswin32-interop skill document for agent guidance
New utility types:
- BufferScope<T>: ref struct for stack-allocated buffers with
ArrayPool fallback, replacing manual ArrayPool rent/return and
stackalloc patterns throughout interop code
- TypeInfo<T>: cached RuntimeHelpers.IsReferenceOrContainsReferences
polyfill for net472
- ComScope<T>: COM pointer lifetime management (using-disposable)
- ComClassFactory: AOT-compatible COM activation without
Activator.CreateInstance
- IID: generic IID lookup via IComIID interface
- VARIANT, BSTR, HRESULT, FILETIME extensions: CsWin32 partial
type augmentations for safe usage patterns
NativeMethods.cs cleanup:
- Delete ~130 hand-written constants, enums, structs, and
[DllImport] declarations replaced by CsWin32 typed equivalents
- Replace MemoryStatus class with MEMORYSTATUSEX struct via
TryGetMemoryStatus
- Replace GetCurrentDirectory/GetFullPath/GetShortPathName/
GetLongPathName with BufferScope-based implementations
- Replace StreamHandleType enum with bool useStandardError parameter
- Delete DirectoryExists/FileExists/FileOrDirectoryExists wrappers;
callers use PInvoke.GetFileAttributes directly
- Delete HResultSucceeded/HResultFailed; callers use HRESULT.Failed
- Add [UnsupportedOSPlatformGuard("windows")] to IsUnixLike
- Version all [SupportedOSPlatform] attributes to "windows6.1"
COM interop modernization:
- Replace [ComImport] WMI interfaces (IWbemLocator, IWbemServices,
IEnumWbemClassObject, IWbemClassObject) with struct-based COM
using delegate* unmanaged[Stdcall] vtables
- Add IDebugClient4-based command line retrieval as alternative to
WMI (CommandLineSource.DebugEngine)
- Move IFixedTypeInfo from NativeMethods.cs to its own file in Tasks
Platform guard pattern:
- Apply dual-guard pattern: #if FEATURE_WINDOWSINTEROP wraps runtime
IsWindows checks, eliminating dead code in source builds
- Use IsUnixLike (positive guard) instead of !IsWindows for CA1416
platform compatibility analysis
- Files excluded at project level via <Compile Remove> when
FeatureWindowsInterop is off (WindowsFileSystem, Windows/ folder,
WMI structs)
Behavioral changes:
- Unzip.cs: !IsWindows → IsUnixLike for correct CA1416 analysis
- ProcessExtensions: ArrayPool<T> replaced with BufferScope<T>
- AssemblyInformation/GlobalAssemblyCache: remove dead non-Windows
code paths in net472-only (#if !FEATURE_ASSEMBLYLOADCONTEXT)
blocks, which only run on Windows
Some related changes are also in dotnet#13426.
Replace hand-written [DllImport] declarations and COM [ComImport]
interfaces with CsWin32 source-generated interop. This enables
Native AOT scenarios where built-in COM marshalling and
System.Runtime.InteropServices marshalling infrastructure do not
exist. The approach follows patterns established in WinForms, WPF,
and the dotnet/sdk repository.
CsWin32 is configured in Microsoft.Build.Framework with
allowMarshaling: false and useSafeHandles: false, producing raw
pointer signatures that are AOT-compatible. Other projects consume
generated types via InternalsVisibleTo.
Key changes:
Infrastructure:
- Add FEATURE_WINDOWSINTEROP compile-time gate in
Directory.BeforeCommon.targets, disabled for source-only builds
- Add CsWin32 (Microsoft.Windows.CsWin32) and PolySharp package
references to Framework.csproj
- Add NativeMethods.txt and NativeMethods.json for CsWin32
configuration
- Add .editorconfig suppression for CS3019 in Windows/ folders
- Add cswin32-interop skill document for agent guidance
New utility types:
- BufferScope<T>: ref struct for stack-allocated buffers with
ArrayPool fallback, replacing manual ArrayPool rent/return and
stackalloc patterns throughout interop code
- TypeInfo<T>: cached RuntimeHelpers.IsReferenceOrContainsReferences
polyfill for net472
- ComScope<T>: COM pointer lifetime management (using-disposable)
- ComClassFactory: AOT-compatible COM activation without
Activator.CreateInstance
- IID: generic IID lookup via IComIID interface
- VARIANT, BSTR, HRESULT, FILETIME extensions: CsWin32 partial
type augmentations for safe usage patterns
NativeMethods.cs cleanup:
- Delete ~130 hand-written constants, enums, structs, and
[DllImport] declarations replaced by CsWin32 typed equivalents
- Replace MemoryStatus class with MEMORYSTATUSEX struct via
TryGetMemoryStatus
- Replace GetCurrentDirectory/GetFullPath/GetShortPathName/
GetLongPathName with BufferScope-based implementations
- Replace StreamHandleType enum with bool useStandardError parameter
- Delete DirectoryExists/FileExists/FileOrDirectoryExists wrappers;
callers use PInvoke.GetFileAttributes directly
- Delete HResultSucceeded/HResultFailed; callers use HRESULT.Failed
- Add [UnsupportedOSPlatformGuard("windows")] to IsUnixLike
- Version all [SupportedOSPlatform] attributes to "windows6.1"
COM interop modernization:
- Replace [ComImport] WMI interfaces (IWbemLocator, IWbemServices,
IEnumWbemClassObject, IWbemClassObject) with struct-based COM
using delegate* unmanaged[Stdcall] vtables
- Add IDebugClient4-based command line retrieval as alternative to
WMI (CommandLineSource.DebugEngine)
- Move IFixedTypeInfo from NativeMethods.cs to its own file in Tasks
Platform guard pattern:
- Apply dual-guard pattern: #if FEATURE_WINDOWSINTEROP wraps runtime
IsWindows checks, eliminating dead code in source builds
- Use IsUnixLike (positive guard) instead of !IsWindows for CA1416
platform compatibility analysis
- Files excluded at project level via <Compile Remove> when
FeatureWindowsInterop is off (WindowsFileSystem, Windows/ folder,
WMI structs)
Behavioral changes:
- Unzip.cs: !IsWindows → IsUnixLike for correct CA1416 analysis
- ProcessExtensions: ArrayPool<T> replaced with BufferScope<T>
- AssemblyInformation/GlobalAssemblyCache: remove dead non-Windows
code paths in net472-only (#if !FEATURE_ASSEMBLYLOADCONTEXT)
blocks, which only run on Windows
Some related changes are also in dotnet#13426.
This fixes a few issues in RefArrayBuilder - The Grow calls in AddRangeCore and InsertRangeCore specify a larger size argument than needed. In both causes, the size should be scope.Length. - Grow copies more elements then needed to the new buffer when startIndex >= 0. It only needs to copy the items from startIndex up to _count. - InsertRange and InsertRangeCore incorrectly used 'index' rather than 'count' when determining whether to grow the buffer.
Resolved all issues raised.
This PR rewrites and optimizes `EscapingUtilities.Escape` and
`EscapingUtilities.UnescapeAll` — two methods called heavily throughout
MSBuild evaluation — to reduce allocations and improve throughput on
both .NET and .NET Framework 4.7.2.
The largest wins are in `Escape` (especially the no-special-chars fast
path) and strings with invalid escape sequences, with speed improvements
of up to **3.3× on .NET 10.0** and **3.0× on .NET Framework 4.8.1**, and
the `UnescapeAll` allocation for strings with invalid escape sequences
**eliminated entirely** on both runtimes.
## Highlights
### 🚀 Speed Improvements (.NET 10.0)
| Benchmark | Before | After | Speedup |
| --- | --- | --- | --- |
| `Escape_NoSpecialChars` | 17.96 ns | 5.37 ns | 3.3× |
| `Escape_FewSpecialChars` | 191.8 ns | 84.9 ns | 2.3× |
| `Escape_ManySpecialChars` | 892.0 ns | 391.9 ns | 2.3× |
| `UnescapeAll_InvalidEscapeSequences` | 38.5 ns | 19.6 ns | 2.0× |
| `EscapeWithCaching_FewSpecialChars` | 48.9 ns | 27.9 ns | 1.8× |
| `EscapeWithCaching_ManySpecialChars` | 50.0 ns | 27.7 ns | 1.8× |
| `RoundTrip_EscapeThenUnescape` | 245.3 ns | 148.4 ns | 1.7× |
### 🧹 Allocation Reductions (.NET 10.0)
| Benchmark | Before | After | Reduction |
| --- | --- | --- | --- |
| `UnescapeAll_InvalidEscapeSequences` | 48 B | 0 B | 100% |
### 📊 .NET Framework 4.8.1
| Benchmark | Before | After | Speedup |
| --- | --- | --- | --- |
| `UnescapeAll_InvalidEscapeSequences` | 89.7 ns | 30.3 ns | 3.0× |
| `Escape_NoSpecialChars` | 79.9 ns | 36.6 ns | 2.2× |
| `Escape_ManySpecialChars` | 1,073.6 ns | 463.7 ns | 2.3× |
| `Escape_FewSpecialChars` | 349.5 ns | 198.6 ns | 1.8× |
| `RoundTrip_EscapeThenUnescape` | 541.7 ns | 371.0 ns | 1.5× |
The `UnescapeAll_InvalidEscapeSequences` allocation is also eliminated
on .NET Framework (40 B → 0 B, 100% reduction).
# Detailed Summary of Changes
## Infrastructure
- Add `BufferScope<T>`: a new ref struct that manages a stack-allocated
buffer with `ArrayPool<T>` fallback for heap overflow, enabling
low-allocation temporary storage.
- Add `RefArrayBuilder<T>`: a new ref struct for cheaply building arrays
using `BufferScope<T>`, used by the Escape two-pass algorithm.
- Add benchmarks for `EscapingUtilities`: BenchmarkDotNet benchmarks
covering `Escape` and `UnescapeAll` across typical MSBuild string
patterns.
## `Escape` optimizations
- Use `SearchValues<char>` on .NET: replaces the `IndexOfAny(char[])`
scan with a `SearchValues<char>`-based scan, enabling vectorized SIMD
search.
- Replace `IndexOfAny` with a bitmask scan on .NET Framework: all 9
escapable characters fall in the ASCII range `['$'...'@']`, so a 29-bit
bitmask replaces the O(k) per-character array scan with a single range
check + bit test.
- Reuse first `IndexOfAnyEscapeChar` result: avoids a redundant scan
pass by feeding the initial hit directly into the collection loop.
- Refactor to a two-pass direct string allocation: a first pass collects
all special character indices into a `RefArrayBuilder<int>`, then a
second pass allocates a single exact-sized string (via `string.Create`
on .NET; unsafe fixed+`Buffer.MemoryCopy` on .NET Framework),
eliminating the `StringBuilder` entirely.
## `UnescapeAll` optimizations
- Acquire `StringBuilder` lazily: the `StringBuilder` is only rented
from `StringBuilderCache` when the first decodable `%XX` sequence is
actually found, so strings with `%` signs that aren't valid escape
sequences return the original string with zero allocations.
- Scope `IndexOf('%')` to the active trim window: when trim: true, the
percent-sign search is bounded to `[startIndex, endIndex)` computed from
the trimmed range, avoiding scanning whitespace that will be discarded.
- Replace `TryDecodeHexDigit` arithmetic with `HexConverter` lookup
table: delegates to the internal `HexConverter.FromChar` lookup table,
replacing the multi-branch switch with a single table lookup.
## Test improvements
- Clean up `EscapingUtilities_Tests`: migrated to
`[Theory]`/`[InlineData]` with `Shouldly` assertions and removed
redundant test helpers.
- Add more test coverage: expanded cases for `UnescapeAll`,
`UnescapeAll(trim: true)`, `Escape`, round-trip `Escape`↔`Unescape`, and
`ContainsEscapedWildcards`.
- Clean up `EscapingUtilities` in preparation for performance work:
modernized XML docs, renamed parameters for clarity, and tightened
nullability annotations ahead of the algorithmic changes.
# Benchmarks
## Initial Results (baselines)
### .NET 10.0
| Method | Mean | Error | StdDev | Gen0 | Allocated |
|-----------------------------------------
|-----------:|-----------:|-----------:|-------:|----------:|
| UnescapeAll_NoSpecialChars | 2.612 ns | 0.0785 ns | 0.0696 ns | - | -
|
| UnescapeAll_FewEscapeSequences | 88.216 ns | 0.7230 ns | 0.6409 ns |
0.0153 | 96 B |
| UnescapeAll_ManyEscapeSequences | 176.489 ns | 0.6002 ns | 0.5012 ns |
0.0100 | 64 B |
| UnescapeAll_InvalidEscapeSequences | 38.494 ns | 0.4281 ns | 0.3795 ns
| 0.0076 | 48 B |
| UnescapeAll_WithTrim | 56.222 ns | 1.1704 ns | 2.4171 ns | 0.0076 | 48
B |
| Escape_NoSpecialChars | 17.957 ns | 0.4183 ns | 0.4108 ns | - | - |
| Escape_FewSpecialChars | 191.810 ns | 2.7928 ns | 2.4758 ns | 0.0176 |
112 B |
| Escape_ManySpecialChars | 892.012 ns | 16.6366 ns | 15.5619 ns |
0.0324 | 208 B |
| EscapeWithCaching_FewSpecialChars | 48.888 ns | 0.9908 ns | 0.9731 ns
| - | - |
| EscapeWithCaching_ManySpecialChars | 49.986 ns | 0.9944 ns | 1.3611 ns
| - | - |
| ContainsEscapedWildcards_NoPercent | 2.009 ns | 0.0231 ns | 0.0216 ns
| - | - |
| ContainsEscapedWildcards_HasWildcards | 2.895 ns | 0.0433 ns | 0.0405
ns | - | - |
| ContainsEscapedWildcards_LongNoWildcards | 20.120 ns | 0.1630 ns |
0.1525 ns | - | - |
| RoundTrip_EscapeThenUnescape | 245.305 ns | 2.0098 ns | 1.8799 ns |
0.0329 | 208 B |
### .NET Framework 4.8.1
| Method | Mean | Error | StdDev | Gen0 | Allocated |
|-----------------------------------------
|-------------:|-----------:|-----------:|-------:|----------:|
| UnescapeAll_NoSpecialChars | 19.242 ns | 0.2443 ns | 0.2285 ns | - | -
|
| UnescapeAll_FewEscapeSequences | 182.132 ns | 1.5881 ns | 1.4855 ns |
0.0160 | 84 B |
| UnescapeAll_ManyEscapeSequences | 314.347 ns | 2.0764 ns | 1.9422 ns |
0.0105 | 56 B |
| UnescapeAll_InvalidEscapeSequences | 89.696 ns | 0.6176 ns | 0.5777 ns
| 0.0076 | 40 B |
| UnescapeAll_WithTrim | 119.322 ns | 0.8753 ns | 0.8188 ns | 0.0067 |
36 B |
| Escape_NoSpecialChars | 79.902 ns | 0.5406 ns | 0.4793 ns | - | - |
| Escape_FewSpecialChars | 349.474 ns | 2.9539 ns | 2.7631 ns | 0.0191 |
100 B |
| Escape_ManySpecialChars | 1,073.623 ns | 5.7130 ns | 5.3440 ns |
0.0362 | 196 B |
| EscapeWithCaching_FewSpecialChars | 75.406 ns | 0.5825 ns | 0.5449 ns
| - | - |
| EscapeWithCaching_ManySpecialChars | 63.058 ns | 0.6023 ns | 0.5634 ns
| - | - |
| ContainsEscapedWildcards_NoPercent | 18.369 ns | 0.1220 ns | 0.1081 ns
| - | - |
| ContainsEscapedWildcards_HasWildcards | 9.071 ns | 0.0870 ns | 0.0814
ns | - | - |
| ContainsEscapedWildcards_LongNoWildcards | 56.651 ns | 1.1451 ns |
1.1247 ns | - | - |
| RoundTrip_EscapeThenUnescape | 541.701 ns | 10.4909 ns | 11.6606 ns |
0.0343 | 184 B |
## Final Results
### .NET 10.0
| Method | Mean | Error | StdDev | Gen0 | Allocated |
|-----------------------------------------
|-----------:|----------:|----------:|-------:|----------:|
| UnescapeAll_NoSpecialChars | 2.705 ns | 0.0772 ns | 0.0722 ns | - | -
|
| UnescapeAll_FewEscapeSequences | 83.407 ns | 1.7204 ns | 1.6897 ns |
0.0153 | 96 B |
| UnescapeAll_ManyEscapeSequences | 176.586 ns | 1.5432 ns | 1.4435 ns |
0.0100 | 64 B |
| UnescapeAll_InvalidEscapeSequences | 19.590 ns | 0.1503 ns | 0.1333 ns
| - | - |
| UnescapeAll_WithTrim | 56.048 ns | 0.8981 ns | 0.8400 ns | 0.0076 | 48
B |
| Escape_NoSpecialChars | 5.368 ns | 0.0523 ns | 0.0489 ns | - | - |
| Escape_FewSpecialChars | 84.876 ns | 0.9482 ns | 0.8870 ns | 0.0178 |
112 B |
| Escape_ManySpecialChars | 391.947 ns | 2.9983 ns | 2.8047 ns | 0.0329
| 208 B |
| EscapeWithCaching_FewSpecialChars | 27.894 ns | 0.3138 ns | 0.2935 ns
| - | - |
| EscapeWithCaching_ManySpecialChars | 27.667 ns | 0.1934 ns | 0.1809 ns
| - | - |
| ContainsEscapedWildcards_NoPercent | 1.990 ns | 0.0231 ns | 0.0216 ns
| - | - |
| ContainsEscapedWildcards_HasWildcards | 3.982 ns | 0.0205 ns | 0.0182
ns | - | - |
| ContainsEscapedWildcards_LongNoWildcards | 19.671 ns | 0.1140 ns |
0.1066 ns | - | - |
| RoundTrip_EscapeThenUnescape | 148.380 ns | 1.9688 ns | 1.8416 ns |
0.0331 | 208 B |
### .NET Framework 4.8.1
| Method | Mean | Error | StdDev | Gen0 | Allocated |
|-----------------------------------------
|-----------:|----------:|----------:|-------:|----------:|
| UnescapeAll_NoSpecialChars | 19.585 ns | 0.1821 ns | 0.1520 ns | - | -
|
| UnescapeAll_FewEscapeSequences | 175.258 ns | 1.3275 ns | 1.1768 ns |
0.0160 | 84 B |
| UnescapeAll_ManyEscapeSequences | 299.969 ns | 1.9503 ns | 1.7289 ns |
0.0105 | 56 B |
| UnescapeAll_InvalidEscapeSequences | 30.317 ns | 0.4839 ns | 0.4527 ns
| - | - |
| UnescapeAll_WithTrim | 115.244 ns | 1.0802 ns | 1.0104 ns | 0.0067 |
36 B |
| Escape_NoSpecialChars | 36.594 ns | 0.3431 ns | 0.3209 ns | - | - |
| Escape_FewSpecialChars | 198.554 ns | 1.2490 ns | 0.9751 ns | 0.0191 |
100 B |
| Escape_ManySpecialChars | 463.734 ns | 4.5158 ns | 4.0031 ns | 0.0372
| 196 B |
| EscapeWithCaching_FewSpecialChars | 61.413 ns | 1.0921 ns | 1.0215 ns
| - | - |
| EscapeWithCaching_ManySpecialChars | 51.776 ns | 0.8385 ns | 0.7843 ns
| - | - |
| ContainsEscapedWildcards_NoPercent | 18.567 ns | 0.0863 ns | 0.0720 ns
| - | - |
| ContainsEscapedWildcards_HasWildcards | 9.044 ns | 0.0935 ns | 0.0829
ns | - | - |
| ContainsEscapedWildcards_LongNoWildcards | 54.810 ns | 0.3122 ns |
0.2767 ns | - | - |
| RoundTrip_EscapeThenUnescape | 371.026 ns | 5.3404 ns | 4.9954 ns |
0.0348 | 184 B |
Replace hand-written [DllImport] declarations and COM [ComImport]
interfaces with CsWin32 source-generated interop. This enables
Native AOT scenarios where built-in COM marshalling and
System.Runtime.InteropServices marshalling infrastructure do not
exist. The approach follows patterns established in WinForms, WPF,
and the dotnet/sdk repository.
CsWin32 is configured in Microsoft.Build.Framework with
allowMarshaling: false and useSafeHandles: false, producing raw
pointer signatures that are AOT-compatible. Other projects consume
generated types via InternalsVisibleTo.
Key changes:
Infrastructure:
- Add FEATURE_WINDOWSINTEROP compile-time gate in
Directory.BeforeCommon.targets, disabled for source-only builds
- Add CsWin32 (Microsoft.Windows.CsWin32) and PolySharp package
references to Framework.csproj
- Add NativeMethods.txt and NativeMethods.json for CsWin32
configuration
- Add .editorconfig suppression for CS3019 in Windows/ folders
- Add cswin32-interop skill document for agent guidance
New utility types:
- BufferScope<T>: ref struct for stack-allocated buffers with
ArrayPool fallback, replacing manual ArrayPool rent/return and
stackalloc patterns throughout interop code
- TypeInfo<T>: cached RuntimeHelpers.IsReferenceOrContainsReferences
polyfill for net472
- ComScope<T>: COM pointer lifetime management (using-disposable)
- ComClassFactory: AOT-compatible COM activation without
Activator.CreateInstance
- IID: generic IID lookup via IComIID interface
- VARIANT, BSTR, HRESULT, FILETIME extensions: CsWin32 partial
type augmentations for safe usage patterns
NativeMethods.cs cleanup:
- Delete ~130 hand-written constants, enums, structs, and
[DllImport] declarations replaced by CsWin32 typed equivalents
- Replace MemoryStatus class with MEMORYSTATUSEX struct via
TryGetMemoryStatus
- Replace GetCurrentDirectory/GetFullPath/GetShortPathName/
GetLongPathName with BufferScope-based implementations
- Replace StreamHandleType enum with bool useStandardError parameter
- Delete DirectoryExists/FileExists/FileOrDirectoryExists wrappers;
callers use PInvoke.GetFileAttributes directly
- Delete HResultSucceeded/HResultFailed; callers use HRESULT.Failed
- Add [UnsupportedOSPlatformGuard("windows")] to IsUnixLike
- Version all [SupportedOSPlatform] attributes to "windows6.1"
COM interop modernization:
- Replace [ComImport] WMI interfaces (IWbemLocator, IWbemServices,
IEnumWbemClassObject, IWbemClassObject) with struct-based COM
using delegate* unmanaged[Stdcall] vtables
- Add IDebugClient4-based command line retrieval as alternative to
WMI (CommandLineSource.DebugEngine)
- Move IFixedTypeInfo from NativeMethods.cs to its own file in Tasks
Platform guard pattern:
- Apply dual-guard pattern: #if FEATURE_WINDOWSINTEROP wraps runtime
IsWindows checks, eliminating dead code in source builds
- Use IsUnixLike (positive guard) instead of !IsWindows for CA1416
platform compatibility analysis
- Files excluded at project level via <Compile Remove> when
FeatureWindowsInterop is off (WindowsFileSystem, Windows/ folder,
WMI structs)
Behavioral changes:
- Unzip.cs: !IsWindows → IsUnixLike for correct CA1416 analysis
- ProcessExtensions: ArrayPool<T> replaced with BufferScope<T>
- AssemblyInformation/GlobalAssemblyCache: remove dead non-Windows
code paths in net472-only (#if !FEATURE_ASSEMBLYLOADCONTEXT)
blocks, which only run on Windows
Some related changes are also in dotnet#13426.
Replace hand-written [DllImport] declarations and COM [ComImport]
interfaces with CsWin32 source-generated interop. This enables
Native AOT scenarios where built-in COM marshalling and
System.Runtime.InteropServices marshalling infrastructure do not
exist. The approach follows patterns established in WinForms, WPF,
and the dotnet/sdk repository.
CsWin32 is configured in Microsoft.Build.Framework with
allowMarshaling: false and useSafeHandles: false, producing raw
pointer signatures that are AOT-compatible. Other projects consume
generated types via InternalsVisibleTo.
Key changes:
Infrastructure:
- Add FEATURE_WINDOWSINTEROP compile-time gate in
Directory.BeforeCommon.targets, disabled for source-only builds
- Add CsWin32 (Microsoft.Windows.CsWin32) and PolySharp package
references to Framework.csproj
- Add NativeMethods.txt and NativeMethods.json for CsWin32
configuration
- Add .editorconfig suppression for CS3019 in Windows/ folders
- Add cswin32-interop skill document for agent guidance
New utility types:
- BufferScope<T>: ref struct for stack-allocated buffers with
ArrayPool fallback, replacing manual ArrayPool rent/return and
stackalloc patterns throughout interop code
- TypeInfo<T>: cached RuntimeHelpers.IsReferenceOrContainsReferences
polyfill for net472
- ComScope<T>: COM pointer lifetime management (using-disposable)
- ComClassFactory: AOT-compatible COM activation without
Activator.CreateInstance
- IID: generic IID lookup via IComIID interface
- VARIANT, BSTR, HRESULT, FILETIME extensions: CsWin32 partial
type augmentations for safe usage patterns
NativeMethods.cs cleanup:
- Delete ~130 hand-written constants, enums, structs, and
[DllImport] declarations replaced by CsWin32 typed equivalents
- Replace MemoryStatus class with MEMORYSTATUSEX struct via
TryGetMemoryStatus
- Replace GetCurrentDirectory/GetFullPath/GetShortPathName/
GetLongPathName with BufferScope-based implementations
- Replace StreamHandleType enum with bool useStandardError parameter
- Delete DirectoryExists/FileExists/FileOrDirectoryExists wrappers;
callers use PInvoke.GetFileAttributes directly
- Delete HResultSucceeded/HResultFailed; callers use HRESULT.Failed
- Add [UnsupportedOSPlatformGuard("windows")] to IsUnixLike
- Version all [SupportedOSPlatform] attributes to "windows6.1"
COM interop modernization:
- Replace [ComImport] WMI interfaces (IWbemLocator, IWbemServices,
IEnumWbemClassObject, IWbemClassObject) with struct-based COM
using delegate* unmanaged[Stdcall] vtables
- Add IDebugClient4-based command line retrieval as alternative to
WMI (CommandLineSource.DebugEngine)
- Move IFixedTypeInfo from NativeMethods.cs to its own file in Tasks
Platform guard pattern:
- Apply dual-guard pattern: #if FEATURE_WINDOWSINTEROP wraps runtime
IsWindows checks, eliminating dead code in source builds
- Use IsUnixLike (positive guard) instead of !IsWindows for CA1416
platform compatibility analysis
- Files excluded at project level via <Compile Remove> when
FeatureWindowsInterop is off (WindowsFileSystem, Windows/ folder,
WMI structs)
Behavioral changes:
- Unzip.cs: !IsWindows → IsUnixLike for correct CA1416 analysis
- ProcessExtensions: ArrayPool<T> replaced with BufferScope<T>
- AssemblyInformation/GlobalAssemblyCache: remove dead non-Windows
code paths in net472-only (#if !FEATURE_ASSEMBLYLOADCONTEXT)
blocks, which only run on Windows
Some related changes are also in dotnet#13426.
Tip
I recommend reviewing this pull request commit-by-commit. I made sure that each commit is distinct, cohesive and has a detailed description.
This PR rewrites and optimizes
EscapingUtilities.EscapeandEscapingUtilities.UnescapeAll— two methods called heavily throughout MSBuild evaluation — to reduce allocations and improve throughput on both .NET and .NET Framework 4.7.2.The largest wins are in
Escape(especially the no-special-chars fast path) and strings with invalid escape sequences, with speed improvements of up to 3.3× on .NET 10.0 and 3.0× on .NET Framework 4.8.1, and theUnescapeAllallocation for strings with invalid escape sequences eliminated entirely on both runtimes.Highlights
🚀 Speed Improvements (.NET 10.0)
Escape_NoSpecialCharsEscape_FewSpecialCharsEscape_ManySpecialCharsUnescapeAll_InvalidEscapeSequencesEscapeWithCaching_FewSpecialCharsEscapeWithCaching_ManySpecialCharsRoundTrip_EscapeThenUnescape🧹 Allocation Reductions (.NET 10.0)
UnescapeAll_InvalidEscapeSequences📊 .NET Framework 4.8.1
UnescapeAll_InvalidEscapeSequencesEscape_NoSpecialCharsEscape_ManySpecialCharsEscape_FewSpecialCharsRoundTrip_EscapeThenUnescapeThe
UnescapeAll_InvalidEscapeSequencesallocation is also eliminated on .NET Framework (40 B → 0 B, 100% reduction).Detailed Summary of Changes
Infrastructure
BufferScope<T>: a new ref struct that manages a stack-allocated buffer withArrayPool<T>fallback for heap overflow, enabling low-allocation temporary storage.RefArrayBuilder<T>: a new ref struct for cheaply building arrays usingBufferScope<T>, used by the Escape two-pass algorithm.EscapingUtilities: BenchmarkDotNet benchmarks coveringEscapeandUnescapeAllacross typical MSBuild string patterns.EscapeoptimizationsSearchValues<char>on .NET: replaces theIndexOfAny(char[])scan with aSearchValues<char>-based scan, enabling vectorized SIMD search.IndexOfAnywith a bitmask scan on .NET Framework: all 9 escapable characters fall in the ASCII range['$'...'@'], so a 29-bit bitmask replaces the O(k) per-character array scan with a single range check + bit test.IndexOfAnyEscapeCharresult: avoids a redundant scan pass by feeding the initial hit directly into the collection loop.RefArrayBuilder<int>, then a second pass allocates a single exact-sized string (viastring.Createon .NET; unsafe fixed+Buffer.MemoryCopyon .NET Framework), eliminating theStringBuilderentirely.UnescapeAlloptimizationsStringBuilderlazily: theStringBuilderis only rented fromStringBuilderCachewhen the first decodable%XXsequence is actually found, so strings with%signs that aren't valid escape sequences return the original string with zero allocations.IndexOf('%')to the active trim window: when trim: true, the percent-sign search is bounded to[startIndex, endIndex)computed from the trimmed range, avoiding scanning whitespace that will be discarded.TryDecodeHexDigitarithmetic withHexConverterlookup table: delegates to the internalHexConverter.FromCharlookup table, replacing the multi-branch switch with a single table lookup.Test improvements
EscapingUtilities_Tests: migrated to[Theory]/[InlineData]withShouldlyassertions and removed redundant test helpers.UnescapeAll,UnescapeAll(trim: true),Escape, round-tripEscape↔Unescape, andContainsEscapedWildcards.EscapingUtilitiesin preparation for performance work: modernized XML docs, renamed parameters for clarity, and tightened nullability annotations ahead of the algorithmic changes.Benchmarks
Initial Results (baselines)
.NET 10.0
.NET Framework 4.8.1
Final Results
.NET 10.0
.NET Framework 4.8.1