perf: eliminate locks from mock invocation and verification hot paths#5422
Not up to standards ⛔

🔴 Issues

| Category | Results |
|---|---|
| CodeStyle | 3 minor |

🟢 Metrics

| Metric | Results |
|---|---|
| Complexity | 25 |
Code Review: perf/mock-engine-lock-free-hot-path
The motivation here is clear and the benchmark data is well-presented. The approach — per-buffer locks, CAS for single-behavior setup, and single-pass verification — is the right direction. However, there are several correctness issues that need addressing before merge.
🚨 Critical: Bounds mismatch between calls and counts arrays
File: MockEngine.cs — StoreCallRecord fast path
```csharp
var calls = _callsByMemberId;       // non-volatile read — could be generation N+1
var counts = _callCountByMemberId;  // non-volatile read — could be generation N
if (calls is not null && counts is not null && (uint)memberId < (uint)calls.Length)
{
    var buffer = calls[memberId];
    if (buffer is not null)
    {
        buffer.Add(record);
        Interlocked.Increment(ref counts[memberId]); // OOB if counts is old array
        return record;
    }
}
```

The bounds check only guards against `calls.Length`, not `counts.Length`. `EnsureCallArrayCapacity` writes `_callsByMemberId = newByMember` then `_callCountByMemberId = newCounts` as two separate non-volatile stores. A racing reader could see the new (larger) calls array while still seeing the old (smaller) counts array. If `memberId` is in range of the new array but not the old, `Interlocked.Increment(ref counts[memberId])` throws `IndexOutOfRangeException`.
Fix: Bundle both arrays into a single struct/record and replace them atomically:
```csharp
private volatile CallArrays? _callArrays; // single volatile reference swap

private sealed class CallArrays
{
    public readonly CallRecordBuffer?[] Calls;
    public readonly int[] Counts;

    public CallArrays(int size)
    {
        Calls = new CallRecordBuffer?[size];
        Counts = new int[size];
    }
}
```

This eliminates the two-array TOCTOU entirely — a single `Volatile.Read` gives you a consistent snapshot of both.
🚨 Critical: "Optimistic mark + unmark" corrupts previously-verified records
File: CallVerificationBuilder.cs — WasCalled
```csharp
var matchingCount = CountAndMarkBuffer(buffer, markVerified: true); // marks ALL matching
if (!times.Matches(matchingCount))
{
    CountAndMarkBuffer(buffer, markVerified: false); // unmarks ALL matching — including ones
                                                     // verified by a PRIOR WasCalled call!
    throw ...;
}
```

`CountAndMarkBuffer` with `markVerified: false` unconditionally sets `IsVerified = false` on every matching record. If a record was already verified by a previous `WasCalled` assertion, the failed second assertion will un-verify it, causing `VerifyNoOtherCalls()` to report a false failure.
The original two-pass approach was correct: count first, then mark only if the count passes. Restoring that contract is the right fix here. The `GetCallsFor` allocation this avoids occurs only on the throw path anyway — the exception itself is far more expensive than an array allocation.
```csharp
// Safe single-pass approach: count first, never unmark
var matchingCount = CountMatchingBuffer(buffer);
if (!times.Matches(matchingCount))
{
    throw new MockVerificationException(...);
}
// Only mark after successful assertion
MarkMatchingBuffer(buffer);
```

⚠️ Significant: CAS race in AddBehavior can leave behaviors split across both fields
File: MethodSetup.cs — AddBehavior
Consider this interleaving:
1. Thread A: `Volatile.Read(ref _behaviors) is null` → true
2. Thread B (in `AddBehaviorSlow`): promotes to list, `Volatile.Write(ref _behaviors, [existing])`, then `Volatile.Write(ref _singleBehavior, null)`
3. Thread A: CAS on `_singleBehavior` with expected `null` — succeeds (B just cleared it!)
4. Thread A returns. Now `_singleBehavior` is set AND `_behaviors` is non-null.
GetNextBehavior checks _singleBehavior first and returns it early, so the behaviors in _behaviors are silently ignored.
The fix is to re-check _behaviors after a successful CAS, or check both conditions atomically under the lock:
```csharp
public void AddBehavior(IBehavior behavior)
{
    // CAS is only safe if _behaviors is still null after the exchange.
    // Re-read _behaviors after CAS to detect the race.
    if (Volatile.Read(ref _behaviors) is null &&
        Interlocked.CompareExchange(ref _singleBehavior, behavior, null) is null &&
        Volatile.Read(ref _behaviors) is null) // double-check after CAS
    {
        return;
    }

    AddBehaviorSlow(behavior);
}
```

ℹ️ Minor: lock(this) in CallRecordBuffer
File: CallRecordBuffer.cs — Add
lock(this) on internal sealed class is low-risk since no external code can grab the monitor, but it's a well-known antipattern. A private dedicated _lock object is the idiomatic replacement and costs one extra pointer. This is a style preference — not a blocker.
ℹ️ Minor: _callsByMemberId non-volatile field now accessed without lock
The field was always lock-protected before; the fast path now reads it without Volatile.Read. The JIT is allowed to hoist this load. Marking the field volatile (or using Volatile.Read) restores the happens-before guarantee that the previous lock provided for free.
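As a sketch of the suggested fix (the field name is taken from the review, but the surrounding method shape is an assumption, not the PR's actual code), wrapping the fast-path load in `Volatile.Read` restores acquire semantics:

```csharp
// Hypothetical sketch: read the array reference with acquire semantics so the
// JIT cannot hoist or cache the load, and the reader observes the array that
// the writer published with a release store under the lock.
private CallRecordBuffer?[]? _callsByMemberId; // written only under the engine lock

public CallRecord? TryStoreFast(int memberId, CallRecord record)
{
    var calls = Volatile.Read(ref _callsByMemberId); // fresh, non-hoistable load
    if (calls is not null && (uint)memberId < (uint)calls.Length)
    {
        var buffer = calls[memberId];
        if (buffer is not null)
        {
            buffer.Add(record);
            return record;
        }
    }
    return null; // caller falls back to the locked slow path
}
```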
What's good
- The `CallRecordBuffer` design (pre-allocated, per-member lock, volatile count publish) is solid and the API surface is clean.
- Splitting `StoreCallRecord` into fast/slow paths is idiomatic, and `[MethodImpl(MethodImplOptions.NoInlining)]` on the slow path is the right call.
- Replacing the shared `MockEngine.Lock` with per-buffer locks is the correct architectural move; the contention analysis in the PR description is accurate.
- The `CollectInto`/`ForEach` zero-copy iteration API on `CallRecordBuffer` is a nice improvement over the previous `AddRange(list)` pattern.
The issues above are real correctness risks in concurrent test runs. The two critical ones (bounds mismatch and optimistic unmark) are straightforward to fix and the overall approach is worth landing once they're addressed.
Code Review: perf: eliminate locks from mock invocation and verification hot paths

Overall this is a well-motivated performance PR with a clear benchmark-driven rationale. The design is largely sound, but there are several correctness and design concerns worth addressing before merge.
| Issue | Severity |
|---|---|
| `GetSnapshot()` torn read (`_count` before `_items`) | High — can cause OOBE or stale null slots |
| `_callsByMemberId` missing volatile on lock-free reads | High — can cause invisible writes on some JIT/CPU combos |
| `_singleBehavior`/`_behaviors` removed volatile keyword | Medium — CAS provides ordering for those fields but dependent reads in slow path may not see updates |
| Optimistic mark-then-unmark in `WasCalled` | Low — narrow race in concurrent teardown |
| `GetAllCalls`/`GetUnverifiedCalls` lock removal | Low — acceptable with a documentation note |
The performance direction is correct. Fixing the GetSnapshot() read order and adding volatile back to _callsByMemberId (or using Volatile.Read consistently) would make this safe to merge.
Code Review: perf/mock-engine-lock-free-hot-path
Summary
This PR replaces TUnit.Mocks' shared MockEngine.Lock in the invocation hot path with three complementary lock-free strategies:
1. `CallArrays` wrapper — bundles `CallRecordBuffer?[]` and `int[]` into a single `volatile` object reference so both arrays are swapped atomically, eliminating the TOCTOU bug from the previous separate volatile array fields.
2. `CallRecordBuffer` — per-member append-only buffer with its own small `_syncRoot`. Writers lock only their buffer; readers take a lock-free snapshot via `Volatile.Read`.
3. `AddBehavior` CAS fast path — uses `Interlocked.CompareExchange` to avoid allocating and acquiring a `Lock` object when only a single behavior is registered (the common case).
The prior review comment about the two-array TOCTOU bug was correctly diagnosed, and this PR directly resolves it with the CallArrays wrapper. That fix is solid.
Positive Aspects
CallArrays atomic swap is the right fix. Bundling both arrays into one volatile reference is the textbook pattern for this problem. Since EnsureCallArrayCapacity is always called under Lock, the swap itself is safe, and readers outside the lock get a consistent generation.
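For readers following along, the grow-and-swap path the review describes might look roughly like this. This is a sketch: only the `CallArrays` shape is quoted from the PR; the method body, sizing policy, and field layout are assumptions.

```csharp
// Hypothetical sketch of EnsureCallArrayCapacity. It is always called under
// the engine lock, so the only publication step visible to lock-free readers
// is the single volatile store of the new CallArrays generation.
private void EnsureCallArrayCapacity(int memberId)
{
    var current = _callArrays; // volatile field: null or a smaller generation
    if (current is not null && memberId < current.Calls.Length)
    {
        return; // already large enough
    }

    var newSize = Math.Max(memberId + 1, (current?.Calls.Length ?? 4) * 2);
    var next = new CallArrays(newSize);
    if (current is not null)
    {
        Array.Copy(current.Calls, next.Calls, current.Calls.Length);
        Array.Copy(current.Counts, next.Counts, current.Counts.Length);
    }

    _callArrays = next; // single store: readers see both new arrays or neither
}
```

Because both arrays travel inside one reference, a reader can never observe a calls array and a counts array from different generations.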
CallRecordBuffer.GetSnapshot memory ordering intent is sound. The code attempts to exploit the "writer stores element before incrementing count" invariant, which is a well-known lock-free pattern.
[MethodImpl(MethodImplOptions.NoInlining)] on slow paths keeps hot-path JIT output cleaner and prevents lock-allocation code from being inlined.
Removal of .ToArray() in CallVerificationBuilder on the happy path (when count matches) is a genuine allocation win for the "test passes" case.
Issues Found
1. GetSnapshot read order is inverted — potential torn read on Grow (Medium)
File: TUnit.Mocks/Verification/CallRecordBuffer.cs, lines 68–70
```csharp
var items = Volatile.Read(ref _items); // reads items FIRST
var count = Volatile.Read(ref _count); // reads count SECOND
```

The comment says "Read items first, then count. A writer always updates items before incrementing count." However, consider the Grow path:
1. `Grow()` sets `_items = newItems` (the new, larger array)
2. `Add` writes `items[count] = record`
3. `Add` increments `_count`
A reader that snaps _items (pre-grow, smaller array) and then snaps _count (post-grow, larger count) can produce a snapshot where count > items.Length. Passing this to Array.Copy(items, result, count) will throw ArgumentException or silently read stale/garbage entries.
The safe pattern for this classic "array + count" lock-free snapshot is the reverse: read count first, then items. A reader that sees count == N is guaranteed to see an _items that was at least length N at that point.
Alternatively, use Volatile.Write(ref _items, newItems) inside Grow() to publish the new array with a store-release fence before any subsequent count increment is visible.
```csharp
// Safe: read count first, then items
var count = Volatile.Read(ref _count);
var items = Volatile.Read(ref _items);
return (items, count);
```

2. CAS fast path in AddBehavior has a double-dispatch window (Low)
File: TUnit.Mocks/Setup/MethodSetup.cs, lines 76–87
```csharp
if (Volatile.Read(ref _behaviors) is null                                        // (A)
    && Interlocked.CompareExchange(ref _singleBehavior, behavior, null) is null) // (B)
{
    if (Volatile.Read(ref _behaviors) is null) // (C)
    {
        return;
    }
}

AddBehaviorSlow(behavior);
```

Consider this interleaving during concurrent setup:
1. Thread 2 runs `AddBehaviorSlow` fully: promotes to list, sets `_behaviors = [existing]`, clears `_singleBehavior`.
2. Thread 1 passes (A) (saw null), CAS (B) succeeds, but (C) now sees `_behaviors != null`.
3. Thread 1 calls `AddBehaviorSlow` and adds `behavior` to the list.
BUT between (B) and (C), _singleBehavior = behavior is briefly visible to GetNextBehavior, which reads it without a lock. A concurrent invocation thread could dispatch behavior during this window, then dispatch it again once it's in _behaviors. In practice, mock setup is usually single-threaded before invocations, but the invariant is wrong. Additionally, AddBehaviorSlow captures _singleBehavior to seed the list — if this CAS value was already captured, behavior could appear twice.
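One way to shrink that window is to make the reader consult the promoted list before the single slot, so a transiently CAS-set `_singleBehavior` is ignored once promotion has happened. The sketch below is written without knowledge of `GetNextBehavior`'s real body; the `_nextIndex` rotation and locking on the list are purely illustrative.

```csharp
// Hypothetical sketch: prefer the promoted list over the single slot. If
// _behaviors exists, a transiently set _singleBehavior is never dispatched,
// which inverts the read order that enables the double-dispatch race above.
public IBehavior? GetNextBehavior()
{
    var list = Volatile.Read(ref _behaviors);
    if (list is not null)
    {
        lock (list)
        {
            return list.Count > 0 ? list[_nextIndex++ % list.Count] : null;
        }
    }

    return Volatile.Read(ref _singleBehavior);
}
```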
3. CallRecordBuffer uses object lock instead of Lock — inconsistent (Minor)
File: TUnit.Mocks/Verification/CallRecordBuffer.cs, line 16
```csharp
private readonly object _syncRoot = new();
```

MockEngine uses C# 13's `System.Threading.Lock` type, which provides better diagnostics and a faster implementation on .NET 9+. `CallRecordBuffer`'s per-buffer lock is acquired on every write (the hot path), so this is the most impactful place to use `Lock`. Since the project targets `net8.0;net9.0;net10.0`, changing to `Lock` would be consistent and deliver additional benefit on newer runtimes.
4. Duplicate <summary> XML doc in CallRecordBuffer.cs (Minor)
File: TUnit.Mocks/Verification/CallRecordBuffer.cs, lines 56–64
```csharp
/// <summary>
/// Returns the current items array and count for lock-free iteration.
/// ...
/// </summary>
/// <summary> ← invalid; silently ignored by XML doc processor
/// Read items first, then count. ...
/// </summary>
```
/// </summary>The second <summary> block should be a <remarks> element.
5. _matchers.Length == 0 guard in CountMatchingBuffer is always false at that call site (Cosmetic)
File: TUnit.Mocks/Verification/CallVerificationBuilder.cs, line 117
The _matchers.Length == 0 fast-exit inside the loop is only reached from the else branch of WasCalled, which already handles the zero-matchers case via GetCallCountFor. The condition is always false here, adding cognitive overhead. The guard can be removed or hoisted outside the loop.
Summary Table
| # | Severity | Location | Issue |
|---|---|---|---|
| 1 | Medium | `CallRecordBuffer.cs:68-70` | `GetSnapshot` reads `_items` before `_count`; concurrent `Grow` can produce torn snapshot with pre-grow array and post-grow count |
| 2 | Low | `MethodSetup.cs:76-87` | CAS fast-path window where `_singleBehavior` is set but `_behaviors` is already a list; concurrent reader could double-dispatch |
| 3 | Minor | `CallRecordBuffer.cs:16` | Uses `object` lock instead of `Lock` (C# 13), inconsistent with rest of codebase |
| 4 | Minor | `CallRecordBuffer.cs:56-64` | Duplicate `<summary>` tags; second should be `<remarks>` |
| 5 | Cosmetic | `CallVerificationBuilder.cs:117` | `_matchers.Length == 0` check in loop is always false at this call site |
Recommendation
The core architectural change (atomic CallArrays swap, per-buffer locking, lock-free single-behavior CAS) is sound and a meaningful improvement over the previous lock-everywhere approach. Issue #1 (GetSnapshot read order) is the only one that affects memory safety correctness and should be fixed before merge — the Grow path creates a subtle torn-read hazard. Issues #2 and #3 are worth addressing. The rest are cosmetic.
Follow-up Review: perf/mock-engine-lock-free-hot-path
Thank you for the four follow-up commits. Most of the previously raised issues have been addressed correctly. Here is a status summary and the two remaining issues.
Issues Addressed Since Prior Reviews
| Prior Issue | Status |
|---|---|
| Two-array TOCTOU (`_callsByMemberId` / `_callCountByMemberId`) | Fixed — bundled into `CallArrays` with a single volatile reference swap |
| Optimistic mark-then-unmark corrupting prior verifications | Fixed — restored count-first, mark-on-success pattern |
| `GetSnapshot()` torn read (count before items) | Fixed — now reads `_items` before `_count`, consistent with writer ordering |
| `lock(this)` in `CallRecordBuffer` | Fixed — replaced with private `_syncRoot` field |
| CAS race in `AddBehavior` when `_behaviors` is promoted concurrently | Fixed — double-check after CAS (`Volatile.Read(ref _behaviors) is null` on re-entry) |
| `_callsByMemberId` non-volatile fast-path read | Fixed — consolidated into `CallArrays`, which is volatile |
| Duplicate `<summary>` XML tag in `CallRecordBuffer` | Not fixed — still present (minor, see below) |
Remaining Issue 1: GetSnapshot() read order is still the Grow-unsafe order (Medium)
File: TUnit.Mocks/Verification/CallRecordBuffer.cs, lines 68–74
The comment says:
"Read items first, then count. A writer always updates items before incrementing count (inside the lock), so this order guarantees count <= items.Length."
The code matches this intent:
```csharp
var items = Volatile.Read(ref _items);
var count = Volatile.Read(ref _count);
```

However, the Grow path undermines the guarantee. Inside `Add` (under the lock):
1. `Grow()` sets `_items = newItems` (the new, larger array)
2. `items[count] = record` writes into `newItems`
3. `_count = count + 1`
A reader that snapshots _items before Grow sets it (so it gets the old, shorter array), then snapshots _count after step 3, can observe count > items.Length. The Array.Copy(items, result, count) in ToArray() will then throw ArgumentException.
The safe order here is the reverse: read count first, then items. Because the writer increments _count only after writing into _items (whether grown or not), a reader that sees count == N is guaranteed that an _items array of at least length N was published before that count was visible.
Note that the previous review already flagged this exact scenario and the fix that was applied reads items first — which is the unsafe ordering for the Grow path. Please swap the two Volatile.Read lines:
```csharp
internal (CallRecord?[] Items, int Count) GetSnapshot()
{
    var count = Volatile.Read(ref _count); // read count first
    var items = Volatile.Read(ref _items); // items.Length is always >= count at this point
    return (items, count);
}
```

Remaining Issue 2: Duplicate `<summary>` XML doc tag (Minor)
File: TUnit.Mocks/Verification/CallRecordBuffer.cs, lines 56–64
```csharp
/// <summary>
/// Returns the current items array and count for lock-free iteration.
/// ...
/// </summary>
/// <summary> ← second <summary> is invalid XML doc
/// Read items first, then count...
/// </summary>
```

The second `<summary>` is silently discarded by the XML doc processor and will generate a compiler warning (CS1570/CS1591 depending on settings). The second block should be `<remarks>`:
```csharp
/// <summary>
/// Returns the current items array and count for lock-free iteration.
/// The returned array and count form a consistent snapshot: all items at
/// indices 0..count-1 are guaranteed to be non-null.
/// </summary>
/// <remarks>
/// Read count first, then items. A writer always updates items before incrementing
/// count (inside the lock), so this order guarantees count <= items.Length.
/// </remarks>
```

Remaining Issue 3: `_matchers.Length == 0` guard in CountMatchingBuffer is dead code (Cosmetic)
File: TUnit.Mocks/Verification/CallVerificationBuilder.cs, line 117
```csharp
if (_matchers.Length == 0 || MatchesArguments(record.Arguments))
```

`CountMatchingBuffer` is only reached from the else branch of `WasCalled` — the `_matchers.Length == 0` fast path exits earlier via `GetCallCountFor`. Inside this loop the guard is always false, adding cognitive overhead without effect. The condition can be dropped:
```csharp
if (MatchesArguments(record.Arguments))
```

The same applies to the identical guard in `MarkMatchingBuffer`.
What Looks Good
The core design is clean and well-structured:
- `CallArrays` atomic swap correctly eliminates all TOCTOU races between the two old separate array fields.
- `CallRecordBuffer` pre-allocation during setup ensures the invocation hot path never locks or allocates.
- The `AddBehavior` CAS fast path with double-check is correctly reasoned and documented.
- The non-volatile field rationale (CS0420 with `Interlocked.CompareExchange`) is sound and the `<remarks>` explaining it are helpful.
- The `CollectCallRecords` snapshot-semantics documentation is appropriate: concurrent capacity growth during verification is documented as acceptable.
- Restoring count-first, mark-on-success in `CallVerificationBuilder` eliminates the partially-verified-state bug entirely.
Issue 1 (GetSnapshot read order) is the only remaining correctness concern. It is the exact same issue as was raised in the second review pass — the fix that was applied is unfortunately the wrong ordering for the Grow path. The other two items are cosmetic.
Follow-up Review: perf/mock-engine-lock-free-hot-path (latest commit: 9e56739)
This is a follow-up to the three prior review passes. The latest commit addresses all remaining flagged issues correctly.
All Previously Raised Issues — Now Resolved
| Prior Issue | Status |
|---|---|
| Two-array TOCTOU (`_callsByMemberId` / `_callCountByMemberId`) | Fixed — `CallArrays` atomic swap |
| Optimistic mark-then-unmark corrupting prior verifications | Fixed — count-first, mark-on-success |
| `GetSnapshot()` read order (items before count — unsafe for Grow path) | Fixed — now reads count first, then items |
| Duplicate `<summary>` XML tag | Fixed — second block changed to `<remarks>` |
| Dead `_matchers.Length == 0` guard in `CountMatchingBuffer` / `MarkMatchingBuffer` | Fixed — removed from both methods |
| CAS race in `AddBehavior` when `_behaviors` is promoted concurrently | Fixed — double-check after CAS |
| `_callsByMemberId` non-volatile fast-path read | Fixed — consolidated into volatile `CallArrays` |
| `lock(this)` antipattern in `CallRecordBuffer` | Fixed — private `_syncRoot` field |
One Remaining Minor Point
CallRecordBuffer._syncRoot uses object instead of System.Threading.Lock
CallRecordBuffer.Add is called on the hot invocation path. The second review pass flagged that MockEngine uses the C# 13 System.Threading.Lock type throughout, but CallRecordBuffer uses a plain object for its per-buffer lock:
```csharp
private readonly object _syncRoot = new();
```

On .NET 9+, `System.Threading.Lock` avoids Monitor's object-header/sync-block machinery and provides better tooling diagnostics. Since the project multi-targets `net8.0;net9.0;net10.0` and `CallRecordBuffer` is the new hot-path lock, this is worth changing:
```csharp
private readonly Lock _syncRoot = new();
```

No `lock` statement changes are needed — C# 13's `lock` keyword handles both `object` and `Lock` transparently. This was raised in the prior review and is the only remaining item.
What Looks Good
The GetSnapshot fix is correct. The writer inside the lock always increments _count after writing into _items (directly or via Grow). The lock's release fence ensures both writes are visible to any subsequent Volatile.Read. A reader that snapshots count first sees value N, then reads _items — which must have been at least length N at the time count was published. The documentation in <remarks> now accurately describes this invariant.
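The invariant described above can be illustrated with a minimal model. This is not the PR's code — names follow the review, the buffer element type and sizes are placeholders — but the ordering is the one the review verifies: publish the (possibly grown) array and the element before the count, and read the count before the array.

```csharp
// Minimal model of the count-first snapshot pattern. The writer publishes the
// grown array and the new element before incrementing _count, so a reader that
// observes count == N is guaranteed an array with Length >= N.
internal sealed class SnapshotBuffer
{
    private object?[] _items = new object?[4];
    private int _count;
    private readonly object _syncRoot = new();

    public void Add(object record)
    {
        lock (_syncRoot)
        {
            var count = _count;
            if (count == _items.Length)
            {
                var grown = new object?[count * 2];
                Array.Copy(_items, grown, count);
                Volatile.Write(ref _items, grown); // publish array before count
            }
            _items[count] = record;               // publish element before count
            Volatile.Write(ref _count, count + 1); // publish count last
        }
    }

    public (object?[] Items, int Count) GetSnapshot()
    {
        var count = Volatile.Read(ref _count); // count first: Length >= count holds
        var items = Volatile.Read(ref _items);
        return (items, count);
    }
}
```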
The overall design is clean and ready to merge pending the Lock consistency fix.
Replace the shared `MockEngine.Lock` with per-member lock-free call recording and lock-free verification reads. This targets the three benchmark categories where TUnit.Mocks loses to source-generated competitors (Imposter, Mockolate):

**Callback (60% slower than Imposter)**

- `StoreCallRecord` no longer acquires `MockEngine.Lock` on the hot path
- `CallRecordBuffer` uses a lightweight per-buffer lock instead of the shared engine lock, eliminating contention between setup and invocation
- Buffers are pre-created during setup so invocations never allocate or initialize under lock

**Setup (18% slower than Mockolate)**

- `AddBehavior` uses `Interlocked.CompareExchange` for the common single-behavior case, avoiding `Lock` object allocation and lock acquisition entirely

**Verification (13% slower than Imposter)**

- Single-pass verification: count and mark verified in one iteration instead of two. Only unmarks on failure (rare path).
- `GetCallBufferFor` returns the internal buffer directly, eliminating the `.ToArray()` copy allocation on the verification happy path
- `GetCallsFor`, `MarkCallsVerified`, `GetAllCalls`, `GetUnverifiedCalls` are now lock-free reads

All 822 mock tests pass (761 + 31 + 30).
- Replace `lock(this)` with private `_syncRoot` object in `CallRecordBuffer`
- Add `GetSnapshot()` to cache the array reference for tight iteration loops, eliminating N volatile reads in `CountAndMarkBuffer`
- Remove misleading `AggressiveInlining` on `Add` (JIT can't inline lock bodies)
- Add double-check after CAS in `AddBehavior` to prevent a race with concurrent `AddBehaviorSlow` on the same `MethodSetup`
- Fix inaccurate "single-writer"/"lock-free" comment on `_callsByMemberId`
- Remove unnecessary WHAT-style doc comments
… restore count-first verification

- Bundle `_callsByMemberId`/`_callCountByMemberId` into a single volatile `CallArrays` reference to eliminate the TOCTOU race between the two fields
- Fix `GetSnapshot()` read order: items before count ensures count <= items.Length after `Grow()` swaps the array
- Restore count-first, mark-on-success verification pattern to avoid corrupting prior verifications on assertion failure
…hot safety

- Document why `_singleBehavior`/`_behaviors` aren't volatile (CS0420 with `Interlocked.CompareExchange`) and that all accesses use `Volatile.Read`/`Write`
- Document CAS full-barrier safety in the `AddBehavior` fast path
- Document `CollectCallRecords` snapshot semantics for concurrent capacity growth
- `GetSnapshot`: read count before items (count-first is safe because the writer updates items/`Grow` before incrementing count — seeing the new count guarantees `items.Length >= count`)
- Fix duplicate `<summary>` tag → `<remarks>`
- Remove dead `_matchers.Length == 0` guard in `CountMatchingBuffer`/`MarkMatchingBuffer` (already handled by the fast path in `WasCalled`)
Consistent with MockEngine; faster on .NET 9+ and better diagnostics.
Force-pushed from `9e56739` to `51fb8b6`.
Follow-up Review: perf/mock-engine-lock-free-hot-path (final pass)
This review picks up from the four previous passes. All issues identified in prior rounds have been addressed in the current code. Here is the final status.
All Previously Raised Issues — Now Resolved
| Prior Issue | Status |
|---|---|
| Two-array TOCTOU (`_callsByMemberId` / `_callCountByMemberId`) | Fixed — `CallArrays` atomic swap |
| Optimistic mark-then-unmark corrupting prior verifications | Fixed — count-first, mark-on-success |
| `GetSnapshot()` read order (items before count — unsafe for Grow path) | Fixed — reads count first, then items; `<remarks>` accurately documents the invariant |
| Duplicate `<summary>` XML tag | Fixed — second block is now `<remarks>` |
| Dead `_matchers.Length == 0` guard in `CountMatchingBuffer` / `MarkMatchingBuffer` | Fixed — removed from both methods |
| CAS race in `AddBehavior` when `_behaviors` is promoted concurrently | Fixed — double-check after CAS, falls through to `AddBehaviorSlow`, which clears the CAS-set `_singleBehavior` |
| `_callsByMemberId` non-volatile fast-path read | Fixed — consolidated into volatile `CallArrays` |
| `lock(this)` antipattern in `CallRecordBuffer` | Fixed — private `_syncRoot` field |
| `CallRecordBuffer._syncRoot` using `object` instead of `System.Threading.Lock` | Fixed — now `private readonly Lock _syncRoot = new()` |
Code Correctness Verification
GetSnapshot read order is now correct. Reads _count first (line 69), then _items (line 70). The writer inside the lock always increments _count after writing into _items (whether via direct write or Grow), and the lock's release fence ensures both stores are visible before any subsequent Volatile.Read. A reader that sees count == N is therefore guaranteed to see an _items with Length >= N. The <remarks> accurately documents this invariant.
AddBehaviorSlow correctly handles the CAS-then-fall-through case. When the fast-path CAS wins _singleBehavior but the double-check reveals _behaviors != null (a concurrent promotion happened), AddBehaviorSlow is called. Inside the lock it sees _behaviors != null, skips the promote-to-list step, calls _behaviors!.Add(behavior), and then Volatile.Write(ref _singleBehavior, null) — which correctly clears the CAS value that was transiently set. The behavior is not lost and does not appear twice.
CallArrays atomic swap eliminates all TOCTOU. EnsureCallArrayCapacity is always called under Lock, so the swap is safe. All lock-free readers take a single volatile read of _callArrays and get a consistent generation of both Buffers and Counts.
One Very Minor Observation (not a blocker)
AddBehavior window where _singleBehavior is transiently visible before the double-check.
When the fast-path CAS succeeds (line 84) but _behaviors is already non-null (detected at line 88), there is a brief window between lines 84 and 88 where _singleBehavior is set and _behaviors is also non-null. A concurrent GetNextBehavior reading in this window would return _singleBehavior (the fast path) and not look at _behaviors. In practice, setup is nearly always single-threaded and concurrent invocations before setup is complete are unsupported usage, so this is an academic concern rather than a practical one. The double-check correctly reconciles the state before returning, and the subsequent AddBehaviorSlow clears _singleBehavior. Documenting the "setup is not thread-safe" contract at the AddBehavior level (if not already documented elsewhere) would be the appropriate response — no code change required.
Summary
This PR is well-crafted. The motivation is benchmark-driven, the design decisions are documented with accurate reasoning, all identified correctness issues from prior reviews have been addressed, and the final implementation is consistent and clean. The CallArrays atomic swap, per-buffer Lock, correct GetSnapshot ordering, and count-first-mark-on-success verification pattern together form a coherent and safe lock-free design for the hot paths.
Ready to merge.
Updated [TUnit](https://github.com/thomhurst/TUnit) from 1.29.0 to 1.30.0.

<details>
<summary>Release notes</summary>

_Sourced from [TUnit's releases](https://github.com/thomhurst/TUnit/releases)._

## 1.30.0

### Other Changes

* perf: eliminate locks from mock invocation and verification hot paths by @thomhurst in thomhurst/TUnit#5422
* feat: TUnit0074 analyzer for redundant hook attributes on overrides by @thomhurst in thomhurst/TUnit#5459
* fix(mocks): respect generic type argument accessibility (#5453) by @thomhurst in thomhurst/TUnit#5460
* fix(mocks): skip inaccessible internal accessors when mocking Azure.Response by @thomhurst in thomhurst/TUnit#5461
* fix: apply CultureAttribute and STAThreadExecutorAttribute to hooks (#5452) by @thomhurst in thomhurst/TUnit#5463

### Dependencies

* chore(deps): update tunit to 1.29.0 by @thomhurst in thomhurst/TUnit#5446
* chore(deps): update react to ^19.2.5 by @thomhurst in thomhurst/TUnit#5457
* chore(deps): update opentelemetry to 1.15.2 by @thomhurst in thomhurst/TUnit#5456
* chore(deps): update dependency qs to v6.15.1 by @thomhurst in thomhurst/TUnit#5458

**Full Changelog**: thomhurst/TUnit@v1.29.0...v1.30.0

Commits viewable in [compare view](thomhurst/TUnit@v1.29.0...v1.30.0).
</details>

Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting `@dependabot rebase`.
[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)

---

<details>
<summary>Dependabot commands and options</summary>
<br />

You can trigger Dependabot actions by commenting on this PR:

- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits that have been made to it
- `@dependabot show <dependency name> ignore conditions` will show all of the ignore conditions of the specified dependency
- `@dependabot ignore this major version` will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)
- `@dependabot ignore this minor version` will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)
- `@dependabot ignore this dependency` will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)
</details>

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Summary
- Replace the shared `MockEngine.Lock` with pre-allocated per-member `CallRecordBuffer` instances that use lightweight per-buffer locks, eliminating contention between setup and invocation phases
- Use `Interlocked.CompareExchange` for the common single-behavior setup case, avoiding `Lock` object allocation and lock acquisition entirely
- Eliminate the `.ToArray()` allocation on the happy path

These changes target the three mock benchmark categories where TUnit.Mocks currently loses to source-generated competitors:
- Callback: lock-free `StoreCallRecord`
- Setup: no `Lock` allocation, and CAS in `AddBehavior` for the single-behavior case
- Verification: no `.ToArray()` copy allocation

All 822 mock tests pass (761 runtime + 31 source-gen + 30 analyzer).
Test plan
- `TUnit.Mocks.Tests` — 761 passed (includes concurrent mock access tests)
- `TUnit.Mocks.SourceGenerator.Tests` — 31 passed
- `TUnit.Mocks.Analyzers.Tests` — 30 passed