perf: eliminate locks from mock invocation and verification hot paths by thomhurst · Pull Request #5422 · thomhurst/TUnit

thomhurst · 2026-04-06T09:46:37Z

Summary

Lock-free call recording: Replace the shared MockEngine.Lock with pre-allocated per-member CallRecordBuffer that uses lightweight per-buffer locks, eliminating contention between setup and invocation phases
Lock-free AddBehavior: Use Interlocked.CompareExchange for the common single-behavior setup case, avoiding Lock object allocation and lock acquisition entirely
Single-pass verification: Count and mark verified in one iteration instead of two, with zero-copy buffer access eliminating .ToArray() allocation on the happy path

These changes target the three mock benchmark categories where TUnit.Mocks currently loses to source-generated competitors:

Benchmark	Gap vs Best Rival	Root Cause Addressed
Callback	60% slower than Imposter	Shared lock in `StoreCallRecord` + Lock alloc in `AddBehavior`
Setup	18% slower than Mockolate	Lock acquisition in `AddBehavior` for single-behavior case
Verification	13% slower than Imposter	Two-pass iteration + `.ToArray()` copy allocation

All 822 mock tests pass (761 runtime + 31 source-gen + 30 analyzer).

Test plan

TUnit.Mocks.Tests — 761 passed (includes concurrent mock access tests)
TUnit.Mocks.SourceGenerator.Tests — 31 passed
TUnit.Mocks.Analyzers.Tests — 30 passed
CI mock benchmark workflow to measure actual improvement

codacy-production · 2026-04-06T09:49:34Z

Not up to standards ⛔

🔴 Issues 3 minor

Alerts:
⚠ 3 issues (≤ 0 issues of at least minor severity)

Results:
3 new issues

Category Results

CodeStyle 3 minor

View in Codacy

🟢 Metrics 25 complexity

Metric Results

Complexity 25

View in Codacy

_{TIP This summary will be updated as you push new changes. Give us feedback}

claude

Code Review: perf/mock-engine-lock-free-hot-path

The motivation here is clear and the benchmark data is well-presented. The approach — per-buffer locks, CAS for single-behavior setup, and single-pass verification — is the right direction. However, there are several correctness issues that need addressing before merge.

🚨 Critical: Bounds mismatch between `calls` and `counts` arrays

File: MockEngine.cs — StoreCallRecord fast path

var calls  = _callsByMemberId;       // non-volatile read — could be generation N+1
var counts = _callCountByMemberId;   // non-volatile read — could be generation N

if (calls is not null && counts is not null && (uint)memberId < (uint)calls.Length)
{
    var buffer = calls[memberId];
    if (buffer is not null)
    {
        buffer.Add(record);
        Interlocked.Increment(ref counts[memberId]);  // OOB if counts is old array
        return record;
    }
}

The bounds check only guards against calls.Length, not counts.Length. EnsureCallArrayCapacity writes _callsByMemberId = newByMember then _callCountByMemberId = newCounts as two separate non-volatile stores. A racing reader could see the new (larger) calls array while still seeing the old (smaller) counts array. If memberId is in range of the new array but not the old, Interlocked.Increment(ref counts[memberId]) throws IndexOutOfRangeException.

Fix: Bundle both arrays into a single struct/record and replace them atomically:

private volatile CallArrays? _callArrays;  // single volatile reference swap

private sealed class CallArrays
{
    public readonly CallRecordBuffer?[] Calls;
    public readonly int[] Counts;
    public CallArrays(int size) { Calls = new CallRecordBuffer?[size]; Counts = new int[size]; }
}

This eliminates the two-array TOCTOU entirely — a single Volatile.Read gives you a consistent snapshot of both.

🚨 Critical: "Optimistic mark + unmark" corrupts previously-verified records

File: CallVerificationBuilder.cs — WasCalled

var matchingCount = CountAndMarkBuffer(buffer, markVerified: true);   // marks ALL matching

if (!times.Matches(matchingCount))
{
    CountAndMarkBuffer(buffer, markVerified: false);  // unmarks ALL matching — including ones
                                                      // verified by a PRIOR WasCalled call!
    throw ...;
}

CountAndMarkBuffer with markVerified: false unconditionally sets IsVerified = false on every matching record. If a record was already verified by a previous WasCalled assertion, the failed second assertion will un-verify it, causing VerifyNoOtherCalls() to report a false failure.

The original two-pass approach was correct: count first, then mark only if the count passes. Restoring that contract is the right fix here. The allocation of GetCallsFor that this avoids is only on the throw path anyway — the exception itself is far more expensive than an array allocation.

// Safe single-pass approach: count first, never unmark
var matchingCount = CountMatchingBuffer(buffer);

if (!times.Matches(matchingCount))
{
    throw new MockVerificationException(...);
}

// Only mark after successful assertion
MarkMatchingBuffer(buffer);

⚠️ Significant: CAS race in `AddBehavior` can leave behaviors split across both fields

File: MethodSetup.cs — AddBehavior

Consider this interleaving:

Thread A: Volatile.Read(ref _behaviors) is null → true
Thread B (in AddBehaviorSlow): promotes to list, Volatile.Write(ref _behaviors, [existing]), then Volatile.Write(ref _singleBehavior, null)
Thread A: CAS on _singleBehavior with expected null — succeeds (B just cleared it!)
Thread A returns. Now _singleBehavior is set AND _behaviors is non-null.

GetNextBehavior checks _singleBehavior first and returns it early, so the behaviors in _behaviors are silently ignored.

The fix is to re-check _behaviors after a successful CAS, or check both conditions atomically under the lock:

public void AddBehavior(IBehavior behavior)
{
    // CAS is only safe if _behaviors is still null after the exchange.
    // Re-read _behaviors after CAS to detect the race.
    if (Volatile.Read(ref _behaviors) is null &&
        Interlocked.CompareExchange(ref _singleBehavior, behavior, null) is null &&
        Volatile.Read(ref _behaviors) is null)  // double-check after CAS
    {
        return;
    }
    AddBehaviorSlow(behavior);
}

ℹ️ Minor: `lock(this)` in `CallRecordBuffer`

File: CallRecordBuffer.cs — Add

lock(this) on internal sealed class is low-risk since no external code can grab the monitor, but it's a well-known antipattern. A private dedicated _lock object is the idiomatic replacement and costs one extra pointer. This is a style preference — not a blocker.

ℹ️ Minor: `_callsByMemberId` non-volatile field now accessed without lock

The field was always lock-protected before; the fast path now reads it without Volatile.Read. The JIT is allowed to hoist this load. Marking the field volatile (or using Volatile.Read) restores the happens-before guarantee that the previous lock provided for free.

What's good

The CallRecordBuffer design (pre-allocated, per-member lock, volatile count publish) is solid and the API surface is clean.
Splitting StoreCallRecord into fast/slow paths is idiomatic and [MethodImpl(MethodImplOptions.NoInlining)] on the slow path is the right call.
Replacing the shared MockEngine.Lock with per-buffer locks is the correct architectural move; the contention analysis in the PR description is accurate.
The CollectInto / ForEach zero-copy iteration API on CallRecordBuffer is a nice improvement over the previous AddRange(list) pattern.

The issues above are real correctness risks in concurrent test runs. The two critical ones (bounds mismatch and optimistic unmark) are straightforward to fix and the overall approach is worth landing once they're addressed.

claude · 2026-04-06T09:53:53Z

Code Review: perf: eliminate locks from mock invocation and verification hot paths

Overall this is a well-motivated performance PR with a clear benchmark-driven rationale. The design is largely sound, but there are several correctness and design concerns worth addressing before merge.

1. `GetSnapshot()` has a torn read hazard in `CallRecordBuffer`

The lock-free snapshot in GetSnapshot() reads _count and _items in two separate volatile reads:

internal (CallRecord?[] Items, int Count) GetSnapshot()
{
    var count = Volatile.Read(ref _count);
    var items = Volatile.Read(ref _items);
    return (items, count);
}

Between the two reads, a concurrent Grow() + Add() can replace _items with a larger array while _count has already been incremented past what the old array contains. The result would be a count that reflects the new array but items pointing to the old, shorter array — leading to an IndexOutOfRangeException or accessing stale null slots beyond the original array capacity.

Fix: Read _items first, then _count. A writer always increments _count last (inside the lock, after updating _items), so reading in that order gives a safe lower-bound count that is always consistent with the captured array reference:

internal (CallRecord?[] Items, int Count) GetSnapshot()
{
    var items = Volatile.Read(ref _items);   // read array reference first
    var count = Volatile.Read(ref _count);   // count is always <= items.Length at this point
    return (items, count);
}

2. `AddBehavior` CAS has a silent data loss window

The fast-path CAS in AddBehavior has a subtle hole:

if (Volatile.Read(ref _behaviors) is null
    && Interlocked.CompareExchange(ref _singleBehavior, behavior, null) is null)
{
    if (Volatile.Read(ref _behaviors) is null)
        return;   // early return — behavior "installed"
}
AddBehaviorSlow(behavior);

If a second concurrent AddBehavior call also wins the CAS (impossible for the same CAS, but consider the "lost CAS" case) — actually the larger problem is the double-check on _behaviors. When the first CAS succeeds and _behaviors is still null at the double-check, the method returns. But _singleBehavior is now set and _behaviors is null. Then a concurrent AddBehaviorSlow runs, reads _singleBehavior as null (not yet set because of CPU reordering — _singleBehavior is no longer volatile after this PR), initializes _behaviors to [], and adds the new behavior, losing the first behavior entirely.

Root cause: Removing volatile from _singleBehavior and _behaviors (lines 25–26) means the CAS in the fast path provides sequential consistency only for _singleBehavior, not for _behaviors. The Volatile.Read guards are correct but the non-volatile field declaration means the JIT / CPU is free to cache old values between the calls to Volatile.Read.

The fields must remain volatile, or all accesses must go through Volatile.Read/Volatile.Write (which the fast path CAS already does). Since Interlocked.CompareExchange on a non-volatile field is valid, the deeper issue is that AddBehaviorSlow reads _singleBehavior with Volatile.Read but the window between Volatile.Read(ref _behaviors) returning null and reading _singleBehavior is not atomic, so a concurrent CAS could succeed in between.

This is a narrow race but it is a real correctness issue for concurrent setup scenarios.

3. `_callsByMemberId` is no longer `volatile` but is read lock-free in several places

In the original code _callsByMemberId was implicitly guarded by Lock. The new code reads it without a lock in StoreCallRecord (fast path), GetCallsFor, GetCallBufferFor, MarkCallsVerified, GetAllCalls, GetUnverifiedCalls, GetDiagnostics, and CollectCallRecords. However the field is declared as:

private CallRecordBuffer?[]? _callsByMemberId;

There is no volatile keyword and no Volatile.Read when this field is read. In StoreCallRecord, the local copy is taken with a plain read:

var calls = _callsByMemberId;

Without volatile, the JIT is allowed to hoist this read out of loops or cache it in a register, meaning a freshly-allocated array written in StoreCallRecordSlow might never be visible on the fast path. _callCountByMemberId is volatile (correct), but _callsByMemberId is not. This should either be declared volatile or all reads should use Volatile.Read.

4. Optimistic mark-then-unmark in `WasCalled` has observable side effects

var matchingCount = CountAndMarkBuffer(buffer, markVerified: true);

if (!times.Matches(matchingCount))
{
    // Unmark on failure — rare path
    CountAndMarkBuffer(buffer, markVerified: false);
    ...
    throw new MockVerificationException(...);
}

The comment acknowledges the unmark-on-failure path. However, this creates a TOCTOU window: another concurrent verification that runs between the optimistic mark and the unmark will observe calls as already verified and skip them. In single-threaded test teardown this is unlikely, but VerifyAll and VerifyNoOtherCalls are often called from cleanup hooks that could run concurrently across tests. The previous two-pass approach (count first, mark second) was safer because it never left a partially-verified state visible.

If the performance win is important, consider separating the pass into count-only (no writes), then mark only on success — matching the old behavior but without the ToArray() allocation, by iterating the buffer directly.

5. `GetAllCalls` and `GetUnverifiedCalls` are no longer protected by a lock

These two methods previously held Lock for the duration of CollectCallRecords. The new code calls them lock-free. While individual buffer accesses via GetSnapshot() are safe (see issue #1 above), the outer loop over _callsByMemberId is not atomic:

foreach (var buffer in _callsByMemberId)
{
    buffer?.CollectInto(target, filter);
}

If EnsureCallArrayCapacity replaces _callsByMemberId with a new larger array during this loop (via StoreCallRecordSlow), the foreach is iterating over a snapshot of the old array reference (captured by the foreach enumerator), which is fine. But if a concurrent setup call triggers EnsureCallArrayCapacity, the new buffers for newly-added members will not appear in this snapshot. This is probably acceptable for a test framework, but should be documented explicitly.

Minor / Design suggestions

GetCallBufferFor is internal: exposing the mutable buffer directly to CallVerificationBuilder tightly couples the two types. A VerifyWithMatchers(IArgumentMatcher[], Times, ...) method on MockEngine would keep the buffer encapsulated.
Predicate<T> → Func<T, bool> change in CollectCallRecords: functionally identical but inconsistent with other uses of Predicate<T> in the codebase. Not a blocker, but pick one and be consistent.
CallRecordBuffer initial capacity of 4: Fine for most cases. Worth a note that this is a trade-off — lower capacity means more Grow() calls for heavily-exercised members.

Summary

Issue	Severity
`GetSnapshot()` torn read (`_count` before `_items`)	High — can cause OOBE or stale null slots
`_callsByMemberId` missing `volatile` on lock-free reads	High — can cause invisible writes on some JIT/CPU combos
`_singleBehavior`/`_behaviors` removed `volatile` keyword	Medium — CAS provides ordering for those fields but dependent reads in slow path may not see updates
Optimistic mark-then-unmark in `WasCalled`	Low — narrow race in concurrent teardown
`GetAllCalls`/`GetUnverifiedCalls` lock removal	Low — acceptable with a documentation note

The performance direction is correct. Fixing the GetSnapshot() read order and adding volatile back to _callsByMemberId (or using Volatile.Read consistently) would make this safe to merge.

claude

Code Review: perf/mock-engine-lock-free-hot-path

Summary

This PR replaces TUnit.Mocks' shared MockEngine.Lock in the invocation hot path with three complementary lock-free strategies:

CallArrays wrapper — Bundles CallRecordBuffer?[] and int[] into a single volatile object reference so both arrays are swapped atomically, eliminating the TOCTOU bug from the previous separate volatile array fields.
CallRecordBuffer — Per-member append-only buffer with its own small _syncRoot. Writers lock only their buffer; readers take a lock-free snapshot via Volatile.Read.
AddBehavior CAS fast path — Uses Interlocked.CompareExchange to avoid allocating and acquiring a Lock object when only a single behavior is registered (the common case).

The prior review comment about the two-array TOCTOU bug was correctly diagnosed, and this PR directly resolves it with the CallArrays wrapper. That fix is solid.

Positive Aspects

CallArrays atomic swap is the right fix. Bundling both arrays into one volatile reference is the textbook pattern for this problem. Since EnsureCallArrayCapacity is always called under Lock, the swap itself is safe, and readers outside the lock get a consistent generation.

CallRecordBuffer.GetSnapshot memory ordering intent is sound. The code attempts to exploit the "writer stores element before incrementing count" invariant, which is a well-known lock-free pattern.

[MethodImpl(MethodImplOptions.NoInlining)] on slow paths keeps hot-path JIT output cleaner and prevents lock-allocation code from being inlined.

Removal of .ToArray() in CallVerificationBuilder on the happy path (when count matches) is a genuine allocation win for the "test passes" case.

Issues Found

1. `GetSnapshot` read order is inverted — potential torn read on Grow (Medium)

File: TUnit.Mocks/Verification/CallRecordBuffer.cs, lines 68–70

var items = Volatile.Read(ref _items);   // reads items FIRST
var count = Volatile.Read(ref _count);   // reads count SECOND

The comment says "Read items first, then count. A writer always updates items before incrementing count." However, consider the Grow path:

Grow() sets _items = newItems (the new, larger array)
Add writes items[count] = record
Add increments _count

A reader that snaps _items (pre-grow, smaller array) and then snaps _count (post-grow, larger count) can produce a snapshot where count > items.Length. Passing this to Array.Copy(items, result, count) will throw ArgumentException or silently read stale/garbage entries.

The safe pattern for this classic "array + count" lock-free snapshot is the reverse: read count first, then items. A reader that sees count == N is guaranteed to see an _items that was at least length N at that point.

Alternatively, use Volatile.Write(ref _items, newItems) inside Grow() to publish the new array with a store-release fence before any subsequent count increment is visible.

// Safe: read count first, then items
var count = Volatile.Read(ref _count);
var items = Volatile.Read(ref _items);
return (items, count);

2. CAS fast path in `AddBehavior` has a double-dispatch window (Low)

File: TUnit.Mocks/Setup/MethodSetup.cs, lines 76–87

if (Volatile.Read(ref _behaviors) is null                    // (A)
    && Interlocked.CompareExchange(ref _singleBehavior, behavior, null) is null)  // (B)
{
    if (Volatile.Read(ref _behaviors) is null)               // (C)
        return;
}
AddBehaviorSlow(behavior);

Consider this interleaving during concurrent setup:

Thread 2 runs AddBehaviorSlow fully: promotes to list, sets _behaviors = [existing], clears _singleBehavior.
Thread 1 passes (A) (saw null), CAS (B) succeeds, but (C) now sees _behaviors != null.
Thread 1 calls AddBehaviorSlow and adds behavior to the list.

BUT between (B) and (C), _singleBehavior = behavior is briefly visible to GetNextBehavior, which reads it without a lock. A concurrent invocation thread could dispatch behavior during this window, then dispatch it again once it's in _behaviors. In practice, mock setup is usually single-threaded before invocations, but the invariant is wrong. Additionally, AddBehaviorSlow captures _singleBehavior to seed the list — if this CAS value was already captured, behavior could appear twice.

3. `CallRecordBuffer` uses `object` lock instead of `Lock` — inconsistent (Minor)

File: TUnit.Mocks/Verification/CallRecordBuffer.cs, line 16

private readonly object _syncRoot = new();

MockEngine uses C# 13's System.Threading.Lock type, which provides better diagnostics and a faster implementation on .NET 9+. CallRecordBuffer's per-buffer lock is acquired on every write (the hot path), so this is the most impactful place to use Lock. Since the project targets net8.0;net9.0;net10.0, changing to Lock would be consistent and deliver additional benefit on newer runtimes.

4. Duplicate `<summary>` XML doc in `CallRecordBuffer.cs` (Minor)

File: TUnit.Mocks/Verification/CallRecordBuffer.cs, lines 56–64

/// <summary>
/// Returns the current items array and count for lock-free iteration.
/// ...
/// </summary>
/// <summary>                        ← invalid; silently ignored by XML doc processor
/// Read items first, then count. ...
/// </summary>

The second <summary> block should be a <remarks> element.

5. `_matchers.Length == 0` guard in `CountMatchingBuffer` is always false at that call site (Cosmetic)

File: TUnit.Mocks/Verification/CallVerificationBuilder.cs, line 117

The _matchers.Length == 0 fast-exit inside the loop is only reached from the else branch of WasCalled, which already handles the zero-matchers case via GetCallCountFor. The condition is always false here, adding cognitive overhead. The guard can be removed or hoisted outside the loop.

Summary Table

#	Severity	Location	Issue
1	Medium	`CallRecordBuffer.cs:68-70`	`GetSnapshot` reads `_items` before `_count`; concurrent `Grow` can produce torn snapshot with pre-grow array and post-grow count
2	Low	`MethodSetup.cs:76-87`	CAS fast path window where `_singleBehavior` is set but `_behaviors` is already a list; concurrent reader could double-dispatch
3	Minor	`CallRecordBuffer.cs:16`	Uses `object` lock instead of `Lock` (C# 13), inconsistent with rest of codebase
4	Minor	`CallRecordBuffer.cs:56-64`	Duplicate `<summary>` tags; second should be `<remarks>`
5	Cosmetic	`CallVerificationBuilder.cs:117`	`_matchers.Length == 0` check in loop is always false at this call site

Recommendation

The core architectural change (atomic CallArrays swap, per-buffer locking, lock-free single-behavior CAS) is sound and a meaningful improvement over the previous lock-everywhere approach. Issue #1 (GetSnapshot read order) is the only one that affects memory safety correctness and should be fixed before merge — the Grow path creates a subtle torn-read hazard. Issues #2 and #3 are worth addressing. The rest are cosmetic.

claude

Follow-up Review: perf/mock-engine-lock-free-hot-path

Thank you for the four follow-up commits. Most of the previously raised issues have been addressed correctly. Here is a status summary and the two remaining issues.

Issues Addressed Since Prior Reviews

Prior Issue	Status
Two-array TOCTOU (`_callsByMemberId` / `_callCountByMemberId`)	Fixed — bundled into `CallArrays` with a single `volatile` reference swap
Optimistic mark-then-unmark corrupting prior verifications	Fixed — restored count-first, mark-on-success pattern
`GetSnapshot()` torn read (count before items)	Fixed — now reads `_items` before `_count`, consistent with writer ordering
`lock(this)` in `CallRecordBuffer`	Fixed — replaced with private `_syncRoot` field
CAS race in `AddBehavior` when `_behaviors` is promoted concurrently	Fixed — double-check after CAS (`Volatile.Read(ref _behaviors) is null` on re-entry)
`_callsByMemberId` non-volatile fast-path read	Fixed — consolidated into `CallArrays` which is `volatile`
Duplicate `<summary>` XML tag in `CallRecordBuffer`	Not fixed — still present (minor, see below)

Remaining Issue 1: `GetSnapshot()` read order is still the Grow-unsafe order (Medium)

File: TUnit.Mocks/Verification/CallRecordBuffer.cs, lines 68–74

The comment says:

"Read items first, then count. A writer always updates items before incrementing count (inside the lock), so this order guarantees count <= items.Length."

The code matches this intent:

var items = Volatile.Read(ref _items);
var count = Volatile.Read(ref _count);

However, the Grow path undermines the guarantee. Inside Add (under the lock):

Grow() sets _items = newItems (the new, larger array)
items[count] = record writes into newItems
_count = count + 1

A reader that snapshots _items before Grow sets it (so it gets the old, shorter array), then snapshots _count after step 3, can observe count > items.Length. The Array.Copy(items, result, count) in ToArray() will then throw ArgumentException.

The safe order here is the reverse: read count first, then items. Because the writer increments _count only after writing into _items (whether grown or not), a reader that sees count == N is guaranteed that an _items array of at least length N was published before that count was visible.

Note that the previous review already flagged this exact scenario and the fix that was applied reads items first — which is the unsafe ordering for the Grow path. Please swap the two Volatile.Read lines:

internal (CallRecord?[] Items, int Count) GetSnapshot()
{
    var count = Volatile.Read(ref _count);   // read count first
    var items = Volatile.Read(ref _items);   // items.Length is always >= count at this point
    return (items, count);
}

Remaining Issue 2: Duplicate `<summary>` XML doc tag (Minor)

File: TUnit.Mocks/Verification/CallRecordBuffer.cs, lines 56–64

/// <summary>
/// Returns the current items array and count for lock-free iteration.
/// ...
/// </summary>
/// <summary>                      ← second <summary> is invalid XML doc
/// Read items first, then count...
/// </summary>

The second <summary> is silently discarded by the XML doc processor and will generate a compiler warning (CS1570/CS1591 depending on settings). The second block should be <remarks>:

/// <summary>
/// Returns the current items array and count for lock-free iteration.
/// The returned array and count form a consistent snapshot: all items at
/// indices 0..count-1 are guaranteed to be non-null.
/// </summary>
/// <remarks>
/// Read count first, then items. A writer always updates items before incrementing
/// count (inside the lock), so this order guarantees count &lt;= items.Length.
/// </remarks>

Remaining Issue 3: `_matchers.Length == 0` guard in `CountMatchingBuffer` is dead code (Cosmetic)

File: TUnit.Mocks/Verification/CallVerificationBuilder.cs, line 117

if (_matchers.Length == 0 || MatchesArguments(record.Arguments))

CountMatchingBuffer is only reached from the else branch of WasCalled — the _matchers.Length == 0 fast path exits earlier via GetCallCountFor. Inside this loop the guard is always false, adding cognitive overhead without effect. The condition can be dropped:

if (MatchesArguments(record.Arguments))

The same applies to the identical guard in MarkMatchingBuffer.

What Looks Good

The core design is clean and well-structured:

CallArrays atomic swap correctly eliminates all TOCTOU races between the two old separate array fields.
CallRecordBuffer pre-allocation during setup ensures the invocation hot path never locks or allocates.
The AddBehavior CAS fast path with double-check is correctly reasoned and documented.
The non-volatile field rationale (CS0420 with Interlocked.CompareExchange) is sound and the <remarks> explaining it are helpful.
The CollectCallRecords snapshot-semantics documentation is appropriate: concurrent capacity growth during verification is documented as acceptable.
Restoring count-first, mark-on-success in CallVerificationBuilder eliminates the partially-verified-state bug entirely.

Issue 1 (GetSnapshot read order) is the only remaining correctness concern. It is the exact same issue as was raised in the second review pass — the fix that was applied is unfortunately the wrong ordering for the Grow path. The other two items are cosmetic.

claude

Follow-up Review: perf/mock-engine-lock-free-hot-path (latest commit: `9e56739`)

This is a follow-up to the three prior review passes. The latest commit addresses all remaining flagged issues correctly.

All Previously Raised Issues — Now Resolved

Prior Issue	Status
Two-array TOCTOU (`_callsByMemberId` / `_callCountByMemberId`)	Fixed — `CallArrays` atomic swap
Optimistic mark-then-unmark corrupting prior verifications	Fixed — count-first, mark-on-success
`GetSnapshot()` read order (items before count — unsafe for Grow path)	Fixed — now reads count first, then items
Duplicate `<summary>` XML tag	Fixed — second block changed to `<remarks>`
Dead `_matchers.Length == 0` guard in `CountMatchingBuffer` / `MarkMatchingBuffer`	Fixed — removed from both methods
CAS race in `AddBehavior` when `_behaviors` is promoted concurrently	Fixed — double-check after CAS
`_callsByMemberId` non-volatile fast-path read	Fixed — consolidated into `volatile CallArrays`
Duplicate `lock(this)` antipattern in `CallRecordBuffer`	Fixed — private `_syncRoot` field

One Remaining Minor Point

CallRecordBuffer._syncRoot uses object instead of System.Threading.Lock

CallRecordBuffer.Add is called on the hot invocation path. The second review pass flagged that MockEngine uses the C# 13 System.Threading.Lock type throughout, but CallRecordBuffer uses a plain object for its per-buffer lock:

private readonly object _syncRoot = new();

On .NET 9+, System.Threading.Lock avoids the kernel transition and provides better tooling diagnostics. Since the project multi-targets net8.0;net9.0;net10.0 and CallRecordBuffer is the new hot-path lock, this is worth changing:

private readonly Lock _syncRoot = new();

No lock statement changes needed — C# 13's lock keyword handles both object and Lock transparently. This was raised in the prior review and is the only remaining item.

What Looks Good

The GetSnapshot fix is correct. The writer inside the lock always increments _count after writing into _items (directly or via Grow). The lock's release fence ensures both writes are visible to any subsequent Volatile.Read. A reader that snapshots count first sees value N, then reads _items — which must have been at least length N at the time count was published. The documentation in <remarks> now accurately describes this invariant.

The overall design is clean and ready to merge pending the Lock consistency fix.

Replace the shared MockEngine.Lock with per-member lock-free call recording and lock-free verification reads. This targets the three benchmark categories where TUnit.Mocks loses to source-generated competitors (Imposter, Mockolate): **Callback (60% slower than Imposter)** - StoreCallRecord no longer acquires MockEngine.Lock on the hot path - CallRecordBuffer uses a lightweight per-buffer lock instead of the shared engine lock, eliminating contention between setup and invocation - Buffers are pre-created during setup so invocations never allocate or initialize under lock **Setup (18% slower than Mockolate)** - AddBehavior uses Interlocked.CompareExchange for the common single-behavior case, avoiding Lock object allocation and lock acquisition entirely **Verification (13% slower than Imposter)** - Single-pass verification: count and mark verified in one iteration instead of two. Only unmarks on failure (rare path). - GetCallBufferFor returns the internal buffer directly, eliminating the .ToArray() copy allocation on the verification happy path - GetCallsFor, MarkCallsVerified, GetAllCalls, GetUnverifiedCalls are now lock-free reads All 822 mock tests pass (761 + 31 + 30).

- Replace lock(this) with private _syncRoot object in CallRecordBuffer - Add GetSnapshot() to cache array reference for tight iteration loops, eliminating N volatile reads in CountAndMarkBuffer - Remove misleading AggressiveInlining on Add (JIT can't inline lock bodies) - Add double-check after CAS in AddBehavior to prevent race with concurrent AddBehaviorSlow on the same MethodSetup - Fix inaccurate "single-writer"/"lock-free" comment on _callsByMemberId - Remove unnecessary WHAT-style doc comments

… restore count-first verification - Bundle _callsByMemberId/_callCountByMemberId into single volatile CallArrays reference to eliminate TOCTOU race between the two fields - Fix GetSnapshot() read order: items before count ensures count <= items.Length after Grow() swaps the array - Restore count-first, mark-on-success verification pattern to avoid corrupting prior verifications on assertion failure

…hot safety - Document why _singleBehavior/_behaviors aren't volatile (CS0420 with Interlocked.CompareExchange) and that all accesses use Volatile.Read/Write - Document CAS full-barrier safety in AddBehavior fast path - Document CollectCallRecords snapshot semantics for concurrent capacity growth

- GetSnapshot: read count before items (count-first is safe because writer updates items/Grow before incrementing count — seeing new count guarantees items.Length >= count) - Fix duplicate <summary> tag → <remarks> - Remove dead _matchers.Length == 0 guard in CountMatchingBuffer/MarkMatchingBuffer (already handled by the fast path in WasCalled)

Consistent with MockEngine; faster on .NET 9+ and better diagnostics.

claude

Follow-up Review: perf/mock-engine-lock-free-hot-path (final pass)

This review picks up from the four previous passes. All issues identified in prior rounds have been addressed in the current code. Here is the final status.

All Previously Raised Issues — Now Resolved

Prior Issue	Status
Two-array TOCTOU (`_callsByMemberId` / `_callCountByMemberId`)	Fixed — `CallArrays` atomic swap
Optimistic mark-then-unmark corrupting prior verifications	Fixed — count-first, mark-on-success
`GetSnapshot()` read order (items before count — unsafe for Grow path)	Fixed — reads `count` first, then `items`; `<remarks>` accurately documents the invariant
Duplicate `<summary>` XML tag	Fixed — second block is now `<remarks>`
Dead `_matchers.Length == 0` guard in `CountMatchingBuffer` / `MarkMatchingBuffer`	Fixed — removed from both methods
CAS race in `AddBehavior` when `_behaviors` is promoted concurrently	Fixed — double-check after CAS, falls through to `AddBehaviorSlow` which clears the CAS-set `_singleBehavior`
`_callsByMemberId` non-volatile fast-path read	Fixed — consolidated into `volatile CallArrays`
`lock(this)` antipattern in `CallRecordBuffer`	Fixed — private `_syncRoot` field
`CallRecordBuffer._syncRoot` using `object` instead of `System.Threading.Lock`	Fixed — now `private readonly Lock _syncRoot = new()`

Code Correctness Verification

GetSnapshot read order is now correct. Reads _count first (line 69), then _items (line 70). The writer inside the lock always increments _count after writing into _items (whether via direct write or Grow), and the lock's release fence ensures both stores are visible before any subsequent Volatile.Read. A reader that sees count == N is therefore guaranteed to see an _items with Length >= N. The <remarks> accurately documents this invariant.

AddBehaviorSlow correctly handles the CAS-then-fall-through case. When the fast-path CAS wins _singleBehavior but the double-check reveals _behaviors != null (a concurrent promotion happened), AddBehaviorSlow is called. Inside the lock it sees _behaviors != null, skips the promote-to-list step, calls _behaviors!.Add(behavior), and then Volatile.Write(ref _singleBehavior, null) — which correctly clears the CAS value that was transiently set. The behavior is not lost and does not appear twice.

CallArrays atomic swap eliminates all TOCTOU. EnsureCallArrayCapacity is always called under Lock, so the swap is safe. All lock-free readers take a single volatile read of _callArrays and get a consistent generation of both Buffers and Counts.

One Very Minor Observation (not a blocker)

AddBehavior window where _singleBehavior is transiently visible before the double-check.

When the fast-path CAS succeeds (line 84) but _behaviors is already non-null (detected at line 88), there is a brief window between lines 84 and 88 where _singleBehavior is set and _behaviors is also non-null. A concurrent GetNextBehavior reading in this window would return _singleBehavior (the fast path) and not look at _behaviors. In practice, setup is nearly always single-threaded and concurrent invocations before setup is complete are unsupported usage, so this is an academic concern rather than a practical one. The double-check correctly reconciles the state before returning, and the subsequent AddBehaviorSlow clears _singleBehavior. Documenting the "setup is not thread-safe" contract at the AddBehavior level (if not already documented elsewhere) would be the appropriate response — no code change required.

Summary

This PR is well-crafted. The motivation is benchmark-driven, the design decisions are documented with accurate reasoning, all identified correctness issues from prior reviews have been addressed, and the final implementation is consistent and clean. The CallArrays atomic swap, per-buffer Lock, correct GetSnapshot ordering, and count-first-mark-on-success verification pattern together form a coherent and safe lock-free design for the hot paths.

Ready to merge.

Updated [TUnit](https://github.com/thomhurst/TUnit) from 1.29.0 to 1.30.0. <details> <summary>Release notes</summary> _Sourced from [TUnit's releases](https://github.com/thomhurst/TUnit/releases)._ ## 1.30.0  ## What's Changed ### Other Changes * perf: eliminate locks from mock invocation and verification hot paths by @thomhurst in thomhurst/TUnit#5422 * feat: TUnit0074 analyzer for redundant hook attributes on overrides by @thomhurst in thomhurst/TUnit#5459 * fix(mocks): respect generic type argument accessibility (#5453) by @thomhurst in thomhurst/TUnit#5460 * fix(mocks): skip inaccessible internal accessors when mocking Azure.Response by @thomhurst in thomhurst/TUnit#5461 * fix: apply CultureAttribute and STAThreadExecutorAttribute to hooks (#5452) by @thomhurst in thomhurst/TUnit#5463 ### Dependencies * chore(deps): update tunit to 1.29.0 by @thomhurst in thomhurst/TUnit#5446 * chore(deps): update react to ^19.2.5 by @thomhurst in thomhurst/TUnit#5457 * chore(deps): update opentelemetry to 1.15.2 by @thomhurst in thomhurst/TUnit#5456 * chore(deps): update dependency qs to v6.15.1 by @thomhurst in thomhurst/TUnit#5458 **Full Changelog**: thomhurst/TUnit@v1.29.0...v1.30.0 Commits viewable in [compare view](thomhurst/TUnit@v1.29.0...v1.30.0). </details> [![Dependabot compatibility score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=TUnit&package-manager=nuget&previous-version=1.29.0&new-version=1.30.0)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores) Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting `@dependabot rebase`. [//]: # (dependabot-automerge-start) [//]: # (dependabot-automerge-end) --- <details> <summary>Dependabot commands and options</summary> <br /> You can trigger Dependabot actions by commenting on this PR: - `@dependabot rebase` will rebase this PR - `@dependabot recreate` will recreate this PR, overwriting any edits that have been made to it - `@dependabot show <dependency name> ignore conditions` will show all of the ignore conditions of the specified dependency - `@dependabot ignore this major version` will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself) - `@dependabot ignore this minor version` will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself) - `@dependabot ignore this dependency` will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself) </details> Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

thomhurst had a problem deploying to Pull Requests April 6, 2026 09:46 — with GitHub Actions Error

claude bot reviewed Apr 6, 2026

View reviewed changes

thomhurst had a problem deploying to Pull Requests April 6, 2026 09:52 — with GitHub Actions Error

thomhurst had a problem deploying to Pull Requests April 6, 2026 10:12 — with GitHub Actions Error

thomhurst temporarily deployed to Pull Requests April 6, 2026 10:12 — with GitHub Actions Inactive

claude bot reviewed Apr 6, 2026

View reviewed changes

thomhurst had a problem deploying to Pull Requests April 6, 2026 10:50 — with GitHub Actions Failure

thomhurst temporarily deployed to Pull Requests April 6, 2026 10:50 — with GitHub Actions Inactive

thomhurst had a problem deploying to Pull Requests April 6, 2026 10:50 — with GitHub Actions Failure

claude bot reviewed Apr 6, 2026

View reviewed changes

thomhurst temporarily deployed to Pull Requests April 6, 2026 11:48 — with GitHub Actions Inactive

claude bot reviewed Apr 6, 2026

View reviewed changes

thomhurst added 6 commits April 7, 2026 19:45

perf: use System.Threading.Lock for CallRecordBuffer per-buffer lock

51fb8b6

Consistent with MockEngine; faster on .NET 9+ and better diagnostics.

thomhurst force-pushed the perf/mock-engine-lock-free-hot-path branch from 9e56739 to 51fb8b6 Compare April 7, 2026 18:59

thomhurst temporarily deployed to Pull Requests April 7, 2026 18:59 — with GitHub Actions Inactive

claude bot reviewed Apr 7, 2026

View reviewed changes

thomhurst merged commit 630182a into main Apr 7, 2026
14 of 15 checks passed

thomhurst deleted the perf/mock-engine-lock-free-hot-path branch April 7, 2026 20:23

claude bot mentioned this pull request Apr 9, 2026

chore(deps): update tunit to 1.30.0 #5469

Merged

1 task

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

perf: eliminate locks from mock invocation and verification hot paths#5422

perf: eliminate locks from mock invocation and verification hot paths#5422
thomhurst merged 6 commits intomainfrom
perf/mock-engine-lock-free-hot-path

thomhurst commented Apr 6, 2026

Uh oh!

codacy-production bot commented Apr 6, 2026 •

edited

Loading

Uh oh!

claude bot left a comment

Uh oh!

claude bot commented Apr 6, 2026

Uh oh!

claude bot left a comment

Uh oh!

claude bot left a comment

Uh oh!

claude bot left a comment

Uh oh!

claude bot left a comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Uh oh!

Conversation

thomhurst commented Apr 6, 2026

Summary

Test plan

Uh oh!

codacy-production bot commented Apr 6, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Not up to standards ⛔

Uh oh!

claude bot left a comment

Choose a reason for hiding this comment

Code Review: perf/mock-engine-lock-free-hot-path

🚨 Critical: Bounds mismatch between calls and counts arrays

🚨 Critical: "Optimistic mark + unmark" corrupts previously-verified records

⚠️ Significant: CAS race in AddBehavior can leave behaviors split across both fields

ℹ️ Minor: lock(this) in CallRecordBuffer

ℹ️ Minor: _callsByMemberId non-volatile field now accessed without lock

What's good

Uh oh!

claude bot commented Apr 6, 2026

Code Review: perf: eliminate locks from mock invocation and verification hot paths

1. GetSnapshot() has a torn read hazard in CallRecordBuffer

2. AddBehavior CAS has a silent data loss window

3. _callsByMemberId is no longer volatile but is read lock-free in several places

4. Optimistic mark-then-unmark in WasCalled has observable side effects

5. GetAllCalls and GetUnverifiedCalls are no longer protected by a lock

Minor / Design suggestions

Summary

Uh oh!

claude bot left a comment

Choose a reason for hiding this comment

Code Review: perf/mock-engine-lock-free-hot-path

Summary

Positive Aspects

Issues Found

1. GetSnapshot read order is inverted — potential torn read on Grow (Medium)

2. CAS fast path in AddBehavior has a double-dispatch window (Low)

3. CallRecordBuffer uses object lock instead of Lock — inconsistent (Minor)

4. Duplicate <summary> XML doc in CallRecordBuffer.cs (Minor)

5. _matchers.Length == 0 guard in CountMatchingBuffer is always false at that call site (Cosmetic)

Summary Table

Recommendation

Uh oh!

claude bot left a comment

Choose a reason for hiding this comment

Follow-up Review: perf/mock-engine-lock-free-hot-path

Issues Addressed Since Prior Reviews

Remaining Issue 1: GetSnapshot() read order is still the Grow-unsafe order (Medium)

Remaining Issue 2: Duplicate <summary> XML doc tag (Minor)

Remaining Issue 3: _matchers.Length == 0 guard in CountMatchingBuffer is dead code (Cosmetic)

What Looks Good

Uh oh!

claude bot left a comment

Choose a reason for hiding this comment

Follow-up Review: perf/mock-engine-lock-free-hot-path (latest commit: 9e56739)

All Previously Raised Issues — Now Resolved

One Remaining Minor Point

What Looks Good

Uh oh!

claude bot left a comment

Choose a reason for hiding this comment

Follow-up Review: perf/mock-engine-lock-free-hot-path (final pass)

All Previously Raised Issues — Now Resolved

Code Correctness Verification

One Very Minor Observation (not a blocker)

Summary

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

codacy-production bot commented Apr 6, 2026 •

edited

Loading

🚨 Critical: Bounds mismatch between `calls` and `counts` arrays

⚠️ Significant: CAS race in `AddBehavior` can leave behaviors split across both fields

ℹ️ Minor: `lock(this)` in `CallRecordBuffer`

ℹ️ Minor: `_callsByMemberId` non-volatile field now accessed without lock

1. `GetSnapshot()` has a torn read hazard in `CallRecordBuffer`

2. `AddBehavior` CAS has a silent data loss window

3. `_callsByMemberId` is no longer `volatile` but is read lock-free in several places

4. Optimistic mark-then-unmark in `WasCalled` has observable side effects

5. `GetAllCalls` and `GetUnverifiedCalls` are no longer protected by a lock

1. `GetSnapshot` read order is inverted — potential torn read on Grow (Medium)

2. CAS fast path in `AddBehavior` has a double-dispatch window (Low)

3. `CallRecordBuffer` uses `object` lock instead of `Lock` — inconsistent (Minor)

4. Duplicate `<summary>` XML doc in `CallRecordBuffer.cs` (Minor)

5. `_matchers.Length == 0` guard in `CountMatchingBuffer` is always false at that call site (Cosmetic)

Remaining Issue 1: `GetSnapshot()` read order is still the Grow-unsafe order (Medium)

Remaining Issue 2: Duplicate `<summary>` XML doc tag (Minor)

Remaining Issue 3: `_matchers.Length == 0` guard in `CountMatchingBuffer` is dead code (Cosmetic)

Follow-up Review: perf/mock-engine-lock-free-hot-path (latest commit: `9e56739`)