Fix parallel session crash: coalesce state notifications + dispose Process handles#337
Conversation
…ocess handles Under heavy parallel session load (5+ sessions with tool calls), the app freezes due to two compounding issues: 1. Render flood: Each session fires OnStateChanged on every SDK event (tool start, phase change, turn start). With N parallel sessions, the WebView WebContent process gets starved by rapid Blazor re-renders. Fix: Add NotifyStateChangedCoalesced() with a 150ms timer that batches rapid-fire state notifications. Applied to high-frequency, non-critical events (ToolExecutionStart phase change, AssistantTurnStart phase change, ToolExecutionComplete permission check, WsBridge state sync). Critical events (completion, errors, session switches) remain immediate. 2. Process handle leaks: ServerManager never disposed Process objects from StartServerAsync/StopServer. OS handles accumulated across reconnect cycles, contributing to 'No process is associated' exceptions. Fix: Dispose Process after stdout/stderr streams close in StartServerAsync. Add process.Dispose() in StopServer and DevTunnelService.Stop/TryHost. Fixes PureWeen#336 Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
btessiau
left a comment
There was a problem hiding this comment.
🔍 PR #337 Review — Fix parallel session crash: coalesce state notifications + dispose Process handles
CI Status:
Prior reviews: None
Files changed: 5 (CopilotService.cs, CopilotService.Events.cs, CopilotService.Bridge.cs, ServerManager.cs, DevTunnelService.cs)
Findings
🟡 MODERATE — ServerManager.cs:100-104 — Sequential stdout/stderr draining can deadlock
The original code used two separate Task.Run calls draining stdout and stderr in parallel:
// BEFORE — parallel (correct)
_ = Task.Run(async () => { try { while (await process.StandardOutput.ReadLineAsync() != null) { } } catch { } });
_ = Task.Run(async () => { try { while (await process.StandardError.ReadLineAsync() != null) { } } catch { } });The new code drains them sequentially in a single task:
// AFTER — sequential (regression)
_ = Task.Run(async () =>
{
try { while (await process.StandardOutput.ReadLineAsync() != null) { } } catch { }
try { while (await process.StandardError.ReadLineAsync() != null) { } } catch { }
process.Dispose();
});For a long-running server process, ReadLineAsync() on stdout blocks until the process exits (stdout never closes while the server is alive). This means the stderr drain loop never starts while the server is running. If the server writes more than the OS pipe buffer (~64KB) to stderr — common under error conditions or verbose logging — the process blocks waiting for the pipe to drain, the drain task is stuck on stdout, and you get a deadlock.
Fix: Keep two separate tasks, use Task.WhenAll + .ContinueWith to dispose after both complete:
var t1 = Task.Run(async () => { try { while (await process.StandardOutput.ReadLineAsync() != null) { } } catch { } });
var t2 = Task.Run(async () => { try { while (await process.StandardError.ReadLineAsync() != null) { } } catch { } });
_ = Task.WhenAll(t1, t2).ContinueWith(_ => process.Dispose());What looks correct ✅
Coalescing logic (CopilotService.cs:344-361): The _stateChangedPending / Interlocked.CompareExchange pattern is correct. Only the first caller in a burst arms the timer; subsequent callers see _stateChangedPending=1 and do nothing. FireCoalescedStateChanged resets the flag before calling InvokeOnUI, so events arriving during dispatch get a new timer cycle. ObjectDisposedException is properly caught. Timer disposal is in DisposeAsync alongside the other debounce timers — consistent with existing pattern.
Coalesced call sites: The three Events.cs sites (tool start phase change, turn start phase change, permission denial) and the bridge sync are all genuinely high-frequency, non-critical updates. Critical completions, errors, session switches, and watchdog notifications correctly remain on the immediate OnStateChanged?.Invoke() path.
DevTunnelService.cs process disposal: Disposing before nulling in TryHostTunnelAsync is correct. The case where the process has already exited (previously not cleaned up) is now also handled. In Stop(), Dispose() is inside the try block with _hostProcess = null outside — if Dispose throws it's caught and the null assignment still runs.
ServerManager.cs StopServer(): Disposing the process handle after Kill() at line 142 is correct — Process.GetProcessById creates a new handle object that should be disposed.
Test coverage
The coalescer has no dedicated unit test. A test asserting that N rapid calls to NotifyStateChangedCoalesced() result in at most 1 OnStateChanged invocation within 150ms would be valuable (similar to RenderThrottleTests.cs). Not a blocker, but worth adding.
Recommendation: ⚠️ Request changes
One targeted fix needed: restore parallel stream draining in ServerManager.StartServerAsync so stderr is drained concurrently with stdout. The dispose can then be chained with Task.WhenAll. Everything else in this PR is correct and well-structured.
Restore parallel stdout/stderr draining in ServerManager.StartServerAsync. Sequential draining deadlocks if stderr fills its OS pipe buffer (~64KB) while the stdout drain blocks waiting for the long-running server to exit. Use Task.WhenAll + ContinueWith to dispose after both streams close. Add StateChangeCoalescerTests (4 tests): - Rapid calls coalesce into single notification - Single call fires exactly once - Separate bursts fire separately - Immediate NotifyStateChanged still works synchronously Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
btessiau
left a comment
There was a problem hiding this comment.
Addressed in 1a5f808:
🟡 Sequential stream drain deadlock → Fixed
Restored parallel stdout/stderr draining with Task.WhenAll + ContinueWith for disposal:
var t1 = Task.Run(async () => { try { while (await process.StandardOutput.ReadLineAsync() != null) { } } catch { } });
var t2 = Task.Run(async () => { try { while (await process.StandardError.ReadLineAsync() != null) { } } catch { } });
_ = Task.WhenAll(t1, t2).ContinueWith(_ => process.Dispose());Coalescer tests → Added
4 tests in StateChangeCoalescerTests.cs:
- 20 rapid calls coalesce into 1-3 notifications (not 20)
- Single call fires exactly once
- Separate bursts fire separately
- Immediate
NotifyStateChanged()still works synchronously
Build: 0 errors. Tests: all pass.
btessiau
left a comment
There was a problem hiding this comment.
🔄 PR #337 Re-Review (Round 2) — Fix parallel session crash: coalesce state notifications + dispose Process handles
CI Status:
Previous reviews: 2 (Round 1 flagged sequential drain deadlock; author self-commented fix in 1a5f808)
Previous Findings Status
| # | Finding | Status |
|---|---|---|
| 1 | 🟡 MODERATE — ServerManager.cs:100-104 sequential stdout/stderr drain causes deadlock |
✅ FIXED |
Finding #1 — FIXED
ServerManager.cs:102-104 now uses exactly the recommended pattern:
var t1 = Task.Run(async () => { try { while (await process.StandardOutput.ReadLineAsync() != null) { } } catch { } });
var t2 = Task.Run(async () => { try { while (await process.StandardError.ReadLineAsync() != null) { } } catch { } });
_ = Task.WhenAll(t1, t2).ContinueWith(_ => process.Dispose());Both streams drain in parallel; Dispose() is chained only after both complete. Deadlock path eliminated.
New Analysis
Coalescer implementation — ✅ Correct
CopilotService.cs:344-361 — The Interlocked.CompareExchange/Exchange pattern is correct:
- Only the first caller in a burst arms the timer (CAS 0→1); subsequent calls no-op
FireCoalescedStateChangedresets_stateChangedPendingto 0 before callingInvokeOnUI, so events arriving during dispatch get a fresh timer cycle — no notifications lostObjectDisposedExceptiononTimer.Change()is properly caughtDisposeAsync(line ~3406) disposes and nulls the timer, consistent with the existing debounce timer pattern
Call sites — ✅ Correct thread safety
Events.cs:311—NotifyStateChangedCoalesced()called directly on the SDK event thread (outsideInvoke()). Safe:CompareExchangeandTimer.Change()are both thread-safe, andFireCoalescedStateChangedusesInvokeOnUIfor the final dispatchEvents.cs:422, 452— Called insideInvoke()blocks (UI thread). SafeBridge.cs:52— Called from bridge background thread. Safe for same reason as above
DevTunnelService.cs — ✅ Correct
TryHostTunnelAsync(~line 282-285):Dispose()now happens after the if-block, covering both the HasExited and !HasExited paths. Previously a process that had already exited was never disposed._hostProcess = nullfollows immediatelyStop()(~line 462-468):Dispose()is inside the try block;_hostProcess = nullis outside at line 468, so it runs regardless of whetherDispose()throws. Correct
ServerManager.cs StopServer() — ✅ Correct
process.Dispose() after process.Kill() at line 141 is correct — Process.GetProcessById allocates a new handle object that should be freed.
Test Coverage
StateChangeCoalescerTests.cs adds 4 tests covering:
- 20 rapid calls coalesce into 1–3 notifications (not 20) —
Assert.InRange(fireCount, 1, 3)is appropriately loose for timer-based timing - Single call fires exactly once
- Separate bursts fire separately (~1 notification each)
NotifyStateChanged()(immediate path) still works synchronously
Tests are structurally sound. One note: test instances don't call DisposeAsync(), leaving the coalesce timer pending — harmless in practice but inconsistent with test hygiene. Not a blocker.
No tests for the process disposal changes (DevTunnelService, ServerManager), but these are straightforward resource cleanup and the fix is clearly correct.
Verdict: ✅ Approve
Round 1 critical finding is fixed correctly. All new code paths are logically sound:
- Coalescer correctly reduces render flood for high-frequency events while preserving immediate dispatch for critical events
- Process handle leaks are plugged in all affected code paths
- Thread safety of the coalescer is correct
No blocking issues remain.
…ocess handles (PureWeen#337) ## Problem When running many CLI sessions in parallel (5+), the app progressively freezes and crashes. ### Root causes 1. **Render flood** — Each session fires OnStateChanged on every SDK event. With N parallel sessions, the WebView WebContent process gets starved by rapid Blazor re-renders (241 updateFreezerStatus errors in 30min). 2. **Process handle leaks** — ServerManager never disposed Process objects. OS handles accumulated across reconnect cycles (468 unobserved exceptions in crash log). ## Fix ### Coalesced state notifications Added NotifyStateChangedCoalesced() — a 150ms timer-based coalescer for high-frequency, non-critical state updates. Applied to: - ToolExecutionStartEvent phase change (Working) - AssistantTurnStartEvent phase change (Thinking) - ToolExecutionCompleteEvent permission denial notification - WsBridgeClient.OnStateChanged remote session sync Critical events remain immediate: session completion, errors, session switches, watchdog. ### Process handle disposal - ServerManager.StartServerAsync: dispose Process after stdout/stderr streams close - ServerManager.StopServer: dispose after stopping - DevTunnelService.Stop/TryHostTunnelAsync: dispose before nulling ## Testing Build: 0 errors. Tests: 2378 passed (4 pre-existing failures). Fixes PureWeen#336 --------- Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Problem
When running many CLI sessions in parallel (5+), the app progressively freezes and crashes.
Root causes
Render flood — Each session fires OnStateChanged on every SDK event. With N parallel sessions, the WebView WebContent process gets starved by rapid Blazor re-renders (241 updateFreezerStatus errors in 30min).
Process handle leaks — ServerManager never disposed Process objects. OS handles accumulated across reconnect cycles (468 unobserved exceptions in crash log).
Fix
Coalesced state notifications
Added NotifyStateChangedCoalesced() — a 150ms timer-based coalescer for high-frequency, non-critical state updates. Applied to:
Critical events remain immediate: session completion, errors, session switches, watchdog.
Process handle disposal
Testing
Build: 0 errors. Tests: 2378 passed (4 pre-existing failures).
Fixes #336