Skip to content

fix(acp,tui): prevent stale Completed from hiding post-interrupt responses#416

Merged
theahura merged 3 commits intomainfrom
auto/fix-post-interrupt-hidden-response-bug-20260403-050410
Apr 3, 2026
Merged

fix(acp,tui): prevent stale Completed from hiding post-interrupt responses#416
theahura merged 3 commits intomainfrom
auto/fix-post-interrupt-hidden-response-bug-20260403-050410

Conversation

@theahura
Copy link
Copy Markdown
Contributor

@theahura theahura commented Apr 3, 2026

Summary

🤖 Generated with Nori

  • Fix race condition where a stale TurnLifecycle::Completed from a cancelled background task prematurely terminates the next turn, causing agent responses to be silently hidden after ESC interrupt
  • Add turn_interrupted AtomicBool flag in ACP backend to suppress stale Completed emission
  • Add pending_stale_completes counter in TUI ChatWidget as defense-in-depth
  • Apply same fix to handle_compact which had the identical race

Test Plan

  • Unit tests proving the race condition (stale Completed blocks tool events for next turn)
  • Multi-interrupt test verifying counter drains correctly
  • Full TUI test suite passes (1181 tests)
  • ACP test suite passes (439 tests)
  • E2E verification via tmux: interrupt → follow-up message → response appears
  • Clippy and fmt clean

Share Nori with your team: https://www.npmjs.com/package/nori-skillsets

theahura and others added 3 commits April 3, 2026 11:54
…-interrupt responses

When a user presses ESC to interrupt a running task, the ACP backend
emits TurnLifecycle::Aborted immediately. However, the background tokio
task that was handling the prompt continues and unconditionally emits
TurnLifecycle::Completed. If the user submits a new message before this
stale Completed arrives, it races with the new turn's Started event,
setting turn_finished=true and causing tool events to be silently
discarded.

Fix with defense in depth at two levels:

Backend (ACP): Add turn_interrupted AtomicBool flag. Set on
Op::Interrupt, reset on new handle_user_input. The spawned task checks
the flag before emitting Completed and skips it if the turn was
interrupted.

TUI (ChatWidget): Add pending_stale_completes counter. Each
on_interrupted_turn increments it; on_task_complete decrements and skips
processing while the counter is positive.
🤖 Generated with [Nori](https://noriagentic.com)

Co-Authored-By: Nori <contact@tilework.tech>
@theahura theahura merged commit d9e183e into main Apr 3, 2026
3 checks passed
@theahura theahura deleted the auto/fix-post-interrupt-hidden-response-bug-20260403-050410 branch April 3, 2026 16:36
theahura added a commit that referenced this pull request Apr 6, 2026
The previous approach used a shared AtomicBool flag to suppress stale
TurnLifecycle::Completed events from cancelled tasks. This had an
unavoidable TOCTOU race: when a user interrupted and quickly sent a new
message, the flag was reset for the new turn before the old task could
check it, allowing stale events to interfere with subsequent turns.

Two prior attempts (PRs #416, #418) tried to fix this with variations
of the same boolean+counter architecture. Each fix introduced a new bug:
#416 caused hangs, #418 brought back hidden responses.

This commit replaces the architecture entirely:
- AtomicBool → AtomicU64 monotonic turn counter in ACP backend
- Each spawned task captures its own turn ID at spawn time
- ALL tail events (ErrorEvent + Completed + idle timer) are guarded by
  a single turn_id match check
- TUI's pending_stale_completes counter is removed (no longer needed)

The monotonic counter eliminates the race because it only increments—
there is no "reset" that a stale task can observe.
🤖 Generated with [Nori](https://noriagentic.com)

Co-Authored-By: Nori <contact@tilework.tech>
theahura added a commit that referenced this pull request Apr 6, 2026
The previous approach used a shared AtomicBool flag to suppress stale
TurnLifecycle::Completed events from cancelled tasks. This had an
unavoidable TOCTOU race: when a user interrupted and quickly sent a new
message, the flag was reset for the new turn before the old task could
check it, allowing stale events to interfere with subsequent turns.

Two prior attempts (PRs #416, #418) tried to fix this with variations
of the same boolean+counter architecture. Each fix introduced a new bug:
#416 caused hangs, #418 brought back hidden responses.

This commit replaces the architecture entirely:
- AtomicBool → AtomicU64 monotonic turn counter in ACP backend
- Each spawned task captures its own turn ID at spawn time
- ALL tail events (ErrorEvent + Completed + idle timer) are guarded by
  a single turn_id match check
- TUI's pending_stale_completes counter is removed (no longer needed)

The monotonic counter eliminates the race because it only increments—
there is no "reset" that a stale task can observe.
🤖 Generated with [Nori](https://noriagentic.com)

Co-Authored-By: Nori <contact@tilework.tech>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant