fix(tui): prevent post-interrupt hang when ACP suppresses stale Completed #418

Merged
theahura merged 1 commit into main from auto/keen-ash-20260404-073119
Apr 4, 2026

theahura commented Apr 4, 2026

Summary

🤖 Generated with Nori

  • Fix infinite hang after interrupting (Escape) a running task and submitting a new message
  • Root cause: dual suppression conflict between ACP backend's turn_interrupted flag (suppresses stale Completed) and TUI's pending_stale_completes counter (expects to consume it). The counter was never drained, so the next real Completed was consumed as stale.
  • Reset pending_stale_completes = 0 in on_task_started so orphaned counters from backend-suppressed events don't block subsequent turns
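The conflict described above can be replayed with a small model. This is a minimal sketch, not the actual TUI code: `ChatState`, `task_running`, `on_task_complete`, the `reset_stale_counter` switch, and `spinner_stuck_after_interrupt` are hypothetical names introduced for illustration. Only `pending_stale_completes`, `on_task_started`, and `on_interrupted_turn` come from the PR text.

```rust
/// Hypothetical, stripped-down stand-in for the TUI state (assumption).
#[derive(Default)]
struct ChatState {
    /// Number of stale Completed events the TUI still expects to swallow.
    pending_stale_completes: u32,
    /// True while the spinner should be shown.
    task_running: bool,
}

impl ChatState {
    /// `reset_stale_counter` toggles the fix so both behaviors can be compared.
    fn on_task_started(&mut self, reset_stale_counter: bool) {
        if reset_stale_counter {
            // The fix: drop counters orphaned by backend-suppressed events.
            self.pending_stale_completes = 0;
        }
        self.task_running = true;
    }

    /// The TUI assumes a stale Completed will follow the interrupt.
    fn on_interrupted_turn(&mut self) {
        self.pending_stale_completes += 1;
        self.task_running = false;
    }

    /// Returns true if the event actually ended the running task.
    fn on_task_complete(&mut self) -> bool {
        if self.pending_stale_completes > 0 {
            // Treated as stale and swallowed.
            self.pending_stale_completes -= 1;
            return false;
        }
        self.task_running = false;
        true
    }
}

/// Replays: start, interrupt, new message, real Completed.
/// Returns whether the spinner is still running afterwards.
fn spinner_stuck_after_interrupt(reset_stale_counter: bool) -> bool {
    let mut tui = ChatState::default();
    tui.on_task_started(reset_stale_counter);
    tui.on_interrupted_turn();
    // The backend's turn_interrupted flag suppresses the stale Completed here,
    // so the TUI never receives the event it is counting on.
    tui.on_task_started(reset_stale_counter);
    tui.on_task_complete();
    tui.task_running
}
```

In this model, `spinner_stuck_after_interrupt(false)` leaves the spinner running because the real Completed is swallowed as stale, while `spinner_stuck_after_interrupt(true)` lets the turn finish.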

Test Plan

  • Reproduced bug in tmux with custom slow ACP agent (spinner hung at 1m 12s+)
  • Verified fix in tmux: interrupt → new message → response completes normally
  • Unit tests updated (part7.rs) to reflect ACP suppression behavior
  • All 1192 TUI tests pass
  • All 409 ACP tests pass
  • All 6 E2E tests pass
  • just fix and just fmt clean

Share Nori with your team: https://www.npmjs.com/package/nori-skillsets

…nterrupt hang

When a user interrupts (Escape) and then submits a new message, the TUI
hangs forever with the spinner running indefinitely.

The ACP backend suppresses stale Completed events via the turn_interrupted
flag, but the TUI's on_interrupted_turn increments pending_stale_completes
expecting to consume that event. Since the backend suppresses it, the
counter is never drained, and the next real Completed is consumed as stale.

Reset the counter to 0 in on_task_started so orphaned counters from
backend-suppressed events don't block subsequent turns.
🤖 Generated with [Nori](https://noriagentic.com)

Co-Authored-By: Nori <contact@tilework.tech>
theahura merged commit bcc5e3a into main Apr 4, 2026
3 checks passed
theahura deleted the auto/keen-ash-20260404-073119 branch April 4, 2026 14:43
theahura added a commit that referenced this pull request Apr 6, 2026
The previous approach used a shared AtomicBool flag to suppress stale
TurnLifecycle::Completed events from cancelled tasks. This had an
unavoidable TOCTOU race: when a user interrupted and quickly sent a new
message, the flag was reset for the new turn before the old task could
check it, allowing stale events to interfere with subsequent turns.
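
The interleaving behind this race can be written out as a single-threaded replay. This is an illustrative sketch only: the function name and the exact store/load sequence are assumptions, and the real backend performs these steps from separate tasks.

```rust
use std::sync::atomic::{AtomicBool, Ordering};

/// Hypothetical replay of the problematic ordering. Returns true if the
/// cancelled task would emit its stale Completed event.
fn stale_completed_leaks() -> bool {
    let turn_interrupted = AtomicBool::new(false);

    // 1. The user presses Escape: the running turn is marked interrupted.
    turn_interrupted.store(true, Ordering::SeqCst);

    // 2. The user quickly submits a new message: the new turn resets the flag.
    turn_interrupted.store(false, Ordering::SeqCst);

    // 3. Only now does the old task reach its tail and check the flag.
    let suppressed = turn_interrupted.load(Ordering::SeqCst);

    // The flag reads false, so the stale event is not suppressed.
    !suppressed
}
```

Because a boolean carries no notion of which turn set it, the old task cannot tell that the `false` it reads belongs to a later turn.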

Two prior attempts (PRs #416, #418) tried to fix this with variations
of the same boolean+counter architecture. Each fix introduced a new bug:
#416 caused hangs, #418 brought back hidden responses.

This commit replaces the architecture entirely:
- AtomicBool → AtomicU64 monotonic turn counter in ACP backend
- Each spawned task captures its own turn ID at spawn time
- ALL tail events (ErrorEvent + Completed + idle timer) are guarded by
  a single turn_id match check
- TUI's pending_stale_completes counter is removed (no longer needed)

The monotonic counter eliminates the race because it only increments—
there is no "reset" that a stale task can observe.
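
A minimal sketch of the turn-counter guard, under stated assumptions: `TurnCounter`, `begin_turn`, `is_current`, `emit_tail_events`, and the string event representation are hypothetical and not taken from the ACP backend. The sketch also assumes the counter advances when a new turn begins; the commit message does not say whether an interrupt alone advances it.

```rust
use std::sync::atomic::{AtomicU64, Ordering};

/// Hypothetical shared counter; the value is the ID of the current turn.
struct TurnCounter(AtomicU64);

impl TurnCounter {
    fn new() -> Self {
        TurnCounter(AtomicU64::new(0))
    }

    /// Starts a new turn and returns the ID the spawned task should capture.
    fn begin_turn(&self) -> u64 {
        // fetch_add returns the previous value, so add 1 to get the new ID.
        self.0.fetch_add(1, Ordering::SeqCst) + 1
    }

    /// True only while `turn_id` is still the most recent turn.
    fn is_current(&self, turn_id: u64) -> bool {
        self.0.load(Ordering::SeqCst) == turn_id
    }
}

/// Tail of a spawned task: every tail event sits behind one ID check.
fn emit_tail_events(counter: &TurnCounter, my_turn: u64, out: &mut Vec<String>) {
    if !counter.is_current(my_turn) {
        // A newer turn exists, so this task's tail events are dropped.
        return;
    }
    out.push(format!("Completed(turn {})", my_turn));
}

/// Replays: turn 1 starts, is interrupted, turn 2 starts, and then the old
/// task reaches its tail before the new one does.
fn replay_interrupt_then_new_turn() -> Vec<String> {
    let counter = TurnCounter::new();
    let mut events = Vec::new();
    let old_turn = counter.begin_turn();
    let new_turn = counter.begin_turn();
    emit_tail_events(&counter, old_turn, &mut events);
    emit_tail_events(&counter, new_turn, &mut events);
    events
}
```

In this replay only the second turn's Completed is emitted. The old task compares against an ID that the counter has already moved past, and nothing can move the counter back to it.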
🤖 Generated with [Nori](https://noriagentic.com)

Co-Authored-By: Nori <contact@tilework.tech>