
fix: prevent prompt_async race condition on idle sessions#21528

Open
aadilshaikh123 wants to merge 1 commit into anomalyco:dev from aadilshaikh123:bugfix/prompt-async-idle-wakeup

Conversation

@aadilshaikh123

Issue for this PR

Closes #21211

Type of change

  • Bug fix
  • New feature
  • Refactor / code improvement
  • Documentation

What does this PR do?

Fix a race condition in prompt_async where messages sent to idle sessions are created but not reliably acted upon.

Problem: When prompt_async with noReply: false is called on an idle session, the message is stored in history but the session doesn't wake up to respond. This is critical for async communication systems (like relay-mesh multi-agent).

Root cause: The runner state machine can transition to Idle and delete itself from the runners Map while, concurrently, a new prompt_async message arrives and a new runner is created to process it. The new runner's loop exits before it detects the newly added message.

Solution: Before exiting the session loop, check if new user messages exist beyond the last assistant message. If found, continue the loop to process them. This catches straggler messages that arrived during the transition window.
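A minimal sketch of that exit guard, with hypothetical names (`Message`, `hasUnansweredUserMessage`, and `sessionLoop` are illustrative, not opencode's actual API):

```typescript
interface Message {
  role: "user" | "assistant"
  content: string
}

// Returns true when a user message exists after the last assistant reply,
// i.e. a straggler message arrived during the idle-transition window.
function hasUnansweredUserMessage(messages: Message[]): boolean {
  const lastAssistant = messages.map((m) => m.role).lastIndexOf("assistant")
  return messages.slice(lastAssistant + 1).some((m) => m.role === "user")
}

async function sessionLoop(
  getMessages: () => Message[],
  step: (messages: Message[]) => Promise<void>,
): Promise<void> {
  while (true) {
    await step(getMessages())
    // Before exiting, re-check history: a prompt_async message may have been
    // stored while this iteration was running. If so, loop again instead of
    // exiting and leaving it unanswered.
    if (!hasUnansweredUserMessage(getMessages())) break
  }
}
```

Because the re-check only continues the loop when an unanswered user message actually exists, a normal exit (last message is the assistant's reply) still terminates and cannot spin forever.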

How did you verify your code works?

Traced the full execution path:

  • Analyzed runner state machine transitions between Idle/Running states
  • Identified the race condition window in onIdle cleanup vs getRunner() creation
  • Verified the fix detects messages created during loop transitions
  • Confirmed the fix handles multiple concurrent prompt_async calls correctly
  • Checked that normal loop exits still work (no infinite loops)

Checklist

  • I have tested my changes locally
  • I have not included unrelated changes in this PR

Add a check before exiting the session loop to detect if new messages arrived
while the loop was running. This handles a race condition where prompt_async
messages on idle sessions were created but not reliably acted upon.

The issue occurs when:
1. Session completes work and transitions to idle (runner cleanup start)
2. Concurrently, prompt_async with noReply: false creates a message and calls loop()
3. A new runner is created to process the message
4. But the loop exits before detecting the newly created message

The fix checks if new user messages exist beyond the last assistant message
before exiting the loop. If found, continues the loop to process them.
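The cleanup-versus-creation race on the runners Map can be modeled as follows; `runners`, `Runner`, `getRunner`, and the guarded delete in `onIdle` are hypothetical names for illustration, not the actual implementation:

```typescript
class Runner {
  constructor(public sessionID: string) {}
}

const runners = new Map<string, Runner>()

// Lazily create a runner for a session (step 3 in the sequence above).
function getRunner(sessionID: string): Runner {
  let runner = runners.get(sessionID)
  if (!runner) {
    runner = new Runner(sessionID)
    runners.set(sessionID, runner)
  }
  return runner
}

// Idle cleanup (step 1). The identity guard matters: a stale runner that
// went idle must not delete a fresh runner that was created concurrently,
// or the new message's runner would vanish from the Map.
function onIdle(sessionID: string, runner: Runner): void {
  if (runners.get(sessionID) === runner) {
    runners.delete(sessionID)
  }
}
```

Even with such a guard, the message created in step 2 can still land between the old loop's last history read and its exit, which is why the fix re-checks history at the loop boundary rather than relying on Map bookkeeping alone.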

This ensures messages from prompt_async reliably trigger assistant responses
even when the session is idle, which is critical for async communication
patterns like relay-mesh multi-agent systems.

Related issues:
- Background agents ignore initial prompt - stuck until manually messaged
- Race condition between session cancel and Todo Continuation / Question dismiss
- TUI doesn't render messages from prompt_async endpoint
- /session/status not reporting properly after prompt_async
fairyhunter13 added a commit to fairyhunter13/opencode that referenced this pull request Apr 8, 2026
- Move mindContext from dynamicSystem to stableSystem so it is cached at BP1
  (saves 500-2000+ tokens/turn on sessions with SessionMind context)
- Split failureContext into stableFailures (prior turns, cached at BP1) and
  dynamicFailures (current turn only) to avoid re-sending stable failure history
- Use signature-based dedup (`tool:error_prefix`) so the formatted stable block
  never changes between turns, preventing cache invalidation on accumulation
- Add resetFailureSnapshot and resetEnvDynamicSent export functions for cleanup
- Preserve stableSystemCount accuracy after adding stableFailures to stableSystem
- Fix prompt_async idle race condition: before breaking the loop, check if a new
  user message arrived while we were running (PR anomalyco#21528)
fairyhunter13 added a commit to fairyhunter13/opencode that referenced this pull request Apr 8, 2026
Core cache optimizations:
- Move mindContext from dynamicSystem to stableSystem (500-2000+ tokens/turn
  cached at BP1 for sessions with SessionMind context)
- Split failureContext into stableFailures (prior turns, BP1 cached) and
  dynamicFailures (current turn only) using signature-based dedup
- Add markLargeToolResults() pre-pass: cache_control on tool-result content
  parts >7000 chars (~2000 tokens), Anthropic direct + OpenRouter Claude
- Fix stale parts reference bug in markLargeToolResults for multi-tool messages
- Add compressImages() async pre-pass via sharp (PR anomalyco#21371): 3-phase
  quality->dimension->fallback compression prevents 5MB API limit errors
- Session snapshot resets (resetFailureSnapshot/resetEnvDynamicSent) in cleanup
- prompt_async idle race condition fix: check new messages before loop break

Upstream PR cherry-picks:
- PR anomalyco#21535: deterministic queued message wrapping eliminates per-turn cache miss
- PR anomalyco#21492: tool evidence digest (evidence.ts) preserves context through compaction
- PR anomalyco#21507: session processor single-flight summary dedup improvements
- PR anomalyco#21528: prompt_async idle wakeup race condition fix
- PR anomalyco#21500: Levenshtein O(min(N,M)) space with Int32Array two-row algorithm

New tools (PR anomalyco#21399):
- ContextUsageTool (check_context_usage): real-time token/cache usage reporting
- NewSessionTool (new_session): TUI-only, abort + create new session
- TuiEvent.SessionNew bus event and app.tsx handler
- SDK types.gen.ts/sdk.gen.ts EventTuiSessionNew type

Test infrastructure:
- E2E cache tests (OPENCODE_E2E=1) verified 100% cache hit rate on T2+
- Unit tests for large-tool cache breakpoints (4 scenarios)
- Fix pre-existing lsp-deps.test.ts assertion bug (LspTool in make() not all())
- Add await to all ProviderTransform.message() call sites (now async)
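The cherry-picked PR anomalyco#21500 mentions a two-row `Int32Array` Levenshtein. As a point of reference, a generic sketch of that textbook O(min(N, M))-space technique (not opencode's actual code):

```typescript
// Two-row Levenshtein distance: keeps only the previous and current DP
// rows instead of the full N x M matrix.
function levenshtein(a: string, b: string): number {
  // Size the rows by the shorter string for O(min(N, M)) space.
  if (a.length < b.length) [a, b] = [b, a]
  let prev = new Int32Array(b.length + 1)
  let curr = new Int32Array(b.length + 1)
  for (let j = 0; j <= b.length; j++) prev[j] = j
  for (let i = 1; i <= a.length; i++) {
    curr[0] = i
    for (let j = 1; j <= b.length; j++) {
      const cost = a[i - 1] === b[j - 1] ? 0 : 1
      curr[j] = Math.min(
        prev[j] + 1, // deletion
        curr[j - 1] + 1, // insertion
        prev[j - 1] + cost, // substitution
      )
    }
    ;[prev, curr] = [curr, prev]
  }
  return prev[b.length]
}
```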
fairyhunter13 added a commit to fairyhunter13/opencode that referenced this pull request Apr 9, 2026


Development

Successfully merging this pull request may close these issues.

Background agents ignore initial prompt - stuck until manually messaged
