fix(opencode): preserve tool context through compaction and prompt loops#21492
Open
GuestAUser wants to merge 7 commits into anomalyco:dev
Conversation
Memoize stable system prompt fragments across multi-step loop iterations so tool-call continuations stop rebuilding the same environment and skill text, while still reloading instruction files each step for correctness.
Contributor
Thanks for updating your PR! It now meets our contributing guidelines. 👍
fairyhunter13 added a commit to fairyhunter13/opencode that referenced this pull request on Apr 8, 2026:

Core cache optimizations:
- Move mindContext from dynamicSystem to stableSystem (500-2000+ tokens/turn cached at BP1 for sessions with SessionMind context)
- Split failureContext into stableFailures (prior turns, BP1 cached) and dynamicFailures (current turn only) using signature-based dedup
- Add markLargeToolResults() pre-pass: cache_control on tool-result content parts >7000 chars (~2000 tokens), Anthropic direct + OpenRouter Claude
- Fix stale parts reference bug in markLargeToolResults for multi-tool messages
- Add compressImages() async pre-pass via sharp (PR anomalyco#21371): 3-phase quality->dimension->fallback compression prevents 5MB API limit errors
- Session snapshot resets (resetFailureSnapshot/resetEnvDynamicSent) in cleanup
- prompt_async idle race condition fix: check new messages before loop break

Upstream PR cherry-picks:
- PR anomalyco#21535: deterministic queued message wrapping eliminates per-turn cache miss
- PR anomalyco#21492: tool evidence digest (evidence.ts) preserves context through compaction
- PR anomalyco#21507: session processor single-flight summary dedup improvements
- PR anomalyco#21528: prompt_async idle wakeup race condition fix
- PR anomalyco#21500: Levenshtein O(min(N,M)) space with Int32Array two-row algorithm

New tools (PR anomalyco#21399):
- ContextUsageTool (check_context_usage): real-time token/cache usage reporting
- NewSessionTool (new_session): TUI-only, abort + create new session
- TuiEvent.SessionNew bus event and app.tsx handler
- SDK types.gen.ts/sdk.gen.ts EventTuiSessionNew type

Test infrastructure:
- E2E cache tests (OPENCODE_E2E=1) verified 100% cache hit rate on T2+
- Unit tests for large-tool cache breakpoints (4 scenarios)
- Fix pre-existing lsp-deps.test.ts assertion bug (LspTool in make() not all())
- Add await to all ProviderTransform.message() call sites (now async)
fairyhunter13 added a commit to fairyhunter13/opencode that referenced this pull request on Apr 8, 2026:

…rchestrator, multi-credential, codebase indexer

Core Features:
- Session Mind with persistent memory across sessions
- Orchestrator + Worker subagent architecture
- Multi-credential OAuth with auto-refresh
- Codebase indexer and watcher connectors
- Footer status bar with live metrics

Cache & Prompt Optimizations:
- Move mindContext/failureContext to stable system prefix (BP1 cached)
- Large tool result cache_control breakpoints (>7000 chars)
- Deterministic message wrapping (PR anomalyco#21535)
- Tool evidence digest through compaction (PR anomalyco#21492)
- O(1) queue dequeue + single-flight summary (PR anomalyco#21507)
- Levenshtein O(min(N,M)) space optimization (PR anomalyco#21500)
- Three-phase image auto-compression (PR anomalyco#21371)
- ContextUsage and NewSession tools (PR anomalyco#21399)
- E2E cache integration tests with real Anthropic OAuth

Session snapshot resets prevent memory leaks on session delete.
Issue for this PR
Closes #20246
Type of change
What does this PR do?
This PR fixes a cluster of session-context problems in `packages/opencode/src/session` that showed up in multi-step conversations, compaction follow-ups, and OpenAI Responses tool continuations. At a high level, the branch now does seven things:

- emits a structured evidence digest for compacted tool calls instead of the old `[Old tool result content cleared]` placeholder, so continuation loops keep a durable, bounded summary of prior tool activity
- threads `previous_response_id` when `store: true` is explicitly enabled, so the continuation path can send the tool-result delta instead of replaying the full prior prompt history
- memoizes `SystemPrompt.skills(agent)` and `SystemPrompt.environment(model)` while still reloading `instruction.system()` every iteration so instruction edits remain visible on the very next loop step
- narrows persisted `step-finish` metadata to the single safe field this branch actually consumes, `openai.responseId`, instead of persisting the full provider metadata object
- falls back safely when a compacted tool part has no stored evidence metadata (legacy compacted rows)
- returns detached plain copies from `sanitizeProject()` for persisted project cache writes
- clones project arrays before `globalSync.set("project", ...)` writes them back into the global store, so branch code no longer re-inserts proxy-backed project objects during app-side project updates

The result is a branch that is both more context-preserving and more operationally stable.
Problem breakdown
1. Compacted tool calls lost too much usable context
Before this change, once older completed tool calls were compacted, later continuation loops no longer had a durable, bounded summary of what the tool had actually done. That made follow-up reasoning weaker because the model only saw that old content had been cleared.
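A minimal sketch of the digest idea, assuming a hypothetical `ToolEvidence` shape (the real `evidence.ts` may record different fields); the output format mirrors the `[Compacted tool result]` text captured in the verification section:

```typescript
// Hypothetical evidence shape; the real evidence.ts types may differ.
interface ToolEvidence {
  tool: string   // tool id, e.g. "bash"
  title: string  // display title, e.g. "Bash"
  input: unknown // original tool input, kept small and bounded
}

// Render a durable, bounded digest for a compacted tool call so later
// continuation loops still see what the tool did, instead of only a
// "content cleared" placeholder.
function renderEvidenceDigest(evidence: ToolEvidence, maxInputChars = 200): string {
  const input = JSON.stringify(evidence.input).slice(0, maxInputChars)
  return [
    "[Compacted tool result]",
    `tool: ${evidence.tool}`,
    `title: ${evidence.title}`,
    `input: ${input}`,
  ].join("\n")
}
```

The digest stays bounded regardless of how large the original tool output was, which is what makes it safe to keep across compactions.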
2. OpenAI Responses follow-ups replayed too much history
When OpenAI Responses tool execution produced a follow-up iteration, the session loop could still replay the full message history even when the provider already had a stored response chain. That is both slower and less precise than threading through `previous_response_id` with only the tool output delta.

3. Prompt-loop system fragments were recomputed unnecessarily
The loop was rebuilding stable prompt fragments each iteration even though only the dynamic instruction file contents actually needed to be reloaded each turn.
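The intended split can be sketched as a small cache keyed per agent and per model. The loader names echo the real `SystemPrompt.skills` / `SystemPrompt.environment` / `instruction.system` calls, but the `Loaders` shape and `buildSystem` helper are illustrative assumptions (the real loaders are async):

```typescript
// Illustrative loader bundle; the real calls are SystemPrompt.skills(agent),
// SystemPrompt.environment(model), and instruction.system(), all async.
interface Loaders {
  skills: (agent: string) => string
  environment: (model: string) => string
  instructions: () => string // instruction file contents; must stay fresh
}

const skillsCache = new Map<string, string>()
const envCache = new Map<string, string>()

function buildSystem(agent: string, model: string, loaders: Loaders): string[] {
  // Stable fragments: computed once per agent / per model, then reused
  // across every iteration of the multi-step prompt loop.
  if (!skillsCache.has(agent)) skillsCache.set(agent, loaders.skills(agent))
  if (!envCache.has(model)) envCache.set(model, loaders.environment(model))
  // Dynamic fragment: reloaded every step so instruction-file edits are
  // visible on the very next loop iteration.
  return [skillsCache.get(agent)!, envCache.get(model)!, loaders.instructions()]
}
```

The design choice here is the asymmetry: stable text is cached by identity key, while anything a user can edit mid-session is deliberately excluded from the cache.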
4. The branch needed stabilization after CI surfaced broad e2e failures
While validating the branch, CI for this PR reported broad app instability rather than one isolated selector failure:
- `[e2e:pageerror]` events and later `fetch failed` / `ECONNRESET` from `bun script/e2e-local.ts`

That pointed to branch-specific runtime instability in the opencode session pipeline rather than simple UI test flake. The two most suspicious paths on this branch were:

- persisting the full `providerMetadata` onto `step-finish` parts even though the branch only consumes `metadata.openai.responseId`
- reading `state.metadata.evidence` without a safe fallback for legacy compacted rows

This PR now fixes both of those stabilization problems directly.
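The metadata fix can be sketched as a strict allowlist pick. The `SafeFinishMetadata` name and surrounding shapes are assumptions, not the actual `processor.ts` types:

```typescript
// Assumed persisted shape: only the one field the prompt loop reads back.
interface SafeFinishMetadata {
  openai?: { responseId: string }
}

// Copy the allowlisted field out of an arbitrary provider metadata payload.
// Anything else (large blobs, proxy-backed objects, provider internals)
// never reaches persisted step-finish session state.
function pickSafeFinishMetadata(providerMetadata: unknown): SafeFinishMetadata {
  const meta = providerMetadata as { openai?: { responseId?: unknown } } | undefined
  const responseId = meta?.openai?.responseId
  return typeof responseId === "string" ? { openai: { responseId } } : {}
}
```

Allowlisting (rather than deleting known-bad keys) means unknown future provider fields are excluded by default instead of leaking into persisted state.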
File-by-file changes
- `packages/opencode/src/session/compaction.ts`
- `packages/opencode/src/session/evidence.ts`
- `packages/opencode/src/session/message-v2.ts`
  - `toModelMessages()` now emits a structured evidence digest instead of the old `[Old tool result content cleared]` placeholder path
  - when `state.metadata.evidence` exists and matches the expected shape, it is used
- `packages/opencode/src/session/llm.ts`
  - adds `opts` to `LLM.StreamInput` with `previousResponseId` / `store` for threaded OpenAI follow-ups
- `packages/opencode/src/session/processor.ts`
  - persists `step-finish.metadata` so the prompt loop can reuse provider response IDs across iterations, narrowed to `{ openai: { responseId } }`
- `packages/opencode/src/session/prompt.ts`
  - adds `threaded()` / `chain()` helpers for OpenAI Responses continuation detection
  - when `store: true` is enabled, the last assistant matches the current provider/model, and the last assistant contains completed tool activity, the loop extracts the `responseId` from the last `step-finish` metadata and uses it to thread the next request
  - memoizes `SystemPrompt.skills(agent)` per agent and `SystemPrompt.environment(model)` per model
  - still reloads `instruction.system()` on every loop iteration so instruction changes remain live immediately

Branch stabilization follow-up
After the original branch changes were in place, I investigated the failing CI jobs linked from this PR.
Observed CI behavior
Windows e2e
The failure pattern was broad and systemic rather than selector-specific:
- `[e2e:pageerror] Object`
- `TypeError: fetch failed` with `ECONNRESET`

That pattern suggested backend or session runtime instability rather than a single UI bug.
Linux e2e
The job timed out after 30 minutes inside `Run app e2e tests`. The log showed the backend bootstrapping path starting, but the run stalled during startup and never completed within the job timeout.
Stabilization changes added because of that analysis
Two additional commits were added on top of the original branch work:
- `fix(opencode): store only safe finish response metadata`
- `fix(opencode): fall back for legacy compacted tool evidence`

These changes directly target the two branch-specific risky areas described above.
Additional app-side stabilization
A later Windows-only failure still pointed at app-side global sync rather than backend session serialization alone. The strongest signal was a Solid store proxy error from `packages/app/src/context/global-sync.tsx` during project cache writes.

Those follow-up fixes now ensure both sides of the project-sync path are detached: `sanitizeProject()` returns a detached plain copy for persisted cache writes, and `setProjects()` clones incoming project arrays before writing them back into the live global store. Together, that prevents proxy-backed project objects from one Solid store from being mirrored or reinserted across store boundaries and then surfacing as `Symbol(solid-proxy)` runtime failures in workspace/session/sidebar flows.

Commits on this PR
- `fix(opencode): preserve compacted tool evidence`
- `fix(opencode): thread openai tool follow-ups`
- `perf(opencode): memoize stable prompt loop context`
- `fix(opencode): store only safe finish response metadata`
- `fix(opencode): fall back for legacy compacted tool evidence`
- `fix(app): detach cached project state from Solid proxies`
- `fix(app): clone project state before syncing cache`

How did you verify your code works?
I verified this locally in `packages/opencode` with both broad session regression coverage and focused stabilization checks:

- `bun typecheck`
- `bun test test/session/message-v2.test.ts test/session/compaction.test.ts test/session/prompt-effect.test.ts test/session/processor-effect.test.ts`
- `bun typecheck`
- `bun test test/session/message-v2.test.ts test/session/prompt-effect.test.ts test/session/processor-effect.test.ts`
- `bun test test/session/message-v2.test.ts -t "replaces compacted tool output when legacy evidence metadata is missing"`
- `bun test test/session/prompt-effect.test.ts -t "openai tool continuation threads the previous response"`
- a manual probe of `MessageV2.toModelMessages(...)`, which produced output beginning with `[Compacted tool result]`, `tool: bash`, `title: Bash`, `input: {"cmd":"pwd"}`
- a `sanitizeProject()` / `createStore()` probe in `packages/app` confirming detached project copies for `time`, `sandboxes`, and icon state before storing them in a second store
- `bun test --preload ./happydom.ts ./src/context/global-sync/utils.test.ts ./src/context/global-sync/event-reducer.test.ts`
- a `cloneProject()` / `sanitizeProject()` probe in `packages/app` confirming detached global-store and cache-store project copies

All targeted checks above passed locally.
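The detachment probe can be approximated like this; the `Project` fields (`time`, `sandboxes`) follow the probe description above, but the exact shape in `packages/app` is an assumption:

```typescript
// Assumed project shape, mirroring the fields the probe checked.
interface Project {
  id: string
  time: { created: number }
  sandboxes: string[]
}

// Return a detached plain copy: reading fields off a (possibly proxy-backed)
// store object and rebuilding plain objects and arrays means the cached copy
// holds no references back into the live Solid store.
function sanitizeProject(project: Project): Project {
  return {
    id: project.id,
    time: { ...project.time },
    sandboxes: [...project.sandboxes],
  }
}
```

Field-by-field spreading is used here rather than `structuredClone`, since structured clone throws on proxy objects like live Solid stores.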
Why these changes are safe
Response threading
The branch only consumes `metadata.openai.responseId` from `step-finish`, so narrowing persisted metadata to the allowlisted field removes risk without removing required behavior.

Prompt-loop memoization
Only stable prompt fragments are memoized. `instruction.system()` still reloads on every iteration, so AGENTS/instruction changes remain visible on the next loop step.

Compacted evidence fallback
The new fallback only applies when a tool part is already compacted and legacy evidence metadata is absent. In the normal compacted path, stored evidence is still preferred.
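A sketch of that ordering, with assumed part shapes (the real `message-v2.ts` structures differ): stored evidence wins, and the generic fallback only fires for legacy rows compacted before evidence existed:

```typescript
// Assumed shapes for illustration only.
interface Evidence {
  tool: string
  title: string
}
interface ToolPart {
  compacted: boolean
  tool: string
  metadata?: { evidence?: Evidence }
}

function compactedText(part: ToolPart): string | undefined {
  if (!part.compacted) return undefined // non-compacted parts keep full output
  const evidence = part.metadata?.evidence
  // Normal path: the stored evidence digest is preferred.
  if (evidence) return `[Compacted tool result]\ntool: ${evidence.tool}\ntitle: ${evidence.title}`
  // Legacy path: row was compacted before evidence metadata existed, so
  // fall back to a minimal safe digest instead of failing on undefined.
  return `[Compacted tool result]\ntool: ${part.tool}`
}
```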
Existing semantics preserved
The branch remains scoped to session-context handling. It does not widen tool permissions, alter retry policy, or change non-session app behavior intentionally.
Trade-offs / things to watch
- Response threading only engages when `store: true` is explicitly enabled and the prior assistant/model relationship is safe to reuse

Scope note
This PR is scoped to `packages/opencode/src/session/**`, `packages/app/src/context/global-sync/**`, and the corresponding targeted regression coverage. No unrelated console working-tree changes are intended to be part of this PR.
This description intentionally keeps the repository PR-template headings verbatim so compliance automation can match them exactly.
Screenshots / recordings
N/A
Checklist