Description
When an LLM generates a tool call whose JSON arguments exceed maxOutputTokens, the output is truncated mid-JSON. opencode misclassifies this as a generic "invalid tool call," provides no truncation signal to the model, and either silently exits the session loop or enters an unrecoverable retry cycle. There is no truncation detection or recovery mechanism anywhere in the pipeline.
This affects any tool with large string parameters (write, apply_patch, edit, bash), but is most commonly triggered by write when creating new files.
Root Cause Chain (5 failure points, all interacting)
1. Artificially low output cap wastes model capacity
OUTPUT_TOKEN_MAX = 32_000 (transform.ts:21), so maxOutputTokens(model) = Math.min(model.limit.output, OUTPUT_TOKEN_MAX) caps Claude Opus (128k output) at just 32k tokens — only 25% of the model's actual capacity. The env var OPENCODE_EXPERIMENTAL_OUTPUT_TOKEN_MAX (#5679) exists but defaults to 32k.
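A minimal sketch of the capping logic described above (names mirror transform.ts, but this is an illustration, not the actual source):

```typescript
// Illustration of the cap: whichever value is smaller wins, so a model
// with 128k output capacity is still held to 32k.
const OUTPUT_TOKEN_MAX = 32_000;

interface Model {
  limit: { output: number };
}

function maxOutputTokens(model: Model): number {
  return Math.min(model.limit.output, OUTPUT_TOKEN_MAX);
}

const opus: Model = { limit: { output: 128_000 } };
console.log(maxOutputTokens(opus)); // 32000 — 25% of the model's capacity
```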
2. Thinking budget competes with output tokens (no coordination)
Anthropic thinking budget (up to 16k tokens, transform.ts:542) and output share the same token pool. If thinking consumes 14k tokens, only 18k remain for output, but maxOutputTokens still claims 32k. Compaction (compaction.ts:44) reserves maxOutputTokens without subtracting thinking budget. Truncation occurs even for files well under the nominal limit.
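A sketch of the missing coordination (hypothetical helper, not existing code): the budget the system should advertise is what remains after thinking tokens, not the nominal cap.

```typescript
// Hypothetical: the effective output budget after thinking tokens are
// subtracted from the shared pool. The current code never does this.
function effectiveOutputBudget(maxOutput: number, thinkingBudget: number): number {
  return Math.max(0, maxOutput - thinkingBudget);
}

// With a 14k thinking spend, only 18k remain for the tool-call JSON,
// even though maxOutputTokens still reports 32k.
console.log(effectiveOutputBudget(32_000, 14_000)); // 18000
```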
3. experimental_repairToolCall cannot distinguish truncation from invalid tools (llm.ts:181-201)
When JSON is truncated, toolName is typically valid (e.g. "write") — only the args JSON is incomplete. But the code routes ALL parse failures to the generic "invalid" tool. The key signal being ignored: if failed.toolCall.toolName is a registered tool, the failure is almost certainly truncation, not an invalid tool.
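The P0 heuristic can be sketched as follows (hypothetical names; the real hook is the AI SDK's experimental_repairToolCall in llm.ts): a registered tool name combined with unparseable args strongly suggests truncation rather than a bad tool call.

```typescript
// Hypothetical classifier: distinguish a truncated call from a genuinely
// invalid one. The set below stands in for the real tool registry.
const registeredTools = new Set(["write", "edit", "apply_patch", "bash"]);

function classifyFailedCall(toolName: string, rawArgs: string): "truncated" | "invalid" {
  try {
    JSON.parse(rawArgs);
    return "invalid"; // args parse fine, so the failure is elsewhere
  } catch {
    // Known tool + broken JSON => almost certainly cut off mid-output
    return registeredTools.has(toolName) ? "truncated" : "invalid";
  }
}

console.log(classifyFailedCall("write", '{"filePath":"a.md","content":"abc')); // "truncated"
console.log(classifyFailedCall("wrte", "{}")); // "invalid"
```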
4. finishReason "length" treated as normal completion — session exits silently (prompt.ts:698-699)
"length" is NOT in the exclusion list for the modelFinished check, so the session loop breaks when the model is cut off mid-tool-call.
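The proposed P1 behavior can be sketched like this (hypothetical helper; the actual check lives in prompt.ts): treat "length" as a continue signal rather than a normal stop, with a per-turn cap so recovery cannot loop forever.

```typescript
// Hypothetical: auto-continue when the model was cut off by the token
// limit, but cap the number of auto-continues per turn.
type FinishReason = "stop" | "length" | "tool-calls" | "error";

function shouldContinue(reason: FinishReason, autoContinues: number, cap = 3): boolean {
  return reason === "length" && autoContinues < cap;
}

console.log(shouldContinue("length", 0)); // true — resume instead of exiting silently
console.log(shouldContinue("stop", 0));   // false — normal completion
console.log(shouldContinue("length", 3)); // false — cap reached
```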
5. Doom loop detection fails on truncation, blocks legitimate retries (processor.ts:233-244)
Detection requires an exact JSON.stringify match of inputs. Truncation produces different incomplete JSON on each attempt, so the doom loop is never detected. But if the model retries identically during chunked recovery, the doom loop fires on the 3rd attempt, blocking legitimate recovery.
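A sketch of why exact-match detection fails both ways (hypothetical helper; the real check compares stringified inputs of recent calls):

```typescript
// Hypothetical: doom-loop check based on N identical consecutive inputs.
function isDoomLoop(recentInputs: unknown[], threshold = 3): boolean {
  if (recentInputs.length < threshold) return false;
  const last = recentInputs.slice(-threshold).map((i) => JSON.stringify(i));
  return last.every((s) => s === last[0]);
}

// Truncated JSON differs at each attempt, so no loop is ever detected:
console.log(isDoomLoop(['{"content":"aaa', '{"content":"aab', '{"content":"aac'])); // false
// But identical retries during chunked recovery DO trip it, blocking recovery:
console.log(isDoomLoop(["chunk-1", "chunk-1", "chunk-1"])); // true
```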
Additional: doom_loop Permission Hangs Indefinitely in Sub-Agent Sessions
When a doom loop IS detected in a sub-agent, it triggers PermissionNext.ask (processor.ts:246). This creates an Effect Deferred and awaits it (permission/service.ts:173) with no timeout.
In sub-agent sessions there is no TUI to answer the permission prompt. The Deferred.await blocks forever. The parent agent is stuck at await SessionPrompt.prompt (task.ts:129) with no timeout, no polling, no watchdog. The abort signal from the parent only covers the LLM stream (processor.ts:61-91), not Effect fibers running the permission check.
Result: parent agent blocks for 20-30 minutes until an external stale task poller force-cancels the session.
The same pattern exists in the Question service (question/service.ts) — any ask-type permission or question in a sub-agent can trigger infinite hangs.
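The P3 timeout fix can be illustrated with plain Promises (a concept sketch only — the real code awaits an Effect Deferred, where Effect.timeout would play the role of withTimeout here):

```typescript
// Hypothetical: race an unanswerable prompt against a deadline, resolving
// to a safe fallback ("deny") instead of blocking the parent forever.
function withTimeout<T>(p: Promise<T>, ms: number, fallback: T): Promise<T> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<T>((resolve) => {
    timer = setTimeout(() => resolve(fallback), ms);
  });
  return Promise.race([p, timeout]).finally(() => clearTimeout(timer));
}

// A permission prompt that never resolves (no TUI in a sub-agent session):
const neverAnswered = new Promise<string>(() => {});

// Auto-deny after the deadline instead of hanging for 20-30 minutes:
withTimeout(neverAnswered, 100, "deny").then((answer) => {
  console.log(answer); // "deny"
});
```

Setting doom_loop to "deny" for sub-agents (the other option in P3) would short-circuit before the await is ever created, which is simpler but less general than a timeout in the permission service itself.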
The Cascade
1. The LLM generates write(path, 40KB_content); maxOutputTokens=32000 truncates the JSON mid-string.
2. The AI SDK cannot parse the JSON, so it calls experimental_repairToolCall.
3. toolName="write" is valid, but the code never checks this and routes to "invalid".
4. InvalidTool returns a generic error. The model has NO signal that its output was truncated, so it retries from scratch.
5. The retry is truncated at a different point and flagged "invalid" again (the doom-loop inputs differ, so no match).
Two possible outcomes:
- A) finishReason="length" exits session silently, tools marked "Tool execution aborted"
- B) doom_loop triggers (identical retry) then PermissionNext.ask hangs forever in sub-agents
Affected Tools
- write (content param) — Critical, most common trigger
- apply_patch (patchText param) — High, triggered during chunked recovery
- edit (oldString/newString) — Medium
- multiedit (nested array) — Medium
- bash (command) — Low
Proposed Fix (prioritized checklist)
- P0: Detect truncation in repairToolCall — When failed.toolCall.toolName is a registered tool but JSON parsing fails, return "output truncated by token limit, split into smaller operations" instead of generic "invalid." (llm.ts:181-201)
- P1: Auto-continue on finishReason "length" — Add "length" to the exclusion list in prompt.ts:698-699 so the session loop continues. Cap auto-continues per turn.
- P2: Coordinate thinking budget with output limit — Subtract thinking budget from maxOutputTokens so the system does not promise 32k when only 18k is available.
- P3: Fix doom_loop permission hang in sub-agents — Either set doom_loop to "deny" for sub-agents (auto-reject instead of hang), or add timeout to Deferred.await in permission service. Also fix abort signal propagation to cover permission/question Effect fibers.
- P4 (optional): Raise OUTPUT_TOKEN_MAX from 32k to 64k. Reduces truncation frequency but does not fix recovery. Do after P0-P1.
Related Issues
Directly related: #13102, #17471, #14087, #17750, #12716
Tangentially related: #17578, #17019, #18037
Previously fixed (context): #5679, #10995, #2976
Steps to reproduce
- Start a session with Claude Opus (or any model with maxOutputTokens effectively capped at 32k)
- Ask the agent to create a 35KB+ markdown file
- The model calls write(filePath, content) and the tool call JSON is truncated at maxOutputTokens
- Observe: experimental_repairToolCall routes to InvalidTool, model retries, truncated again, session exits with "Tool execution aborted" or silently stops
- For the sub-agent hang: run the same scenario via a task() sub-agent call and observe the parent blocks indefinitely