## Summary
When a model attempts to generate a large tool input (e.g., writing a full page of content), the stream can stall and trigger a StreamIdleTimeoutError. This error is marked as retryable, causing an infinite loop where the model repeatedly attempts the same failing operation.
## Reproduction
- Ask the agent to write a large file (e.g., "create a full product page with multiple sections")
- The model starts generating a `write` tool call with a large `content` parameter
- The API stalls during tool input generation (possibly due to rate limiting or output token limits)
- After 60 seconds, a `StreamIdleTimeoutError` is thrown
- Error is retried with exponential backoff
- Model sees previous attempt failed with "Tool execution aborted"
- Model tries the exact same approach
- Loop continues indefinitely
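The stall-then-timeout behavior in the steps above can be sketched as a watchdog wrapper around any async stream. This is a hedged sketch with hypothetical names, not the actual processor.ts implementation:

```typescript
// Thrown when the upstream produces no chunk within the idle window,
// mirroring the 60000ms default described in this report.
class StreamIdleTimeoutError extends Error {
  constructor(idleMs: number) {
    super(`Stream idle timeout: no data received for ${idleMs}ms`)
  }
}

// Wrap a stream so that each chunk races a fresh idle timer. `withIdleTimeout`
// and its signature are assumptions for illustration only.
async function* withIdleTimeout<T>(stream: AsyncIterable<T>, idleMs: number): AsyncGenerator<T> {
  const iterator = stream[Symbol.asyncIterator]()
  while (true) {
    let timer: ReturnType<typeof setTimeout> | undefined
    const timeout = new Promise<never>((_, reject) => {
      timer = setTimeout(() => reject(new StreamIdleTimeoutError(idleMs)), idleMs)
    })
    // Whichever settles first wins; the timer is always cleaned up.
    const result = await Promise.race([iterator.next(), timeout]).finally(() => clearTimeout(timer))
    if (result.done) return
    yield result.value
  }
}
```

Wrapping the provider stream this way makes a stalled tool-input generation surface as an error after `idleMs` instead of hanging forever; the bug described here is in what happens next, when that error is retried without a cap.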
## Evidence from Logs

```
ERROR 2026-02-05T02:51:10 service=session.processor error=Stream idle timeout: no data received for 60000ms
ERROR 2026-02-05T02:52:16 service=session.processor error=Stream idle timeout: no data received for 60000ms
ERROR 2026-02-05T02:53:23 service=session.processor error=Stream idle timeout: no data received for 60000ms
ERROR 2026-02-05T02:54:34 service=session.processor error=Stream idle timeout: no data received for 60000ms
```
Task verification showed 10 consecutive `write` attempts with empty inputs:

```
## Tools Used
write: {}
write: {}
write: {}
write: {}
write: {}
write: {}
write: {}
write: {}
write: {}
write: {}
```
## Root Cause Analysis

### The Retry Loop

```
Model generates Write tool with large content
  ↓
API stalls during tool input JSON generation
  ↓
60 seconds pass with no stream chunks
  ↓
StreamIdleTimeoutError thrown (processor.ts:44)
  ↓
Converted to APIError with isRetryable: true (message-v2.ts:715)
  ↓
retryable() in retry.ts returns a message string
  ↓
processor.ts catches, increments attempt, waits, continues (lines 403-420)
  ↓
New LLM.stream() starts fresh
  ↓
Model sees "Tool execution aborted" error, tries the same approach
  ↓
INFINITE LOOP
```
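The loop above never terminates because the retry classifier has no attempt cap for this error. A minimal sketch of that decision, using illustrative names rather than the real retry.ts API:

```typescript
// Assumed error shape: the flag is set unconditionally per message-v2.ts:715.
interface APIError {
  message: string
  isRetryable: boolean
}

// Returns a message when the error should be retried, undefined otherwise.
function retryable(e: APIError): string | undefined {
  return e.isRetryable ? e.message : undefined
}

// Simulate the processor's retry loop. Because retryable() never returns
// undefined for this error, the loop only stops because we bound the
// simulation externally with `maxObserved`.
function simulateLoop(maxObserved: number): number {
  let attempt = 0
  const err: APIError = {
    message: "Stream idle timeout: no data received for 60000ms",
    isRetryable: true,
  }
  while (attempt < maxObserved && retryable(err) !== undefined) attempt++
  return attempt
}
```

With a 10-attempt bound the simulation uses all 10 attempts, matching the 10 empty `write` calls in the task verification above.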
### Why Doom Loop Detection Doesn't Trigger

The existing doom loop detection (processor.ts:207-232) checks:

```typescript
if (part.state.status === "running" && part.state.input) {
  // Track same tool + same input called 3 times
}
```

But this fails because:

- The stream dies during the `tool-input-start` phase (before `tool-call`)
- The tool never reaches "running" status
- The input is always `{}` (empty) because the JSON never completed
- Cleanup marks the tool as "error" with an empty input
- Each retry has a different tool call ID, so it is not detected as a duplicate
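The last point is the crux: because every retry produces a fresh call ID, any counter keyed on the ID never accumulates, while a counter keyed on the tool name does. A sketch under assumed part shapes (the real types live in processor.ts):

```typescript
// Assumed, simplified shape of a tool part for illustration.
interface ToolPart {
  callID: string
  tool: string
  input: Record<string, unknown>
}

// Count the largest number of parts sharing the same key.
function maxCountByKey(parts: ToolPart[], key: (p: ToolPart) => string): number {
  const counts = new Map<string, number>()
  let max = 0
  for (const p of parts) {
    const n = (counts.get(key(p)) ?? 0) + 1
    counts.set(key(p), n)
    max = Math.max(max, n)
  }
  return max
}

// Ten failed retries as observed in the logs: a fresh ID each time,
// the same tool, and an input whose JSON never completed.
const attempts: ToolPart[] = Array.from({ length: 10 }, (_, i) => ({
  callID: `call_${i}`,
  tool: "write",
  input: {},
}))
```

Keyed by `callID`, every attempt looks unique; keyed by `tool` (plus an empty-input check, as in Option 2 below), the tenth identical failure is trivially detectable.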
## Suggested Fixes

### Option 1: Add max retries for StreamIdleTimeoutError

```typescript
// In processor.ts
let idleTimeoutRetries = 0
const MAX_IDLE_TIMEOUT_RETRIES = 3

// In the catch block:
if (e instanceof StreamIdleTimeoutError) {
  idleTimeoutRetries++
  if (idleTimeoutRetries >= MAX_IDLE_TIMEOUT_RETRIES) {
    input.assistantMessage.error = MessageV2.fromError(
      new Error("Stream repeatedly timed out. The model may be trying to generate too much content at once."),
      { providerID: input.model.providerID },
    )
    return "stop"
  }
}
```
### Option 2: Detect repeated incomplete tool calls

```typescript
// Track tools that fail during input generation
const incompleteToolAttempts: Record<string, number> = {}

// When tool-input-start fires but the stream dies before tool-call:
if (part.type === "tool" && Object.keys(part.state.input || {}).length === 0) {
  incompleteToolAttempts[part.tool] = (incompleteToolAttempts[part.tool] || 0) + 1
  if (incompleteToolAttempts[part.tool] >= DOOM_LOOP_THRESHOLD) {
    // Trigger doom loop handling: stop and surface an error
  }
}
```
### Option 3: Better error message to help the model recover

Instead of the generic "Tool execution aborted", provide actionable guidance:

> Tool execution aborted: stream timed out after 60s while generating tool input.
> This often happens when writing very large content. Try breaking the write into smaller chunks.
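A hypothetical helper (not existing code; the name and signature are assumptions) that folds the observed timeout and tool name into that guidance, so the message stays accurate if the idle window is configured differently:

```typescript
// Build an actionable abort message for a tool whose input generation
// timed out. `idleMs` is the configured idle window in milliseconds.
function idleTimeoutToolError(idleMs: number, tool: string): string {
  return (
    `Tool execution aborted: stream timed out after ${idleMs / 1000}s while ` +
    `generating input for "${tool}". This often happens when writing very ` +
    `large content. Try breaking the ${tool} into smaller chunks.`
  )
}
```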
## Environment
- Provider: github-copilot
- Model: claude-opus-4.5
- Stream idle timeout: 60000ms (default)
- Tool: write
## Related Code

- `packages/opencode/src/session/processor.ts`: stream processing, idle timeout, doom loop detection
- `packages/opencode/src/session/message-v2.ts`: `StreamIdleTimeoutError` class, error conversion
- `packages/opencode/src/session/retry.ts`: retry logic
- `packages/opencode/src/session/prompt.ts`: main agentic loop