
fix(opencode): deterministic queued message wrapping for prompt cache #21535

Open
Sathvik-1007 wants to merge 1 commit into anomalyco:dev from Sathvik-1007:fix/queued-message-cache-invalidation

Conversation

@Sathvik-1007 Sathvik-1007 commented Apr 8, 2026

Issue for this PR

Fixes #21518

Type of change

  • [x] Bug fix
  • [ ] New feature
  • [ ] Refactor / code improvement
  • [ ] Documentation

What does this PR do?

prompt.ts:1483-1499 wrapped queued user message text in <system-reminder> tags via an in-memory mutation (p.text = [...]) that was never persisted to the DB. Each turn re-fetches messages from the DB via hydrate() (JSON.parse), so it gets the original, unwrapped text. Because step resets to 0 on each runLoop call, step > 1 is false on the first step of a new turn, and the same message therefore appears unwrapped, causing a provider prompt cache miss.
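A minimal sketch of the failure mode, with hypothetical names standing in for the real prompt.ts/message-v2.ts code: the wrapper mutates only the in-memory part, while the next turn rebuilds parts from persisted JSON, so the wrapped text is gone and serialization diverges between turns.

```typescript
// Hypothetical reduction of the bug; names are illustrative, not the opencode API.
type Part = { id: string; text: string }

const db = new Map<string, string>() // persisted JSON, keyed by part id

function persist(p: Part) {
  db.set(p.id, JSON.stringify(p))
}

function hydrate(id: string): Part {
  return JSON.parse(db.get(id)!) // each turn re-fetches from the "DB"
}

// Old behavior: wrap queued text via in-memory mutation, never persisted.
function wrapInMemory(p: Part) {
  p.text = `<system-reminder>\n${p.text}\n</system-reminder>`
}

const queued: Part = { id: "msg_2", text: "queued user message" }
persist(queued)

wrapInMemory(queued) // turn N serializes the wrapped text...
const turnN = queued.text

const turnN1 = hydrate("msg_2").text // ...turn N+1 hydrates the original

// turnN !== turnN1: the same message serializes differently across turns,
// which invalidates the provider's prompt-prefix cache.
```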

Bug confirmed in v1.3.17, v1.4.0, and current dev HEAD — identical code at all three.

Fix: remove the in-memory mutation from prompt.ts and add a remind option to toModelMessages in message-v2.ts. When remind (set to lastFinished?.id) is provided, user text parts in messages with id > remind are wrapped during serialization. Stored parts are never mutated, so the same inputs always produce the same output and the prompt cache stays consistent.
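The shape of the fix can be sketched as follows (simplified types, not the real toModelMessages signature): wrapping happens at serialization time, keyed off the remind boundary, so the output is a pure function of its inputs.

```typescript
// Illustrative reduction of the 'remind' approach; not the opencode source.
type Msg = { id: string; text: string }

function wrap(text: string): string {
  return `<system-reminder>\n${text}\n</system-reminder>`
}

// Messages newer than the last finished assistant message (id > remind)
// are wrapped during serialization; stored data is never mutated.
function toModelMessages(msgs: Msg[], opts?: { remind?: string }): string[] {
  return msgs.map((m) =>
    opts?.remind !== undefined && m.id > opts.remind ? wrap(m.text) : m.text,
  )
}

const history: Msg[] = [
  { id: "msg_1", text: "finished turn" },
  { id: "msg_2", text: "queued user message" },
]

// Same inputs, same output, on every call: cache-consistent.
const first = toModelMessages(history, { remind: "msg_1" })
const second = toModelMessages(history, { remind: "msg_1" })
```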

How did you verify your code works?

  • Reproduced the bug locally: built a message history with a finished assistant turn plus a queued user message, then called toModelMessages without remind (matching the old step-1 path) and with remind (old step 2+), and confirmed the serialization differs (the bug). With the fix applied (remind always passed), the output is identical across calls.
  • 246 session tests pass, 0 fail, 650 assertions (bun test test/session/)
  • TypeScript check clean (tsc + tsgo via turbo pre-push hook)
  • Bug verified at v1.3.17 (line 1481), v1.4.0 (line 1483), dev HEAD (line 1483)

Screenshots / recordings

N/A — logic change, no UI.

Checklist

  • I have tested my changes locally
  • I have not included unrelated changes in this PR

@Sathvik-1007 Sathvik-1007 force-pushed the fix/queued-message-cache-invalidation branch 2 times, most recently from c998041 to 8f82457 on April 8, 2026 at 16:01
@Sathvik-1007 (Author)

The CI e2e failures are unrelated to this PR. Our change touches only prompt.ts and message-v2.ts (message serialization). Failing tests: session-model-persistence (timeout), terminal-reconnect (pty assertion), terminal-tabs (stale buffer state). Zero terminal/UI/Playwright code changed. The ECONNRESET in the settings test points to a flaky CI runner. Please rerun when convenient.

fairyhunter13 added a commit to fairyhunter13/opencode that referenced this pull request Apr 8, 2026
Core cache optimizations:
- Move mindContext from dynamicSystem to stableSystem (500-2000+ tokens/turn
  cached at BP1 for sessions with SessionMind context)
- Split failureContext into stableFailures (prior turns, BP1 cached) and
  dynamicFailures (current turn only) using signature-based dedup
- Add markLargeToolResults() pre-pass: cache_control on tool-result content
  parts >7000 chars (~2000 tokens), Anthropic direct + OpenRouter Claude
- Fix stale parts reference bug in markLargeToolResults for multi-tool messages
- Add compressImages() async pre-pass via sharp (PR anomalyco#21371): 3-phase
  quality->dimension->fallback compression prevents 5MB API limit errors
- Session snapshot resets (resetFailureSnapshot/resetEnvDynamicSent) in cleanup
- prompt_async idle race condition fix: check new messages before loop break
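The markLargeToolResults() pre-pass above can be sketched roughly as follows (hypothetical part shape; the >7000-char threshold is from the commit message):

```typescript
// Sketch of a large-tool-result cache breakpoint pre-pass; illustrative only.
type ContentPart = {
  type: string
  text: string
  cache_control?: { type: "ephemeral" }
}

const LARGE_RESULT_THRESHOLD = 7000 // chars, roughly 2000 tokens

// Returns new part objects rather than mutating, avoiding stale references
// when a message carries multiple tool results.
function markLargeToolResults(parts: ContentPart[]): ContentPart[] {
  return parts.map((p) =>
    p.type === "tool-result" && p.text.length > LARGE_RESULT_THRESHOLD
      ? { ...p, cache_control: { type: "ephemeral" } }
      : p,
  )
}
```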

Upstream PR cherry-picks:
- PR anomalyco#21535: deterministic queued message wrapping eliminates per-turn cache miss
- PR anomalyco#21492: tool evidence digest (evidence.ts) preserves context through compaction
- PR anomalyco#21507: session processor single-flight summary dedup improvements
- PR anomalyco#21528: prompt_async idle wakeup race condition fix
- PR anomalyco#21500: Levenshtein O(min(N,M)) space with Int32Array two-row algorithm
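The two-row Levenshtein variant mentioned for PR #21500 can be sketched like this (illustrative, not the opencode source): rows are sized to the shorter string, giving O(min(N, M)) space with Int32Array rows.

```typescript
// Two-row Levenshtein distance in O(min(N, M)) space.
function levenshtein(a: string, b: string): number {
  // Iterate over the longer string; keep rows sized to the shorter one.
  if (a.length < b.length) [a, b] = [b, a]
  const m = b.length
  let prev = new Int32Array(m + 1)
  let curr = new Int32Array(m + 1)
  for (let j = 0; j <= m; j++) prev[j] = j
  for (let i = 1; i <= a.length; i++) {
    curr[0] = i
    for (let j = 1; j <= m; j++) {
      const cost = a[i - 1] === b[j - 1] ? 0 : 1
      // Deletion, insertion, substitution.
      curr[j] = Math.min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost)
    }
    ;[prev, curr] = [curr, prev]
  }
  return prev[m]
}
```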

New tools (PR anomalyco#21399):
- ContextUsageTool (check_context_usage): real-time token/cache usage reporting
- NewSessionTool (new_session): TUI-only, abort + create new session
- TuiEvent.SessionNew bus event and app.tsx handler
- SDK types.gen.ts/sdk.gen.ts EventTuiSessionNew type

Test infrastructure:
- E2E cache tests (OPENCODE_E2E=1) verified 100% cache hit rate on T2+
- Unit tests for large-tool cache breakpoints (4 scenarios)
- Fix pre-existing lsp-deps.test.ts assertion bug (LspTool in make() not all())
- Add await to all ProviderTransform.message() call sites (now async)
@Sathvik-1007 Sathvik-1007 force-pushed the fix/queued-message-cache-invalidation branch from bb60b2b to c7f6aff on April 8, 2026 at 20:14
fairyhunter13 added a commit to fairyhunter13/opencode that referenced this pull request Apr 8, 2026
…rchestrator, multi-credential, codebase indexer

Core Features:
- Session Mind with persistent memory across sessions
- Orchestrator + Worker subagent architecture
- Multi-credential OAuth with auto-refresh
- Codebase indexer and watcher connectors
- Footer status bar with live metrics

Cache & Prompt Optimizations:
- Move mindContext/failureContext to stable system prefix (BP1 cached)
- Large tool result cache_control breakpoints (>7000 chars)
- Deterministic message wrapping (PR anomalyco#21535)
- Tool evidence digest through compaction (PR anomalyco#21492)
- O(1) queue dequeue + single-flight summary (PR anomalyco#21507)
- Levenshtein O(min(N,M)) space optimization (PR anomalyco#21500)
- Three-phase image auto-compression (PR anomalyco#21371)
- ContextUsage and NewSession tools (PR anomalyco#21399)
- E2E cache integration tests with real Anthropic OAuth

Session snapshot resets prevent memory leaks on session delete.
@Sathvik-1007 Sathvik-1007 force-pushed the fix/queued-message-cache-invalidation branch from c7f6aff to ec6261f on April 9, 2026 at 04:37
…che consistency

Queued user message text was wrapped in <system-reminder> via in-memory
mutation (prompt.ts:1483-1499) that never persisted to DB. Next turn
re-fetched from DB, got unwrapped text, broke provider prompt cache.

Move wrapping into toModelMessages with a 'remind' option so the same
messages always serialize identically regardless of step counter.

Fixes anomalyco#21518
@Sathvik-1007 Sathvik-1007 force-pushed the fix/queued-message-cache-invalidation branch from ec6261f to 20a973f on April 9, 2026 at 05:37
fairyhunter13 added a commit to fairyhunter13/opencode that referenced this pull request Apr 10, 2026
… cache hit-rate maximization

## Subagent Policy Refactor (packages/opencode/src/provider/provider.ts)

Introduce SUBAGENT_PROVIDER_POLICY as a declarative per-provider allow/deny
map for subagent model selection, replacing hardcoded Anthropic branching in
resolveSubagentModel. Future providers plug in via single map entries without
touching algorithm code. Universalizes the normal algorithm path so non-policy
providers (openai, google, xai, etc.) use fair dynamic tier-based selection
without Anthropic leakage.

Provider changes:
- Add SUBAGENT_PROVIDER_POLICY with anthropic entry (allow claude-sonnet-4-6)
- Add lookupSubagentPolicy() with proxy detection for claude-* via bedrock/vertex
- Add passesPolicy() helper (deny-before-allow precedence, undefined-safe)
- Refactor resolveSubagentModel Branches 1/2/4 to use policy lookup
- Universalize normal algorithm: remove Anthropic-specific leakage from
  Branch 3/4, use cost comparison instead of includes('opus') for tiered log
- Raise FLOOR_CEILINGS.critical to 10 (unlocks premium non-Anthropic models;
  Anthropic path returns early via policy, unaffected)
- Fix misleading comment in resolveSubagentVariant about Perplexity suppression
- Mark SUBAGENT_MODEL_ALLOWLIST @deprecated (superseded by policy map)
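The policy lookup described above can be sketched as follows (hypothetical types; only the deny-before-allow and undefined-safe behaviors are taken from the commit message):

```typescript
// Illustrative shape of a per-provider allow/deny subagent policy.
type SubagentPolicy = { allow?: string[]; deny?: string[] }

function passesPolicy(modelID: string, policy?: SubagentPolicy): boolean {
  if (!policy) return true // undefined-safe: no policy means no restriction
  if (policy.deny?.includes(modelID)) return false // deny wins over allow
  if (policy.allow) return policy.allow.includes(modelID)
  return true
}

// Example entry mirroring the anthropic policy from the commit message.
const anthropicPolicy: SubagentPolicy = { allow: ["claude-sonnet-4-6"] }
```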

## Task Tool Changes (packages/opencode/src/tool/task.ts)

- Raise maxRetries from 2 to 3 for full low->critical escalation ladder
- Surface rate-limit silent downgrade: log.warn with reason 'itpm-headroom-low'
  and output note when ITPM headroom forces complexity to low
- Document clampComplexity agent-ceiling bypass (JSDoc + inline escalation comment)
- Improve Agent.get error for unknown subagent_type: list available agents

## Tests

- Add packages/opencode/test/provider/variant-picker.test.ts (32 tests):
  table-driven matrix: 7 variant configs x 4 complexity tiers
- Extend packages/opencode/test/tool/task-escalation.test.ts to 76 tests with
  SUBAGENT_PROVIDER_POLICY coverage: passesPolicy unit tests, policy shape
  assertions, resolveSubagentModel routing (opus/sonnet/haiku), proxy
  detection for bedrock/vertex, openai/google/xai non-policy passthrough,
  FLOOR_CEILINGS.critical=10 assertions

## Cache Hit-Rate Maximization (feat(cache))

Port and verification of upstream cache optimizations plus new E2E validation
through real Anthropic OAuth subscription path.

- Move git/PR guidance from bash.txt (uncached BP2 tool schema) to
  anthropic.txt (cached BP1 system) -- saves ~161 bytes per request
  repeated forever (ref: anomalyco#21345)
- Add E2E integration test cache-max-efficiency-e2e.test.ts exercising
  the real OAuth subscription path with hafizludyanto credential.
  Uses full OAuth plumbing (Authorization Bearer, ?beta=true query,
  claude-cli/2.1.2 user-agent, oauth-2025-04-20 beta, Claude Code
  system prefix). 11 tests validate cache_creation, cache_read,
  hit rate >=85%, extended-cache-ttl-2025-04-11 header, no thinking
  tokens, PR anomalyco#21535 queued message regression, model lock stability.
  Measured hit rate: 93.7% on claude-sonnet-4-5-20250929 (sonnet-4-6
  returns 404 on current OAuth subscription tier).
- Add skill opencode-prompt-cache-maximization/CLAUDE.md documenting
  the 4-breakpoint strategy, 13 snapshot Maps, 8-dimension CacheGuard,
  13 CacheMonitor assertions, cache-aware compaction, and debug toggles.
- Add skill opencode-anthropic-oauth-subscription/CLAUDE.md documenting
  the OAuth subscription request envelope, token refresh flow
  (client_id 9d1c250a-e61b-44d9-88ed-5944d1962f5e), auth.json layout,
  model availability on subscription tier, and troubleshooting matrix.

Verified:
- bun typecheck clean
- variant-picker 32/32, task-escalation 76/76 (108 total, 0 fail)
- subagent-allowlist 99/106 (7 pre-existing gpt-5.2-codex fails, unchanged)
- Cache E2E 11/11, 93.7% hit rate
- 82 supporting unit tests pass (optimizations, cache-persist-snapshots,
  prompt, message-v2)

## Adaptive Thinking Guard (packages/opencode/src/session/llm.ts)

Add a defensive guard in the LLM session layer that strips
`thinking: { type: "adaptive" }` from model parameters when the selected
model does not support adaptive thinking. Non-4.6 models (opus-4-5,
sonnet-4-5, haiku-4-5, and any other non-4.6 variant) would previously
receive the adaptive thinking option through user config or agent options
and respond with "adaptive thinking is not supported on this model" errors.
The guard detects these models by checking that the model ID does not include
"4-6" and removes the adaptive thinking config before the API call is made.
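The guard described above can be sketched like this (hypothetical parameter shape; the "4-6" substring check is from the commit message):

```typescript
// Sketch of the adaptive-thinking guard; illustrative, not session/llm.ts.
type ModelParams = { thinking?: { type: string }; [key: string]: unknown }

function guardAdaptiveThinking(modelID: string, params: ModelParams): ModelParams {
  // Non-4.6 models reject thinking: { type: "adaptive" }, so drop it
  // before the API call; 4.6 models keep it.
  if (params.thinking?.type === "adaptive" && !modelID.includes("4-6")) {
    const { thinking: _dropped, ...rest } = params
    return rest
  }
  return params
}
```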

- Add packages/opencode/test/provider/adaptive-thinking-guard.test.ts
  (17 tests) proving the guard strips adaptive thinking for non-4.6 models
  and preserves it for claude-*-4-6 models

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
fairyhunter13 added a commit to fairyhunter13/opencode that referenced this pull request Apr 10, 2026
… cache hit-rate maximization

## Subagent Policy Refactor (packages/opencode/src/provider/provider.ts)

Introduce SUBAGENT_PROVIDER_POLICY as a declarative per-provider allow/deny
map for subagent model selection, replacing hardcoded Anthropic branching in
resolveSubagentModel. Future providers plug in via single map entries without
touching algorithm code. Universalizes the normal algorithm path so non-policy
providers (openai, google, xai, etc.) use fair dynamic tier-based selection
without Anthropic leakage.

Provider changes:
- Add SUBAGENT_PROVIDER_POLICY with anthropic entry (allow claude-sonnet-4-6)
- Add lookupSubagentPolicy() with proxy detection for claude-* via bedrock/vertex
- Add passesPolicy() helper (deny-before-allow precedence, undefined-safe)
- Refactor resolveSubagentModel Branches 1/2/4 to use policy lookup
- Universalize normal algorithm: remove Anthropic-specific leakage from
  Branch 3/4, use cost comparison instead of includes('opus') for tiered log
- Raise FLOOR_CEILINGS.critical to 10 (unlocks premium non-Anthropic models;
  Anthropic path returns early via policy, unaffected)
- Fix misleading comment in resolveSubagentVariant about Perplexity suppression
- Mark SUBAGENT_MODEL_ALLOWLIST @deprecated (superseded by policy map)

## Task Tool Changes (packages/opencode/src/tool/task.ts)

- Raise maxRetries from 2 to 3 for full low->critical escalation ladder
- Surface rate-limit silent downgrade: log.warn with reason 'itpm-headroom-low'
  and output note when ITPM headroom forces complexity to low
- Document clampComplexity agent-ceiling bypass (JSDoc + inline escalation comment)
- Improve Agent.get error for unknown subagent_type: list available agents

## Tests

- Add packages/opencode/test/provider/variant-picker.test.ts (32 tests):
  table-driven matrix: 7 variant configs x 4 complexity tiers
- Extend packages/opencode/test/tool/task-escalation.test.ts to 76 tests with
  SUBAGENT_PROVIDER_POLICY coverage: passesPolicy unit tests, policy shape
  assertions, resolveSubagentModel routing (opus/sonnet/haiku), proxy
  detection for bedrock/vertex, openai/google/xai non-policy passthrough,
  FLOOR_CEILINGS.critical=10 assertions
- Migrate 7 E2E test files to canonical personal-auth fixture for OAuth auth:
  api-optimizations-e2e, cache-behavior-e2e, cache-drain-e2e,
  cache-monitor-e2e, cache-monitor-lifecycle-e2e, cache-optimizations-e2e,
  cache-optimizations-r1-r6-e2e
- Fix await lint warnings in api-optimizations-e2e.test.ts

## Cache Hit-Rate Maximization (feat(cache))

Port and verification of upstream cache optimizations plus new E2E validation
through real Anthropic OAuth subscription path.

- Move git/PR guidance from bash.txt (uncached BP2 tool schema) to
  anthropic.txt (cached BP1 system) -- saves ~161 bytes per request
  repeated forever (ref: anomalyco#21345)
- Add E2E integration test cache-max-efficiency-e2e.test.ts exercising
  the real OAuth subscription path with hafizludyanto credential.
  Uses full OAuth plumbing (Authorization Bearer, ?beta=true query,
  claude-cli/2.1.2 user-agent, oauth-2025-04-20 beta, Claude Code
  system prefix). 11 tests validate cache_creation, cache_read,
  hit rate >=85%, extended-cache-ttl-2025-04-11 header, no thinking
  tokens, PR anomalyco#21535 queued message regression, model lock stability.
  Measured hit rate: 93.7% on claude-sonnet-4-5-20250929 (sonnet-4-6
  returns 404 on current OAuth subscription tier).
- Add skill opencode-prompt-cache-maximization/CLAUDE.md documenting
  the 4-breakpoint strategy, 13 snapshot Maps, 8-dimension CacheGuard,
  13 CacheMonitor assertions, cache-aware compaction, and debug toggles.
- Add skill opencode-anthropic-oauth-subscription/CLAUDE.md documenting
  the OAuth subscription request envelope, token refresh flow
  (client_id 9d1c250a-e61b-44d9-88ed-5944d1962f5e), auth.json layout,
  model availability on subscription tier, and troubleshooting matrix.

Verified:
- bun typecheck clean
- variant-picker 32/32, task-escalation 76/76 (108 total, 0 fail)
- subagent-allowlist 99/106 (7 pre-existing gpt-5.2-codex fails, unchanged)
- Cache E2E 11/11, 93.7% hit rate
- 82 supporting unit tests pass (optimizations, cache-persist-snapshots,
  prompt, message-v2)

## Adaptive Thinking Guard (packages/opencode/src/session/llm.ts)

Add a defensive guard in the LLM session layer that strips
`thinking: { type: "adaptive" }` from model parameters when the selected
model does not support adaptive thinking. Non-4.6 models (opus-4-5,
sonnet-4-5, haiku-4-5, and any other non-4.6 variant) would previously
receive the adaptive thinking option through user config or agent options
and respond with "adaptive thinking is not supported on this model" errors.
The guard detects these models by checking that the model ID does not include
"4-6" and removes the adaptive thinking config before the API call is made.

- Add packages/opencode/test/provider/adaptive-thinking-guard.test.ts
  (17 tests) proving the guard strips adaptive thinking for non-4.6 models
  and preserves it for claude-*-4-6 models
- Add packages/opencode/test/fixture/personal-auth.ts as canonical OAuth
  fixture used across all E2E provider tests

## Auth Profile Active-Switch Guard (packages/opencode/src/auth/index.ts)

Prevent Auth.set() from silently reassigning the active profile when a
non-default entry is written. The active profile now only changes when
no active profile exists yet OR when the caller explicitly passes
`options.active === true`. Previously any Auth.set() without
`active: false` would clobber the active profile, causing subagent OAuth
auth to silently swap credentials mid-session.

## Usage-Limit Error Recognition (packages/opencode/src/provider/error.ts)

Recognize additional Claude.ai Max subscription usage-limit error signatures
so the TUI surfaces the correct "reached your Claude.ai usage limit" message
(and the opencode go upsell modal) instead of a generic API error.

- Add message regex patterns: /out.of.extra.usage/i, /extra.usage.exceeded/i,
  /usage_limit_exceeded/i
- isUsageLimit() now matches body.error.type === "usage_limit_exceeded" and
  body.error.code === "extra_usage_exceeded" in addition to the existing
  insufficient_quota / usage_not_included codes
- Upstream error mapper branches on body.error.type === "usage_limit_exceeded"
  and "extra_usage_exceeded" plus body.error.code === "extra_usage_exceeded",
  returning type: "usage_limit" with a friendly message pointing at
  claude.ai/settings/usage
- invalid_prompt handler reuses the captured errorMsg local
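The broadened recognition above can be sketched roughly like this (illustrative shape; the regex patterns and error codes are those listed in the commit message, the function body is an assumption):

```typescript
// Sketch of the widened usage-limit matcher; not the real provider/error.ts.
const USAGE_LIMIT_PATTERNS = [
  /out.of.extra.usage/i,
  /extra.usage.exceeded/i,
  /usage_limit_exceeded/i,
]

type ErrorBody = { error?: { type?: string; code?: string; message?: string } }

function isUsageLimit(body: ErrorBody): boolean {
  const err = body.error ?? {}
  if (err.type === "usage_limit_exceeded") return true
  if (err.code === "extra_usage_exceeded") return true
  // Existing codes, per the commit message.
  if (err.code === "insufficient_quota" || err.code === "usage_not_included") return true
  return USAGE_LIMIT_PATTERNS.some((re) => re.test(err.message ?? ""))
}
```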

## Cache Optimization E2E Migration (test/e2e/cache-optimization.test.ts)

Port cache-optimization.test.ts to the canonical personal-auth fixture used
across all E2E provider tests. Removes manual auth.json reading and raw
fetch plumbing.

- Replace getToken() + raw fetch with checkPersonalAuth / setupAuth /
  teardownAuth / createOAuthFetch from test/fixture/personal-auth
- Drop OPENCODE_E2E env gate; suite uses describe.skipIf(!HAS_AUTH) so it
  runs automatically when ~/.local/share/opencode-personal/opencode/auth.json
  has valid OAuth credentials
- Drop oauth-2025-04-20 beta from test headers (fixture handles OAuth
  envelope, Bearer token refresh, claude-cli user-agent)
- Wire Auth.get('anthropic') access/expires into createOAuthFetch

## Search-Engine Submodule Bump (cmd/opencode-search-engine)

Bump cmd/opencode-search-engine: 18a207fe → 9a5b99506
- fix(indexer): hybrid search FTS + compaction blocking runtime

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
so the TUI surfaces the correct "reached your Claude.ai usage limit" message
(and the opencode go upsell modal) instead of a generic API error.

- Add message regex patterns: /out.of.extra.usage/i, /extra.usage.exceeded/i,
  /usage_limit_exceeded/i
- isUsageLimit() now matches body.error.type === "usage_limit_exceeded" and
  body.error.code === "extra_usage_exceeded" in addition to the existing
  insufficient_quota / usage_not_included codes
- Upstream error mapper branches on body.error.type === "usage_limit_exceeded"
  and "extra_usage_exceeded" plus body.error.code === "extra_usage_exceeded",
  returning type: "usage_limit" with a friendly message pointing at
  claude.ai/settings/usage
- invalid_prompt handler reuses the captured errorMsg local

Port cache-optimization.test.ts to the canonical personal-auth fixture used
across all E2E provider tests. Removes manual auth.json reading and raw
fetch plumbing.

- Replace getToken() + raw fetch with checkPersonalAuth / setupAuth /
  teardownAuth / createOAuthFetch from test/fixture/personal-auth
- Drop OPENCODE_E2E env gate; suite uses describe.skipIf(!HAS_AUTH) so it
  runs automatically when ~/.local/share/opencode-personal/opencode/auth.json
  has valid OAuth credentials
- Drop oauth-2025-04-20 beta from test headers (fixture handles OAuth
  envelope, Bearer token refresh, claude-cli user-agent)
- Wire Auth.get('anthropic') access/expires into createOAuthFetch

Bump cmd/opencode-search-engine: 18a207fe → 9a5b99506
- fix(indexer): hybrid search FTS + compaction blocking runtime

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
fairyhunter13 added a commit to fairyhunter13/opencode that referenced this pull request Apr 10, 2026
… cache hit-rate maximization

Introduce SUBAGENT_PROVIDER_POLICY as a declarative per-provider allow/deny
map for subagent model selection, replacing hardcoded Anthropic branching in
resolveSubagentModel. Future providers plug in via single map entries without
touching algorithm code. Universalizes the normal algorithm path so non-policy
providers (openai, google, xai, etc.) use fair dynamic tier-based selection
without Anthropic leakage.

Provider changes:
- Add SUBAGENT_PROVIDER_POLICY with anthropic entry (allow claude-sonnet-4-6)
- Add lookupSubagentPolicy() with proxy detection for claude-* via bedrock/vertex
- Add passesPolicy() helper (deny-before-allow precedence, undefined-safe)
- Refactor resolveSubagentModel Branches 1/2/4 to use policy lookup
- Universalize normal algorithm: remove Anthropic-specific leakage from
  Branch 3/4, use cost comparison instead of includes('opus') for tiered log
- Raise FLOOR_CEILINGS.critical to 10 (unlocks premium non-Anthropic models;
  Anthropic path returns early via policy, unaffected)
- Fix misleading comment in resolveSubagentVariant about Perplexity suppression
- Mark SUBAGENT_MODEL_ALLOWLIST @deprecated (superseded by policy map)
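
The deny-before-allow precedence and undefined-safety of passesPolicy can be sketched as follows; the policy shape and helper signature here are illustrative assumptions, not the actual provider code:

```typescript
// Hypothetical shape of the per-provider policy map described above.
interface SubagentPolicy {
  allow?: string[];
  deny?: string[];
}

const SUBAGENT_PROVIDER_POLICY: Record<string, SubagentPolicy> = {
  anthropic: { allow: ["claude-sonnet-4-6"] },
};

// Deny always wins; an allow list, when present, is exhaustive;
// a provider with no policy entry passes through to the normal algorithm.
function passesPolicy(policy: SubagentPolicy | undefined, modelID: string): boolean {
  if (!policy) return true;
  if (policy.deny?.includes(modelID)) return false;
  if (policy.allow) return policy.allow.includes(modelID);
  return true;
}
```

Future providers would plug in by adding one map entry, without touching the selection algorithm itself.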

- Raise maxRetries from 2 to 3 for full low->critical escalation ladder
- Surface rate-limit silent downgrade: log.warn with reason 'itpm-headroom-low'
  and output note when ITPM headroom forces complexity to low
- Document clampComplexity agent-ceiling bypass (JSDoc + inline escalation comment)
- Improve Agent.get error for unknown subagent_type: list available agents
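
A minimal sketch of why three retries are needed for the full ladder, assuming a hypothetical four-tier escalation (the tier endpoints come from the low->critical range above; everything else is illustrative):

```typescript
// Assumed tier ladder; each retry can step up one tier.
const TIERS = ["low", "medium", "high", "critical"] as const;
type Tier = (typeof TIERS)[number];

function escalate(tier: Tier): Tier {
  const i = TIERS.indexOf(tier);
  // Clamp at the top of the ladder.
  return TIERS[Math.min(i + 1, TIERS.length - 1)];
}

function finalTier(start: Tier, maxRetries: number): Tier {
  let t = start;
  for (let i = 0; i < maxRetries; i++) t = escalate(t);
  return t;
}
```

With maxRetries = 2, an attempt starting at "low" tops out at "high"; only maxRetries = 3 reaches "critical".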

- Add packages/opencode/test/provider/variant-picker.test.ts (32 tests):
  table-driven matrix: 7 variant configs x 4 complexity tiers
- Extend packages/opencode/test/tool/task-escalation.test.ts to 76 tests with
  SUBAGENT_PROVIDER_POLICY coverage: passesPolicy unit tests, policy shape
  assertions, resolveSubagentModel routing (opus/sonnet/haiku), proxy
  detection for bedrock/vertex, openai/google/xai non-policy passthrough,
  FLOOR_CEILINGS.critical=10 assertions
- Migrate 7 E2E test files to canonical personal-auth fixture for OAuth auth:
  api-optimizations-e2e, cache-behavior-e2e, cache-drain-e2e,
  cache-monitor-e2e, cache-monitor-lifecycle-e2e, cache-optimizations-e2e,
  cache-optimizations-r1-r6-e2e
- Fix await lint warnings in api-optimizations-e2e.test.ts

Port and verify upstream cache optimizations, plus add new E2E validation
through the real Anthropic OAuth subscription path.

- Move git/PR guidance from bash.txt (uncached BP2 tool schema) to
  anthropic.txt (cached BP1 system) -- saves ~161 bytes per request
  repeated forever (ref: anomalyco#21345)
- Add E2E integration test cache-max-efficiency-e2e.test.ts exercising
  the real OAuth subscription path with hafizludyanto credential.
  Uses full OAuth plumbing (Authorization Bearer, ?beta=true query,
  claude-cli/2.1.2 user-agent, oauth-2025-04-20 beta, Claude Code
  system prefix). 11 tests validate cache_creation, cache_read,
  hit rate >=85%, extended-cache-ttl-2025-04-11 header, no thinking
  tokens, PR anomalyco#21535 queued message regression, model lock stability.
  Measured hit rate: 93.7% on claude-sonnet-4-5-20250929 (sonnet-4-6
  returns 404 on current OAuth subscription tier).
- Add skill opencode-prompt-cache-maximization/CLAUDE.md documenting
  the 4-breakpoint strategy, 13 snapshot Maps, 8-dimension CacheGuard,
  13 CacheMonitor assertions, cache-aware compaction, and debug toggles.
- Add skill opencode-anthropic-oauth-subscription/CLAUDE.md documenting
  the OAuth subscription request envelope, token refresh flow
  (client_id 9d1c250a-e61b-44d9-88ed-5944d1962f5e), auth.json layout,
  model availability on subscription tier, and troubleshooting matrix.
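
For orientation, a hedged sketch of where the BP1/BP2 breakpoints described above might sit in an Anthropic /v1/messages payload; the split follows the commit text, while block contents and values are placeholders:

```typescript
// Illustrative payload only: BP1 (system) and BP2 (tool schemas).
const body = {
  model: "claude-sonnet-4-5-20250929",
  max_tokens: 1024,
  system: [
    // BP1: stable system text, cached; moving git/PR guidance here means
    // it is paid for once per cache window instead of on every request.
    { type: "text", text: "<system prompt>", cache_control: { type: "ephemeral" } },
  ],
  tools: [
    // BP2: tool schemas; keeping per-request guidance out keeps this block byte-stable.
    {
      name: "bash",
      description: "<tool description>",
      input_schema: { type: "object" },
      cache_control: { type: "ephemeral" },
    },
  ],
  messages: [{ role: "user", content: "hello" }],
};
```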

Verified:
- bun typecheck clean
- variant-picker 32/32, task-escalation 76/76 (108 total, 0 fail)
- subagent-allowlist 99/106 (7 pre-existing gpt-5.2-codex fails, unchanged)
- Cache E2E 11/11, 93.7% hit rate
- 82 supporting unit tests pass (optimizations, cache-persist-snapshots,
  prompt, message-v2)

Add a defensive guard in the LLM session layer that strips
`thinking: { type: "adaptive" }` from model parameters when the selected
model does not support adaptive thinking. Non-4.6 models (opus-4-5,
sonnet-4-5, haiku-4-5, and any other non-4.6 variant) would previously
receive the adaptive thinking option through user config or agent options
and respond with "adaptive thinking is not supported on this model" errors.
The guard detects these models by checking that the model ID does not include
"4-6" and removes the adaptive thinking config before the API call is made.
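
A minimal sketch of the guard's shape, with names assumed for illustration (the real session-layer code differs):

```typescript
// Assumed parameter shape; only the adaptive variant is stripped.
interface ModelParams {
  thinking?: { type: string };
  [key: string]: unknown;
}

function stripAdaptiveThinking(modelID: string, params: ModelParams): ModelParams {
  // Per the guard above, only claude-*-4-6 models accept adaptive thinking.
  if (!modelID.includes("4-6") && params.thinking?.type === "adaptive") {
    const { thinking: _adaptive, ...rest } = params;
    return rest;
  }
  return params;
}
```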

- Add packages/opencode/test/provider/adaptive-thinking-guard.test.ts
  (17 tests) proving the guard strips adaptive thinking for non-4.6 models
  and preserves it for claude-*-4-6 models
- Add packages/opencode/test/fixture/personal-auth.ts as canonical OAuth
  fixture used across all E2E provider tests

Prevent Auth.set() from silently reassigning the active profile when a
non-default entry is written. The active profile now only changes when
no active profile exists yet OR when the caller explicitly passes
`options.active === true`. Previously any Auth.set() without
`active: false` would clobber the active profile, causing subagent OAuth
auth to silently swap credentials mid-session.

Recognize additional Claude.ai Max subscription usage-limit error signatures
so the TUI surfaces the correct "reached your Claude.ai usage limit" message
(and the opencode go upsell modal) instead of a generic API error.

- Add message regex patterns: /out.of.extra.usage/i, /extra.usage.exceeded/i,
  /usage_limit_exceeded/i
- isUsageLimit() now matches body.error.type === "usage_limit_exceeded" and
  body.error.code === "extra_usage_exceeded" in addition to the existing
  insufficient_quota / usage_not_included codes
- Upstream error mapper branches on body.error.type === "usage_limit_exceeded"
  and "extra_usage_exceeded" plus body.error.code === "extra_usage_exceeded",
  returning type: "usage_limit" with a friendly message pointing at
  claude.ai/settings/usage
- invalid_prompt handler reuses the captured errorMsg local
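
Combining the signatures listed above into one predicate; the regexes are taken verbatim from the bullets, while the helper's exact shape in the codebase is an assumption:

```typescript
// Message patterns and error codes from the bullets above.
const USAGE_LIMIT_PATTERNS = [/out.of.extra.usage/i, /extra.usage.exceeded/i, /usage_limit_exceeded/i];
const USAGE_LIMIT_CODES = new Set(["insufficient_quota", "usage_not_included", "extra_usage_exceeded"]);

interface ErrorBody {
  error?: { type?: string; code?: string; message?: string };
}

function isUsageLimit(body: ErrorBody): boolean {
  const err = body.error ?? {};
  if (err.type === "usage_limit_exceeded") return true;
  if (err.code !== undefined && USAGE_LIMIT_CODES.has(err.code)) return true;
  return USAGE_LIMIT_PATTERNS.some((re) => re.test(err.message ?? ""));
}
```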

Port cache-optimization.test.ts to the canonical personal-auth fixture used
across all E2E provider tests. Removes manual auth.json reading and raw
fetch plumbing.

- Replace getToken() + raw fetch with checkPersonalAuth / setupAuth /
  teardownAuth / createOAuthFetch from test/fixture/personal-auth
- Drop OPENCODE_E2E env gate; suite uses describe.skipIf(!HAS_AUTH) so it
  runs automatically when ~/.local/share/opencode-personal/opencode/auth.json
  has valid OAuth credentials
- Drop oauth-2025-04-20 beta from test headers (fixture handles OAuth
  envelope, Bearer token refresh, claude-cli user-agent)
- Wire Auth.get('anthropic') access/expires into createOAuthFetch
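
The auto-skip gate can be sketched as below; describe.skipIf is bun:test's real API, while the HAS_AUTH computation here is an assumed standalone equivalent of the fixture's check:

```typescript
import { existsSync } from "node:fs";
import { homedir } from "node:os";
import { join } from "node:path";

// Same auth.json path as named above; the suite self-skips when it is
// absent, so no OPENCODE_E2E env gate is needed.
const AUTH_PATH = join(homedir(), ".local", "share", "opencode-personal", "opencode", "auth.json");
const HAS_AUTH = existsSync(AUTH_PATH);

// In the real suite (bun:test):
//   describe.skipIf(!HAS_AUTH)("cache-optimization", () => { /* ...tests... */ });
```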

Bump cmd/opencode-search-engine: 18a207fe → 9a5b99506
- fix(indexer): hybrid search FTS + compaction blocking runtime

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
fairyhunter13 added a commit to fairyhunter13/opencode that referenced this pull request Apr 10, 2026

Add hafizludyanto-sonnet-comparison.test.ts provider integration test
that resolves the hafizludyanto OAuth token from the opencode-personal
auth.json and compares Anthropic /v1/messages responses for
claude-sonnet-4-6 with and without the ?beta=true query flag.
Auto-skips via describe.skipIf when no OAuth token is present, so
the suite is a no-op on CI. Used to diagnose usage_limit_exceeded
behavior on the Extra Usage subscription path.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
fairyhunter13 added a commit to fairyhunter13/opencode that referenced this pull request Apr 10, 2026

Fix ReferenceError: revert is not defined in the session TUI route.
Two `const revert` declarations coexisted in the same component body:
an inner `const revert = session()?.revert?.messageID` inside an
onSelect callback (line 574) shadowed the top-level
`const revert = createMemo(...)` (line 1070). SolidJS's babel-plugin-solid
scope analysis captured the wrong binding in JSX memo closures, producing
a TDZ-style ReferenceError at runtime whenever the revert UI rendered.

- Rename top-level createMemo from `revert` to `revertState`
- Update 6 JSX usages (lines 1550, 1584, 1589, 1591, 1611x2) to revertState()
- Inner callback-local `const revert` preserved (no shadow conflict)

Regenerate packages/sdk/js/src/v2/gen/{sdk,types}.gen.ts to fix
pre-existing TS2305/TS2724 errors in @opencode-ai/plugin typecheck
(sdk.gen.ts imported AuthListResponses, EventTuiSessionNew,
IndexHealthResponses and others that no longer existed in types.gen.ts).

Add extractLocalDefs() pass in packages/sdk/js/script/build.ts: Zod 4's
z.lazy() emits $defs with __schemaX keys whose $refs point to
#/components/schemas/__schemaX (absolute) while the $defs themselves
are schema-local. Lift all local $defs into components.schemas before
createClient() so hey-api can resolve them. Without this the build
fails with: Missing $ref pointer "#/components/schemas/__schema0".

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Successfully merging this pull request may close these issues.

Queued user messages are serialized inconsistently across turns and break prompt caching
