perf(opencode): reduce redundant summary and queue overhead #21507

Closed

GuestAUser wants to merge 3 commits into anomalyco:dev from GuestAUser:perf/opencode-core-hot-path
Conversation

@GuestAUser

Summary

  • dedupe identical in-flight SessionSummary.summarize() calls so concurrent summary work shares one computation instead of re-reading messages and re-running snapshot diffs in parallel
  • replace AsyncQueue's Array.shift() dequeue path with a head-index queue to remove O(n) copies from hot SSE / event / TUI delivery paths
  • avoid rereading persisted tool parts during SessionProcessor cleanup by using the in-memory ctx.toolcalls map that already tracks active tool calls

Why

The opencode core loop still pays a few avoidable costs in high-frequency paths:

  • summary generation can be triggered concurrently for the same {sessionID, messageID} pair, which duplicates message hydration and diff work
  • queue consumers pay repeated shift() costs during sustained event delivery
  • processor cleanup rereads message parts that are already available in memory

None of these change the user-visible model behavior. The goal here is to shave redundant work from core loop infrastructure while preserving fault tolerance and existing tool-loop semantics.

What changed

1. Single-flight session summary work

packages/opencode/src/session/summary.ts

  • adds per-instance summary state through InstanceState
  • uses Effect Cache.make() to share one in-flight summarize operation per [sessionID, messageID]
  • invalidates the cache entry after completion so this remains single-flight work sharing rather than long-lived result caching
  • keeps the rest of the summary pipeline unchanged: session summary aggregation, stored diff write, diff event publish, and user-message summary update

packages/opencode/test/session/summary.test.ts

  • adds a regression test that fires two concurrent summarize calls
  • verifies only one message load and one diff computation happen
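The dedup technique can be sketched outside of Effect with a plain Promise map. This is illustrative only — the actual change uses Effect's `Cache.make()`, and `singleFlight` and the key format below are hypothetical names:

```typescript
// Illustrative single-flight helper: concurrent callers with the same key
// share one in-flight Promise; the entry is deleted on completion so this
// shares in-flight work without becoming a long-lived result cache.
const inflight = new Map<string, Promise<unknown>>()

function singleFlight<T>(key: string, work: () => Promise<T>): Promise<T> {
  const existing = inflight.get(key)
  if (existing) return existing as Promise<T>
  const promise = work().finally(() => {
    // Invalidate after completion: share in-flight work, not results.
    inflight.delete(key)
  })
  inflight.set(key, promise)
  return promise
}

// Mirrors the regression test's shape: two concurrent calls, one computation.
async function demo() {
  let computations = 0
  const run = () =>
    singleFlight("session-1:message-1", async () => {
      computations++
      await new Promise((r) => setTimeout(r, 5))
      return "summary"
    })
  const [a, b] = await Promise.all([run(), run()])
  console.log(computations, a === b) // → 1 true
}

demo()
```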

2. O(1) queue dequeue path

packages/opencode/src/util/queue.ts

  • replaces front-of-array shift() removal with a head index that advances past consumed items
  • compacts the backing array only when enough items have been consumed to make compaction worthwhile
  • keeps ordering semantics and waiter handoff behavior unchanged

packages/opencode/test/util/queue.test.ts

  • verifies long ordered dequeue behavior
  • verifies waiting readers still receive pushed values correctly
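A minimal sketch of the head-index idea, simplified relative to the real AsyncQueue; the class name and compaction threshold are illustrative:

```typescript
// Queue that dequeues via a head index rather than Array.shift(), which
// re-indexes the remaining elements (O(n)) on every call.
class HeadIndexQueue<T> {
  private items: T[] = []
  private head = 0
  private waiters: ((value: T) => void)[] = []
  // Compact only after this many items have been consumed (illustrative).
  private static readonly COMPACT_THRESHOLD = 1024

  push(value: T) {
    const waiter = this.waiters.shift()
    if (waiter) {
      // Hand the value directly to the oldest waiting reader,
      // preserving FIFO handoff semantics.
      waiter(value)
      return
    }
    this.items.push(value)
  }

  async next(): Promise<T> {
    if (this.head < this.items.length) {
      const value = this.items[this.head++]
      // Drop already-consumed slots once compaction is worthwhile.
      if (this.head >= HeadIndexQueue.COMPACT_THRESHOLD) {
        this.items = this.items.slice(this.head)
        this.head = 0
      }
      return value
    }
    return new Promise<T>((resolve) => this.waiters.push(resolve))
  }
}
```

Steady-state dequeue is an index read; the occasional `slice()` is the compaction trade-off noted below.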

3. Processor cleanup avoids redundant reads

packages/opencode/src/session/processor.ts

  • cleanup now iterates the in-memory ctx.toolcalls map rather than rereading all parts for the assistant message
  • only unfinished in-memory tool calls are converted to terminal error state during abort / cleanup
  • explicitly leaves the existing doom-loop detection semantics intact
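In sketch form, cleanup now walks the in-memory map directly; the `ToolCallState` shape and status names here are hypothetical, not the real processor types:

```typescript
// Hypothetical shapes for illustration; the real processor types differ.
type ToolCallState = {
  id: string
  status: "running" | "completed" | "error"
  error?: string
}

// Abort/cleanup path: only unfinished in-memory tool calls are moved to a
// terminal error state; completed calls are left untouched, and no
// persisted message parts are reread.
function cleanupToolCalls(toolcalls: Map<string, ToolCallState>): void {
  for (const call of toolcalls.values()) {
    if (call.status === "running") {
      call.status = "error"
      call.error = "Tool call aborted during cleanup"
    }
  }
}
```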

Scope

This PR is intentionally limited to packages/opencode/** hot-path infrastructure.
It does not include the unrelated local packages/console/** changes currently present in my working tree.

Validation

Typecheck

Run from packages/opencode:

  • bun typecheck

Targeted tests

Run from packages/opencode:

  • bun test test/session/summary.test.ts test/util/queue.test.ts test/session/processor-effect.test.ts

Result:

  • all targeted tests passed

Manual QA

Run from packages/opencode:

  • bun -e 'import { AsyncQueue } from "./src/util/queue"; const q = new AsyncQueue(); q.push("a"); q.push("b"); console.log([await q.next(), await q.next()].join(","))'
    • output: a,b
  • summary single-flight probe against SessionSummary.layer
    • output: summary_messages:1
    • output: summary_calls:1

Microbench checks

Run from packages/opencode:

  • queue comparison over 200000 push+pop operations
    • shift:18.13ms
    • idx:12.73ms
  • warmed tool-registry benchmark used during validation of the broader hot-path investigation
    • warm_same:24.12ms
    • vary_agent:32.38ms
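The queue comparison can be reproduced with a rough sketch like the following; the batch size and structure are illustrative, and absolute timings vary by machine and runtime:

```typescript
// Interleaved push+pop benchmark comparing Array.shift() against a head index.
const OPS = 200_000
const BATCH = 100

function benchShift(): number {
  const arr: number[] = []
  const start = performance.now()
  for (let i = 0; i < OPS / BATCH; i++) {
    for (let j = 0; j < BATCH; j++) arr.push(j)
    // shift() re-indexes the remaining elements on every call
    for (let j = 0; j < BATCH; j++) arr.shift()
  }
  return performance.now() - start
}

function benchHeadIndex(): number {
  let arr: number[] = []
  let head = 0
  const start = performance.now()
  for (let i = 0; i < OPS / BATCH; i++) {
    for (let j = 0; j < BATCH; j++) arr.push(j)
    // reading through a head index leaves the backing array in place
    for (let j = 0; j < BATCH; j++) void arr[head++]
    // compact once the batch is fully consumed
    if (head === arr.length) {
      arr = []
      head = 0
    }
  }
  return performance.now() - start
}

console.log(`shift: ${benchShift().toFixed(2)}ms`)
console.log(`idx: ${benchHeadIndex().toFixed(2)}ms`)
```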

Trade-offs / things to watch

  • summary single-flight state is bounded to 1024 keys and invalidated after completion, so it should not become a persistent result cache
  • queue compaction trades occasional slice work for much cheaper steady-state dequeue behavior
  • processor cleanup now trusts the authoritative in-memory in-flight tool map during teardown, which is the same state the stream handler mutates during execution

Follow-up measurement plan

After merge, the next useful measurement is an end-to-end agent-loop benchmark on a fixed fixture repo that captures:

  • wall-clock latency per loop iteration
  • CPU time
  • RSS / heap growth under repeated runs
  • count of summary invocations and tool-cleanup reads

That follow-up would quantify how much these micro-optimizations move real session throughput under realistic load.

Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent)

Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>

github-actions bot commented Apr 8, 2026

Hey! Your PR title perf(opencode): reduce redundant summary and queue overhead doesn't follow conventional commit format.

Please update it to start with one of:

  • feat: or feat(scope): new feature
  • fix: or fix(scope): bug fix
  • docs: or docs(scope): documentation changes
  • chore: or chore(scope): maintenance tasks
  • refactor: or refactor(scope): code refactoring
  • test: or test(scope): adding or updating tests

Where scope is the package name (e.g., app, desktop, opencode).

See CONTRIBUTING.md for details.

@github-actions github-actions bot added the needs:compliance This means the issue will auto-close after 2 hours. label Apr 8, 2026

github-actions bot commented Apr 8, 2026

This PR doesn't fully meet our contributing guidelines and PR template.

What needs to be fixed:

  • PR description is missing required template sections. Please use the PR template.

Please edit this PR description to address the above within 2 hours, or it will be automatically closed.

If you believe this was flagged incorrectly, please let a maintainer know.


github-actions bot commented Apr 8, 2026

The following comment was made by an LLM; it may be inaccurate:

Potential Related PR Found

PR #20303: refactor(opencode): optimize doom loop detection, summary debounce, parallel plugin events
#20303

This PR may be related because it also addresses summary debouncing/optimization in opencode, which overlaps with PR #21507's work on reducing redundant summary calls and processor cleanup. However, the exact relationship (whether closed, merged, or addressing different aspects) should be verified.

PR #19237: perf(opencode): reduce streaming latency and request overhead
#19237

This PR addresses similar hot-path performance concerns in opencode streaming, though focused on a different area.


github-actions bot commented Apr 8, 2026

This pull request has been automatically closed because it was not updated to meet our contributing guidelines within the 2-hour window.

Feel free to open a new pull request that follows our guidelines.

@github-actions github-actions bot removed the needs:compliance This means the issue will auto-close after 2 hours. label Apr 8, 2026
@github-actions github-actions bot closed this Apr 8, 2026
fairyhunter13 added a commit to fairyhunter13/opencode that referenced this pull request Apr 8, 2026
Core cache optimizations:
- Move mindContext from dynamicSystem to stableSystem (500-2000+ tokens/turn
  cached at BP1 for sessions with SessionMind context)
- Split failureContext into stableFailures (prior turns, BP1 cached) and
  dynamicFailures (current turn only) using signature-based dedup
- Add markLargeToolResults() pre-pass: cache_control on tool-result content
  parts >7000 chars (~2000 tokens), Anthropic direct + OpenRouter Claude
- Fix stale parts reference bug in markLargeToolResults for multi-tool messages
- Add compressImages() async pre-pass via sharp (PR anomalyco#21371): 3-phase
  quality->dimension->fallback compression prevents 5MB API limit errors
- Session snapshot resets (resetFailureSnapshot/resetEnvDynamicSent) in cleanup
- prompt_async idle race condition fix: check new messages before loop break

Upstream PR cherry-picks:
- PR anomalyco#21535: deterministic queued message wrapping eliminates per-turn cache miss
- PR anomalyco#21492: tool evidence digest (evidence.ts) preserves context through compaction
- PR anomalyco#21507: session processor single-flight summary dedup improvements
- PR anomalyco#21528: prompt_async idle wakeup race condition fix
- PR anomalyco#21500: Levenshtein O(min(N,M)) space with Int32Array two-row algorithm

New tools (PR anomalyco#21399):
- ContextUsageTool (check_context_usage): real-time token/cache usage reporting
- NewSessionTool (new_session): TUI-only, abort + create new session
- TuiEvent.SessionNew bus event and app.tsx handler
- SDK types.gen.ts/sdk.gen.ts EventTuiSessionNew type

Test infrastructure:
- E2E cache tests (OPENCODE_E2E=1) verified 100% cache hit rate on T2+
- Unit tests for large-tool cache breakpoints (4 scenarios)
- Fix pre-existing lsp-deps.test.ts assertion bug (LspTool in make() not all())
- Add await to all ProviderTransform.message() call sites (now async)
fairyhunter13 added a commit to fairyhunter13/opencode that referenced this pull request Apr 8, 2026
…rchestrator, multi-credential, codebase indexer

Core Features:
- Session Mind with persistent memory across sessions
- Orchestrator + Worker subagent architecture
- Multi-credential OAuth with auto-refresh
- Codebase indexer and watcher connectors
- Footer status bar with live metrics

Cache & Prompt Optimizations:
- Move mindContext/failureContext to stable system prefix (BP1 cached)
- Large tool result cache_control breakpoints (>7000 chars)
- Deterministic message wrapping (PR anomalyco#21535)
- Tool evidence digest through compaction (PR anomalyco#21492)
- O(1) queue dequeue + single-flight summary (PR anomalyco#21507)
- Levenshtein O(min(N,M)) space optimization (PR anomalyco#21500)
- Three-phase image auto-compression (PR anomalyco#21371)
- ContextUsage and NewSession tools (PR anomalyco#21399)
- E2E cache integration tests with real Anthropic OAuth

Session snapshot resets prevent memory leaks on session delete.
fairyhunter13 added a commit to fairyhunter13/opencode that referenced this pull request Apr 8, 2026

fairyhunter13 pushed a commit to fairyhunter13/opencode that referenced this pull request Apr 8, 2026
fairyhunter13 added a commit to fairyhunter13/opencode that referenced this pull request Apr 9, 2026