refactor(opencode): reduce streaming latency and request overhead #19237
jwcrystal wants to merge 1 commit into anomalyco:dev from
|
Hey! Your PR title Please update it to start with one of:
Where See CONTRIBUTING.md for details. |
|
One more clarification on the verification: I re-checked both the code changes and the benchmark interpretation, and my current conclusion is that all four optimizations in this PR are valid and the direction of the improvement is real. Confirmed changes:
Local targeted measurements against
Important caveats:
So I would treat the exact percentages as directional, not definitive. If I got anything wrong in the setup or interpretation, please let me know. |
|
Thanks for updating your PR! It now meets our contributing guidelines. 👍 |
- bus: downgrade high-frequency publish log from info to debug, eliminating per-token string formatting and file write during streaming
- llm: parallelize chat.params and chat.headers plugin triggers with Promise.all, reducing time-to-first-token by running them concurrently
- plugin: use hooks array captured at init time in Bus.subscribeAll handler instead of re-fetching via async state() on every bus event
- processor: remove unnecessary await from text-delta and reasoning-delta handlers since updatePartDelta is already fire-and-forget internally, eliminating extra microtask boundaries on every streamed token

https://claude.ai/code/session_019vzu636m2GPdKCi2sUdUic
8f77f79 to 283a48d
|
The sequential order of |
Issue for this PR
Closes #
Type of change
What does this PR do?
This PR reduces overhead in the streaming hot path in `packages/opencode`.

It makes three focused changes in the current `dev` implementation:

- downgrade `message.part.delta` logging from `info` to `debug` in `Bus.publish`
- run `chat.params` and `chat.headers` in parallel in `session/llm.ts`
- remove the unnecessary `await` on `Session.updatePartDelta(...)` in the processor path

These changes reduce avoidable per-token work and request setup overhead without intending to change behavior.
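To illustrate the parallelization, here is a minimal sketch, not the actual `session/llm.ts` code. The `trigger` helper, its delays, and the `log` array are hypothetical stand-ins for the plugin triggers. The sketch assumes the two triggers do not depend on each other's output; if they do, running them concurrently would not be equivalent.

```typescript
// Hypothetical stand-in for a plugin hook trigger; the real opencode
// plugin API is not reproduced here. It records when it starts and ends.
async function trigger(name: string, delayMs: number, log: string[]): Promise<string> {
  log.push(`start ${name}`);
  await new Promise<void>((resolve) => setTimeout(resolve, delayMs));
  log.push(`end ${name}`);
  return name;
}

// Sequential: the second trigger does not start until the first has finished,
// so the total wait is roughly the sum of both delays.
async function runSequential(log: string[]): Promise<string[]> {
  const params = await trigger("chat.params", 10, log);
  const headers = await trigger("chat.headers", 20, log);
  return [params, headers];
}

// Parallel: both triggers start before either finishes, so the total wait is
// roughly the longer of the two delays. Promise.all keeps the result order.
async function runParallel(log: string[]): Promise<string[]> {
  return Promise.all([
    trigger("chat.params", 10, log),
    trigger("chat.headers", 20, log),
  ]);
}
```

With `Promise.all`, the results come back in the same order as the input array, but the side effects of the two triggers can interleave. That interleaving is the behavioral difference to check when reviewing this change.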
How did you verify your code works?
- `bun test test/session/llm.test.ts test/session/session.test.ts`
- `bun typecheck`

I also re-ran local targeted verification against `dev` using temporary instrumentation on the exact hot paths touched by this PR.

Observed local results:

- `llm.plugins`: `dev 3.28ms` -> `PR 0.81ms`
- `part_delta` over 5000 updates: `dev 38.23ms` -> `PR 2.05ms`
- `bus.publish.part_delta` over 5000 updates: `dev 98.67ms` -> `PR 1.30ms`

Caveats:

- `part_delta` overlaps with `bus.publish.part_delta`, so those gains should not be added together

Screenshots / recordings
N/A
Checklist