feat: add StreamChunk::Usage variant for streaming token usage #96
Open
nazq wants to merge 1 commit into graniet:main from
Conversation
Pull request overview
This PR adds streaming token usage support by introducing a new StreamChunk::Usage(Usage) variant. This enables consumers to receive token usage data (including cache hits/misses) during streaming operations, which was previously only available in non-streaming responses.
Key Changes:
- Added `StreamChunk::Usage(Usage)` variant to expose token usage during streaming
- Implemented usage parsing from both Anthropic `message_delta` events and OpenAI final chunks
- Usage is consistently emitted immediately before the `Done` chunk
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated no comments.
| File | Description |
|---|---|
| `src/chat/mod.rs` | Added new `StreamChunk::Usage(Usage)` enum variant for streaming token usage |
| `src/backends/anthropic.rs` | Added `usage` field to `AnthropicStreamResponse`, implemented `convert_anthropic_usage()` helper, updated parser to return `Vec<StreamChunk>` and emit usage before `Done` |
| `src/providers/openai_compatible.rs` | Added `usage` field to `OpenAIToolStreamChunk`, implemented logic to handle both inline and separate chunk usage patterns |
| `tests/test_backends.rs` | Updated integration test to handle new `Usage` variant in match statement |
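For orientation, the variant addition in `src/chat/mod.rs` can be sketched as below. The surrounding variants and the exact `Usage` field names are assumptions for illustration, not the crate's real definitions; the PR only states that input, output, and cache-read token counts are exposed.

```rust
/// Sketch of the token-usage payload; real field names in the crate
/// may differ from these assumed ones.
#[derive(Debug, Clone, PartialEq)]
pub struct Usage {
    pub input_tokens: u32,
    pub output_tokens: u32,
    pub cache_read_tokens: u32,
}

/// Illustrative StreamChunk enum showing where the new variant sits.
#[derive(Debug, PartialEq)]
pub enum StreamChunk {
    Text(String),  // pre-existing text-delta variant (illustrative)
    Usage(Usage),  // new: emitted once, immediately before Done
    Done,
}
```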
Emits Usage chunk from both Anthropic and OpenAI streaming responses:
- Anthropic: extracts usage from `message_delta` event
- OpenAI: extracts usage from final chunk (requires `stream_options.include_usage`)
- Usage is emitted immediately before `Done` chunk
- Includes cache token support via `prompt_tokens_details`
Summary

- Added `StreamChunk::Usage(Usage)` variant to expose token usage data during streaming
- Anthropic: usage extracted from `message_delta` events
- OpenAI: usage extracted from the final chunk (when `stream_options.include_usage` is set)
- Usage is emitted immediately before `Done` for predictable consumer patterns

Motivation
When using `chat_stream_with_tools`, we need a way to get the token usage data (`input_tokens`, `output_tokens`, `cache_read_tokens`, etc.) that's available in the non-streaming `ChatResponse::usage()`. Both Anthropic and OpenAI include usage in their final streaming events, but this data wasn't being exposed.

Changes
- `src/chat/mod.rs`: added `StreamChunk::Usage(Usage)` variant
- `src/backends/anthropic.rs`: added `usage` field to `AnthropicStreamResponse`, created `convert_anthropic_usage()` helper, updated parser to emit `Usage` before `Done`
- `src/providers/openai_compatible.rs`: added `usage` field to `OpenAIToolStreamChunk`, updated parser to emit `Usage` (handles both inline and separate chunk cases)
- `tests/test_backends.rs`: updated integration test to handle the new `Usage` variant

API Behavior
Anthropic:

- `message_delta` events include a cumulative `usage` field
- The parser emits `StreamChunk::Usage` immediately before `StreamChunk::Done`

OpenAI:

- The final chunk carries `usage` when `stream_options.include_usage: true` (already configured)
- Usage may arrive inline with the `finish_reason` chunk, or in a separate empty-choices chunk

Usage Example
Test Plan