feat: expose service_tier in CompletionUsage from OpenAI Responses API #5341
Merged
davidzhao merged 1 commit into livekit:main on Apr 5, 2026
Conversation
OpenAI returns `service_tier` (e.g. `"default"`, `"priority"`, `"flex"`) in every API response, indicating the processing tier that was actually used. This is important for accurate cost tracking, since the priority tier has different billing rates.

Changes:

- Add a `service_tier` field to `CompletionUsage` (optional, defaults to `None`)
- Read `event.response.service_tier` in the OpenAI Responses plugin's `_handle_response_completed` and pass it through to `CompletionUsage`

This allows downstream consumers (session reports, webhooks, billing) to know which service tier was used for each LLM call.
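A minimal sketch of the shape of the change, with the dataclass fields abbreviated and `usage_from_response` as a hypothetical helper (the PR does this inline in `_handle_response_completed`):

```python
from dataclasses import dataclass


@dataclass
class CompletionUsage:
    # Abbreviated: the real class in livekit-agents carries more fields.
    completion_tokens: int = 0
    prompt_tokens: int = 0
    total_tokens: int = 0
    # New in this PR: the tier OpenAI actually used ("default",
    # "priority", "flex"), or None when the provider doesn't report one.
    service_tier: str | None = None


def usage_from_response(response) -> CompletionUsage:
    # Hypothetical helper: maps a Responses API `response` object to
    # CompletionUsage. The Responses API reports input/output token counts.
    u = response.usage
    return CompletionUsage(
        completion_tokens=u.output_tokens,
        prompt_tokens=u.input_tokens,
        total_tokens=u.total_tokens,
        # getattr guards against SDK objects that lack the attribute
        service_tier=getattr(response, "service_tier", None),
    )
```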
osimhi213 added a commit to de-id/livekit-agents that referenced this pull request on Apr 5, 2026, merging upstream/main:

- fix: add PARTICIPANT_KIND_CONNECTOR to default participant kinds (livekit#5339)
- feat: expose service_tier in CompletionUsage from OpenAI Responses API (livekit#5341)
- feat: answering machine detection (livekit#4906)
- fix: wait_for_participant waits until participant is fully active (livekit#5271)
- (gemini realtime): add warnings in update_chat_ctx and update_instructions (livekit#5332)
- fix: convert oneOf to anyOf in strict schema for discriminated unions (livekit#5324)
- fix(voice): make function call history preservation configurable in AgentTask (livekit#5288)
russellmartin-livekit pushed a commit that referenced this pull request on Apr 13, 2026.
Summary
OpenAI returns `service_tier` (e.g. `"default"`, `"priority"`, `"flex"`) in every API response, indicating the processing tier that was actually used to serve the request. This is important for accurate cost tracking, since the priority tier has different billing rates.

Currently, both the Responses API plugin and the Chat Completions inference layer parse usage data but ignore `service_tier`. This PR adds it to `CompletionUsage` so downstream consumers can access it.

Changes
- `livekit-agents/livekit/agents/llm/llm.py`: add a `service_tier: str | None = None` field to `CompletionUsage`
- `livekit-plugins/livekit-plugins-openai/.../responses/llm.py`: read `event.response.service_tier` in `_handle_response_completed` and pass it to `CompletionUsage`
- `livekit-agents/livekit/agents/inference/llm.py`: read `chunk.service_tier` in the Chat Completions stream and pass it to `CompletionUsage` (see the sketch below)
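The Chat Completions stream path is analogous. A hedged sketch, reusing the `CompletionUsage` sketch above (`usage_from_chunk` is an illustrative helper, not the plugin's actual code):

```python
def usage_from_chunk(chunk) -> CompletionUsage | None:
    # In the OpenAI streaming API, `usage` is only populated on the final
    # chunk (when stream_options={"include_usage": True} is set), while
    # `service_tier` rides along on the chunks themselves.
    if chunk.usage is None:
        return None
    return CompletionUsage(
        completion_tokens=chunk.usage.completion_tokens,
        prompt_tokens=chunk.usage.prompt_tokens,
        total_tokens=chunk.usage.total_tokens,
        service_tier=getattr(chunk, "service_tier", None),
    )
```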
Why

- `service_tier` is configured at the project level, but some requests may be downgraded to `"default"` under ramp rate limits
- The field is already present on the `Response` object and `ChatCompletionChunk` — just not being read
Design Note

`service_tier` is semantically response metadata rather than token usage. Placing it on `CompletionUsage` is a pragmatic choice — `CompletionUsage` is the object that flows through the metrics/usage collection pipeline (`ModelUsageCollector`, session reports, etc.), so it propagates automatically with zero changes to the pipeline.

If the maintainers prefer cleaner separation, this could be moved to `ChatChunk` directly — happy to refactor if that's the preferred approach.
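As an example of what that propagation enables, a downstream consumer could bucket token counts by the tier that actually served each call. Everything below is hypothetical: the hook name and the per-tier rates are made-up placeholders.

```python
from collections import defaultdict

# Made-up per-million-output-token rates, purely for illustration.
RATES = {"default": 8.0, "priority": 14.0, "flex": 4.0}

tokens_by_tier: dict[str, int] = defaultdict(int)


def on_usage(usage: CompletionUsage) -> None:
    # Bucket completion tokens by the tier that actually served the call;
    # providers that don't report a tier fall back to "default".
    tokens_by_tier[usage.service_tier or "default"] += usage.completion_tokens


def estimated_output_cost() -> float:
    return sum(RATES[t] * n / 1_000_000 for t, n in tokens_by_tier.items())
```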
Backward Compatible

- `service_tier` defaults to `None` — no impact on existing code

Related