
feat: add service_tier parameter to Responses API LLM #5342

Closed
piyush-gambhir wants to merge 1 commit into livekit:main from piyush-gambhir:feat/responses-service-tier-param

Conversation

@piyush-gambhir
Contributor

Summary

The Chat Completions LLM (openai.LLM) already supports the service_tier parameter for configuring priority/flex/default processing per request. The Responses API LLM (openai.responses.LLM) is missing this parameter, even though the underlying OpenAI Responses API supports it.

This PR adds service_tier to the Responses LLM for parity.

Changes

livekit-plugins/livekit-plugins-openai/.../responses/llm.py (1 file, 6 lines):

  • Add service_tier: NotGivenOr[str] to _LLMOptions
  • Add service_tier parameter to LLM.__init__()
  • Pass service_tier through in chat() via extra kwargs (see the sketch below)
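
A minimal sketch of the shape of the change (illustrative only: the sentinel and signatures here are simplified stand-ins; the real plugin uses livekit.agents' NotGivenOr/NOT_GIVEN helpers):

from dataclasses import dataclass
from typing import Any

_NOT_GIVEN = object()  # stand-in for livekit.agents' NOT_GIVEN sentinel

@dataclass
class _LLMOptions:
    model: str
    service_tier: Any = _NOT_GIVEN  # e.g. "priority", "flex", "default"

class LLM:
    def __init__(self, *, model: str, service_tier: Any = _NOT_GIVEN) -> None:
        self._opts = _LLMOptions(model=model, service_tier=service_tier)

    def chat(self, **kwargs: Any) -> None:
        extra: dict[str, Any] = {}
        if self._opts.service_tier is not _NOT_GIVEN:
            # forwarded verbatim into the Responses API request body
            extra["service_tier"] = self._opts.service_tier
        # ... extra is merged into the client's responses.create(...) call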

Usage

from livekit.plugins.openai import responses

llm = responses.LLM(
    model="gpt-5.4",
    service_tier="priority",  # now supported
)
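
Per OpenAI's API reference, documented service_tier values include "auto", "default", "flex", and "priority"; the parameter is typed as a plain str here and forwarded to the API unchanged.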

Backward Compatible

  • Defaults to NOT_GIVEN, so existing code is unaffected (see the sketch below)
  • Matches the existing pattern used by the Chat Completions LLM
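
For illustration, a call that omits the parameter behaves exactly as before (a sketch assuming only the NOT_GIVEN default described above):

from livekit.plugins.openai import responses

llm = responses.LLM(model="gpt-5.4")  # service_tier omitted
# _opts.service_tier stays NOT_GIVEN, so no service_tier key is added to
# the Responses API request body; existing behavior is unchanged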

@piyush-gambhir force-pushed the feat/responses-service-tier-param branch 2 times, most recently from 8f6a34a to b61e517 on April 4, 2026 at 23:11
The Chat Completions LLM (openai.LLM) already supports the service_tier
parameter for configuring priority/flex/default processing. This adds
the same parameter to the Responses API LLM (openai.responses.LLM) for
parity.

OpenAI's Responses API accepts service_tier in the request body:
https://platform.openai.com/docs/api-reference/responses/create

Changes (responses/llm.py only):
- Add service_tier to _LLMOptions dataclass
- Add service_tier parameter to LLM.__init__()
- Pass service_tier through in chat() via extra kwargs
@piyush-gambhir force-pushed the feat/responses-service-tier-param branch from b61e517 to ce469bc on April 4, 2026 at 23:40
@piyush-gambhir
Contributor Author

The CI failure is from test_blockguard.py::TestStress::test_many_short_blocks — a pre-existing flaky test in livekit-blockguard that's unrelated to this PR.

The test runs 20 × time.sleep(0.02) with a 500ms threshold and expects no blocking detection. On the CI runner (macOS), cumulative scheduling jitter causes the event loop heartbeat to exceed 500ms between watchdog polls, triggering a false positive. This happens because the test's total blocking time (20 × 20ms = 400ms) is close to the threshold, and any GC pause or scheduling delay pushes it over.

This PR only modifies livekit-plugins-openai/responses/llm.py — no relation to blockguard.
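
For context, a hypothetical reconstruction of the failing pattern (not the actual livekit-blockguard test; the constants come from the description above):

import time

BLOCK_THRESHOLD_S = 0.5  # watchdog flags the loop after 500 ms without a heartbeat

def test_many_short_blocks() -> None:
    # 20 * 20 ms = 400 ms of cumulative sleeping, leaving only ~100 ms of
    # headroom under the 500 ms threshold; GC pauses or scheduler jitter on
    # the macOS CI runner can consume that margin and trip a false positive
    for _ in range(20):
        time.sleep(0.02)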

@piyush-gambhir
Contributor Author

Closing to re-trigger CI (flaky blockguard test failure unrelated to this change).
