TL;DR
OpenAI Realtime: response.error mid-flight causes future to hang indefinitely (TODO at realtime_model.py:2005)
Summary
When the OpenAI Realtime substrate sends a response.error event mid-flight (during a generation), the future associated with that response is never resolved — it hangs indefinitely. Callers awaiting the future block forever.
The source code at livekit-plugins/livekit-plugins-openai/livekit/plugins/openai/realtime/realtime_model.py:2005 acknowledges this with a TODO comment.
Reproduction
import asyncio

from livekit.plugins.openai.realtime import RealtimeModel


async def main() -> None:
    session = RealtimeModel(model="gpt-4o-realtime-preview-2025-06-03").session()
    fut = session.generate_reply(instructions="...")
    # If OpenAI sends response.error mid-stream (e.g., due to a substrate
    # rate limit, content policy violation, or similar), `fut` never resolves.
    try:
        result = await asyncio.wait_for(fut, timeout=30.0)
    except asyncio.TimeoutError:
        # A timeout is the only way to detect the hang from the caller side.
        print("future hung; substrate sent response.error and we didn't handle it")


asyncio.run(main())
Source code reference
Around line 2005 in realtime_model.py, in _handle_response_error (or equivalent):
def _handle_response_error(self, event: ResponseErrorEvent) -> None:
    # TODO: handle response.error mid-flight
    # Currently the response future hangs indefinitely.
    ...
(Exact TODO text and surrounding code may differ in current main; the line number reference is from the version this issue was authored against.)
Proposed fix
When response.error arrives mid-flight, the handler should:
- Identify the affected response by event.response_id
- Pop the future from _response_created_futures (or equivalent state)
- Resolve the future with RealtimeError(event.error.message) or similar
- Clean up any associated state (e.g., the in-flight _ResponseGeneration entry)
def _handle_response_error(self, event: ResponseErrorEvent) -> None:
    """Handle response.error: reject the associated future and clean up."""
    response_id = event.response_id  # may need to derive from event shape
    error_msg = event.error.message if event.error else "unknown error"
    # Reject any pending response_created future (matches by client_event_id
    # in metadata, if available; otherwise iterate)
    if event.metadata and (event_id := event.metadata.get("client_event_id")):
        if (fut := self._response_created_futures.pop(event_id, None)) is not None:
            if not fut.done():
                fut.set_exception(llm.RealtimeError(error_msg))
    # Clean up the in-flight generation state
    if self._current_generation is not None:
        # ... existing cleanup logic ...
        self._close_current_generation(reason=f"response.error: {error_msg}")
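The comment in the handler above mentions an "otherwise iterate" fallback that the code does not spell out. A minimal sketch of one way it could look, assuming it is acceptable to reject every pending future when the event carries no client_event_id. `reject_pending` and the local `RealtimeError` class are hypothetical stand-ins so the snippet runs without livekit; they are not part of the plugin.

```python
import asyncio
from typing import Dict, Optional


class RealtimeError(Exception):
    """Stand-in for llm.RealtimeError so the sketch runs without livekit."""


def reject_pending(
    futures: Dict[str, asyncio.Future], event_id: Optional[str], message: str
) -> int:
    """Reject the matching future, or every pending one when no id is known.

    Returns the number of futures that were rejected.
    """
    if event_id is not None:
        targets = [futures.pop(event_id, None)]
    else:
        # Assumption: without an id we cannot tell which response failed,
        # so every pending future is rejected rather than left to hang.
        targets = [futures.pop(key) for key in list(futures)]
    rejected = 0
    for fut in targets:
        if fut is not None and not fut.done():
            fut.set_exception(RealtimeError(message))
            rejected += 1
    return rejected


async def demo():
    loop = asyncio.get_running_loop()
    futs = [loop.create_future(), loop.create_future()]
    pending = {"evt_1": futs[0], "evt_2": futs[1]}
    by_id = reject_pending(pending, "evt_1", "rate limit")
    fallback = reject_pending(pending, None, "rate limit")
    # Both futures now carry a RealtimeError and the map is empty.
    all_rejected = all(isinstance(f.exception(), RealtimeError) for f in futs)
    return by_id, fallback, len(pending), all_rejected


counts = asyncio.run(demo())
```

Rejecting everything is deliberately conservative: a caller may see an error for a response that would have succeeded, but no caller is left blocked.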
Impact
Production agents experiencing this bug see:
- Hung futures consuming async task slots
- Apparent "frozen" agent state when the substrate errors out
- Difficult-to-diagnose timeouts upstream of the call site
- Need for upstream asyncio.wait_for(...) wrappers (defensive code that shouldn't be necessary)
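For reference, the kind of defensive wrapper callers currently end up writing might look like the sketch below. `await_reply` and `ReplyTimeout` are hypothetical names introduced for illustration, and the bare future in `demo` stands in for the one returned by generate_reply.

```python
import asyncio
from typing import Awaitable, TypeVar

T = TypeVar("T")


class ReplyTimeout(Exception):
    """Hypothetical error raised when a reply future does not settle in time."""


async def await_reply(fut: Awaitable[T], timeout: float = 30.0) -> T:
    """Await a reply future, converting a hang into an explicit error."""
    try:
        return await asyncio.wait_for(fut, timeout=timeout)
    except asyncio.TimeoutError as exc:
        raise ReplyTimeout(f"reply future did not settle within {timeout}s") from exc


async def demo() -> str:
    # A future nobody ever resolves, mimicking the hang described above.
    hung = asyncio.get_running_loop().create_future()
    try:
        await await_reply(hung, timeout=0.05)
    except ReplyTimeout:
        return "timed out"
    return "resolved"


outcome = asyncio.run(demo())
```

Note that asyncio.wait_for cancels the awaited future when the timeout fires, so the wrapper only bounds the wait; the underlying error from the substrate is still lost to the caller.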
Acceptance criteria
- _handle_response_error resolves the affected future with RealtimeError instead of leaving it hanging.
- A test verifies the failure mode: simulate response.error mid-flight, assert the future resolves with RealtimeError.
- The TODO comment at realtime_model.py:2005 is removed.
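The test criterion could be sketched roughly as follows. This is an illustration of the test's shape only: `FakeSession` is a hypothetical stand-in that models just the future bookkeeping, and the local `RealtimeError` replaces llm.RealtimeError so the snippet runs without livekit. A real test would drive the actual session with a simulated response.error event.

```python
import asyncio
from typing import Dict


class RealtimeError(Exception):
    """Stand-in for llm.RealtimeError."""


class FakeSession:
    """Hypothetical stand-in that tracks pending futures by client event id."""

    def __init__(self) -> None:
        self._response_created_futures: Dict[str, asyncio.Future] = {}

    def generate_reply(self, event_id: str) -> asyncio.Future:
        fut = asyncio.get_running_loop().create_future()
        self._response_created_futures[event_id] = fut
        return fut

    def handle_response_error(self, event_id: str, message: str) -> None:
        # Mirrors the proposed fix: pop the future and reject it.
        fut = self._response_created_futures.pop(event_id, None)
        if fut is not None and not fut.done():
            fut.set_exception(RealtimeError(message))


async def check_error_rejects_future() -> str:
    session = FakeSession()
    fut = session.generate_reply("evt_1")
    # Simulate response.error arriving while the reply is still in flight.
    session.handle_response_error("evt_1", "rate limit")
    try:
        await asyncio.wait_for(fut, timeout=1.0)
    except RealtimeError as exc:
        return str(exc)
    return "no error"


result = asyncio.run(check_error_rejects_future())
```

The short timeout keeps the test from hanging if the handler regresses: an unhandled asyncio.TimeoutError would fail the test instead of blocking the suite.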
Related
- Source code TODO at realtime_model.py:2005
- Adjacent race conditions documented in source at realtime_model.py:1870 (response.done without prior response.created)