OpenAI Realtime: response.error mid-flight causes future to hang indefinitely (TODO at realtime_model.py:2005)

## TL;DR

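When OpenAI sends `response.error` while a response is in flight, the future tied to that response is never resolved or rejected, so callers awaiting it block forever. The source acknowledges the gap with a TODO at `realtime_model.py:2005`.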

## Summary

When the OpenAI Realtime substrate sends a `response.error` event mid-flight (during a generation), the future associated with that response is never resolved or rejected, so it stays pending indefinitely. Callers awaiting the future block forever.

The source code at [`livekit-plugins/livekit-plugins-openai/livekit/plugins/openai/realtime/realtime_model.py:2005`](https://github.com/livekit/agents/blob/main/livekit-plugins/livekit-plugins-openai/livekit/plugins/openai/realtime/realtime_model.py) acknowledges this with a TODO comment.

## Reproduction

```python
import asyncio

from livekit.plugins.openai.realtime import RealtimeModel


async def main() -> None:
    session = RealtimeModel(model="gpt-4o-realtime-preview-2025-06-03").session()
    fut = session.generate_reply(instructions="...")
    # If OpenAI sends response.error mid-stream (e.g., due to a substrate
    # rate limit, content policy violation, or similar), the future
    # `fut` will never resolve.
    try:
        result = await asyncio.wait_for(fut, timeout=30.0)
    except asyncio.TimeoutError:
        # This is the only way to detect the hang from the caller side
        print("future hung; substrate sent response.error and we didn't handle it")


asyncio.run(main())
```
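The hang can also be illustrated without network access or credentials. The following is a minimal sketch, not the plugin's code: `ToySession`, its `_pending` dict, and `on_response_error` are hypothetical stand-ins that only mimic the reported behavior (the error event is observed, but the pending future is left untouched).

```python
import asyncio


class ToySession:
    """Hypothetical stand-in for the realtime session; not the plugin's real class."""

    def __init__(self) -> None:
        self._pending: dict[str, asyncio.Future] = {}

    def generate_reply(self, event_id: str) -> asyncio.Future:
        # Register a future that is expected to complete when the response arrives.
        fut = asyncio.get_running_loop().create_future()
        self._pending[event_id] = fut
        return fut

    def on_response_error(self, event_id: str, message: str) -> None:
        # Mirrors the reported bug: the error is seen, but the future stays pending.
        pass


async def reproduce() -> str:
    session = ToySession()
    fut = session.generate_reply("evt_1")
    session.on_response_error("evt_1", "rate limit")
    try:
        await asyncio.wait_for(fut, timeout=0.1)
    except asyncio.TimeoutError:
        return "timed out"
    return "resolved"


if __name__ == "__main__":
    print(asyncio.run(reproduce()))
```

Because nothing ever completes the future, only the timeout ends the wait.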

## Source code reference

Around line 2005 in `realtime_model.py`, in `_handle_response_error` (or equivalent):

```python
def _handle_response_error(self, event: ResponseErrorEvent) -> None:
    # TODO: handle response.error mid-flight
    # Currently the response future hangs indefinitely.
    ...
```

(Exact TODO text and surrounding code may differ in current main; the line number reference is from the version this issue was authored against.)

## Proposed fix

When `response.error` arrives mid-flight, the handler should:

1. Identify the affected response by `event.response_id`
2. Pop the future from `_response_created_futures` (or equivalent state)
3. Reject the future with `RealtimeError(event.error.message)` or similar, so awaiting callers see an exception instead of blocking
4. Clean up any associated state (e.g., the in-flight `_ResponseGeneration` entry)

```python
def _handle_response_error(self, event: ResponseErrorEvent) -> None:
    """Handle response.error: reject the associated future and clean up."""
    response_id = event.response_id  # may need to derive from event shape
    error_msg = event.error.message if event.error else "unknown error"

    # Reject any pending response_created future (matches by client_event_id
    # in metadata, if available; otherwise iterate)
    if event.metadata and (event_id := event.metadata.get("client_event_id")):
        if (fut := self._response_created_futures.pop(event_id, None)) is not None:
            if not fut.done():
                fut.set_exception(llm.RealtimeError(error_msg))

    # Clean up the in-flight generation state
    if self._current_generation is not None:
        # ... existing cleanup logic ...
        self._close_current_generation(reason=f"response.error: {error_msg}")
```
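The core of the pattern (pop the pending future, then reject it) can be exercised in isolation. This is a sketch under stated assumptions: `ToySession` and `handle_response_error` are hypothetical, and the local `RealtimeError` class stands in for `llm.RealtimeError` so the example runs without LiveKit installed.

```python
import asyncio


class RealtimeError(Exception):
    """Local stand-in for llm.RealtimeError (assumption, for a runnable sketch)."""


class ToySession:
    """Hypothetical session that tracks pending futures by client event id."""

    def __init__(self) -> None:
        self._response_created_futures: dict[str, asyncio.Future] = {}

    def generate_reply(self, event_id: str) -> asyncio.Future:
        fut = asyncio.get_running_loop().create_future()
        self._response_created_futures[event_id] = fut
        return fut

    def handle_response_error(self, event_id: str, message: str) -> None:
        # Pop first so the entry is cleaned up even if the future is already done.
        fut = self._response_created_futures.pop(event_id, None)
        if fut is not None and not fut.done():
            fut.set_exception(RealtimeError(message))


async def demo() -> str:
    session = ToySession()
    fut = session.generate_reply("evt_1")
    session.handle_response_error("evt_1", "rate limit exceeded")
    try:
        await asyncio.wait_for(fut, timeout=1.0)
    except RealtimeError as exc:
        return f"rejected: {exc}"
    return "resolved"


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

The `fut.done()` guard matters because calling `set_exception` on a future that is already completed or cancelled raises `asyncio.InvalidStateError`.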

## Impact

Production agents experiencing this bug see:
- Hung futures consuming async task slots
- Apparent "frozen" agent state when the substrate errors out
- Difficult-to-diagnose timeouts upstream of the call site
- Need for upstream `asyncio.wait_for(...)` wrappers (defensive code that shouldn't be necessary)
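For reference, the kind of defensive wrapper the last bullet refers to might look like the sketch below. `await_reply` and `ReplyTimeout` are hypothetical names chosen for illustration; only `asyncio.wait_for` is a real library call.

```python
import asyncio
from typing import Awaitable, TypeVar

T = TypeVar("T")


class ReplyTimeout(Exception):
    """Hypothetical error raised when a reply future does not complete in time."""


async def await_reply(awaitable: Awaitable[T], timeout: float = 30.0) -> T:
    # Bound the wait so a future that is never completed cannot block the caller forever.
    try:
        return await asyncio.wait_for(awaitable, timeout=timeout)
    except asyncio.TimeoutError as exc:
        raise ReplyTimeout(f"no reply within {timeout}s") from exc
```

A wrapper like this turns the hang into a timeout, but it cannot tell a slow response from a dropped error, which is why the fix belongs in the handler.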

## Acceptance criteria

- `_handle_response_error` rejects the affected future with `RealtimeError` instead of leaving it pending.
- A test verifies the failure mode: simulate `response.error` mid-flight, assert that awaiting the future raises `RealtimeError`.
- The TODO comment at `realtime_model.py:2005` is removed.

## Related

- Source code TODO at `realtime_model.py:2005`
- Adjacent race conditions documented in source at `realtime_model.py:1870` (response.done without prior response.created)

