Conversation

@chenghao-mou
Member

@chenghao-mou chenghao-mou commented Jan 30, 2026

It turns out that using STT with a realtime model triggers a response when a turn is committed. I didn't notice the interruption while testing, so I wrongly assumed it was working fine.

Example to reproduce:

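    # Runs inside the agent entrypoint; `session`, `logger`, and `tasks` (a set that keeps
    # references to background tasks) are assumed to be defined in the enclosing scope.
    @session.on("agent_state_changed")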
    def _on_agent_state_changed(ev: AgentStateChangedEvent):
        if ev.new_state == "speaking":
            logger.info("speaking")
            session.input.set_audio_enabled(False)
        elif ev.new_state == "listening":
            logger.info("listening")
            session.input.set_audio_enabled(True)
            session.clear_user_turn()

            async def commit_user_turn():
                nonlocal session
                logger.info("waiting 20 seconds to commit user turn")
                await asyncio.sleep(20)
                logger.info("committing user turn")
                session.commit_user_turn()
                logger.info("user turn committed")
                session.input.set_audio_enabled(False)

            task = asyncio.create_task(commit_user_turn())
            tasks.add(task)
            task.add_done_callback(tasks.discard)
        else:
            logger.info(f"agent state changed to {ev.new_state}")

The model will respond after 20 seconds. With this change:

  • the STT transcripts will be committed into the chat context without triggering a duplicate response
  • the model will respond only once when a turn is committed manually
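
The difference between the two behaviors can be illustrated with a toy model. This is a minimal sketch, not the livekit-agents implementation: `ToyRealtimeSession`, `ToyActivity`, and the `fixed` flag are hypothetical names used only to count how many replies one manual commit produces.

```python
from dataclasses import dataclass, field


@dataclass
class ToyRealtimeSession:
    """Hypothetical stand-in for a realtime model session; it only counts replies."""

    responses: int = 0

    def generate_reply(self) -> None:
        self.responses += 1


@dataclass
class ToyActivity:
    """Hypothetical stand-in for the agent logic that commits a user turn."""

    rt: ToyRealtimeSession
    chat_ctx: list[str] = field(default_factory=list)

    def commit_user_turn(self, transcript: str, *, fixed: bool) -> None:
        # The STT transcript always lands in the local chat context.
        self.chat_ctx.append(transcript)
        if not fixed:
            # Assumed buggy path: the transcript is also sent to the model,
            # which produces an extra reply.
            self.rt.generate_reply()
        # The manual commit itself produces exactly one reply.
        self.rt.generate_reply()


buggy = ToyActivity(ToyRealtimeSession())
buggy.commit_user_turn("hello", fixed=False)

patched = ToyActivity(ToyRealtimeSession())
patched.commit_user_turn("hello", fixed=True)

print(buggy.rt.responses, patched.rt.responses)
```

In both cases the transcript ends up in `chat_ctx`; only the number of replies differs.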

cc @bml1g12

Summary by CodeRabbit

  • Breaking Changes
    • Removed the commit_user_turn method from the Realtime Session interface. This method was previously unused and non-functional across implementations and has been eliminated to simplify the API surface and reduce unnecessary complexity. Applications that reference this method require updating for compatibility.

@chenghao-mou chenghao-mou requested a review from a team January 30, 2026 17:38
@coderabbitai
Contributor

coderabbitai bot commented Jan 30, 2026

Important

Review skipped

Draft detected.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Walkthrough

This PR removes the commit_user_turn() abstract method from the RealtimeSession interface and eliminates all corresponding implementations and call sites across the livekit-agents and livekit-plugins repositories, including AWS, Google, OpenAI, and Ultravox integrations.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| RealtimeSession Interface: livekit-agents/livekit/agents/llm/realtime.py | Removed abstract method commit_user_turn(self) -> None from the base RealtimeSession interface. |
| Realtime Agent Control: livekit-agents/livekit/agents/voice/agent_activity.py | Removed invocation of self._rt_session.commit_user_turn() in the commit_user_turn method; now only delegates to AudioRecognition. |
| AWS Plugin Implementation: livekit-plugins/livekit-plugins-aws/livekit/plugins/aws/experimental/realtime/realtime_model.py | Removed commit_user_turn() method that logged a not-supported warning for Nova Sonic's Realtime API. |
| Google Plugin Implementation: livekit-plugins/livekit-plugins-google/livekit/plugins/google/realtime/realtime_api.py | Removed commit_user_turn() method that logged a not-supported warning for Gemini Realtime API. |
| OpenAI Plugin Implementations: livekit-plugins/livekit-plugins-openai/livekit/plugins/openai/realtime/realtime_model.py, livekit-plugins/livekit-plugins-openai/livekit/plugins/openai/realtime/realtime_model_beta.py | Removed commit_user_turn() methods that checked auto-response settings, logged warnings, called commit_audio(), and emitted ResponseCreateEvent with empty parameters. |
| Ultravox Plugin Implementation: livekit-plugins/livekit-plugins-ultravox/livekit/plugins/ultravox/realtime/realtime_model.py | Removed commit_user_turn() method that logged a not-supported warning. |

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~22 minutes

Possibly related PRs

Suggested reviewers

  • bml1g12
  • longcw

Poem

🐰 A method exits stage left with grace,
No more turns to commit to this place,
From interfaces clean to plugins refined,
Simpler flows left behind,
One less hop in the agent's embrace! 🚀

🚥 Pre-merge checks | ✅ 2 | ❌ 1
❌ Failed checks (1 warning)
| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Title check | ⚠️ Warning | The PR title mentions 'commit user turn with STT and realtime' but the actual changes remove commit_user_turn methods across all realtime implementations and call sites, not add or fix functionality with STT. The title is misleading about the nature of the changes. | Consider updating the title to accurately reflect the changes, such as 'refactor: remove commit_user_turn from realtime session implementations' or 'chore: remove unused commit_user_turn method from realtime models'. |
✅ Passed checks (2 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |

@chenghao-mou chenghao-mou changed the title remove commit user turn for realtime models Revert: remove commit user turn for realtime models Jan 30, 2026
Member

@davidzhao davidzhao left a comment

is this true when manual turn mode is used?

I see that if it's not server vad, we have to manually create them: https://platform.openai.com/docs/api-reference/realtime-client-events/input_audio_buffer/commit
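
For context, a minimal sketch of the manual flow that the linked reference describes, assuming server VAD is disabled: the client commits the input audio buffer itself and then explicitly asks for a response. Only the two event types (`input_audio_buffer.commit` and `response.create`) come from the OpenAI Realtime API; the helper `manual_commit_events` is a hypothetical name, and sending the payloads over the connection is left out.

```python
import json


def manual_commit_events(create_response: bool = True) -> list[str]:
    """Build the JSON client events for a manually committed user turn."""
    # Commit the buffered audio as a user message.
    events = [{"type": "input_audio_buffer.commit"}]
    if create_response:
        # Without server VAD, a response has to be requested explicitly.
        events.append({"type": "response.create"})
    return [json.dumps(event) for event in events]


for payload in manual_commit_events():
    print(payload)
```

Passing `create_response=False` yields only the commit event, which is the shape wanted when the turn should be recorded without prompting a reply.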

@chenghao-mou
Member Author

> is this true when manual turn mode is used?
>
> I see that if it's not server VAD, we have to manually create them: platform.openai.com/docs/api-reference/realtime-client-events/input_audio_buffer/commit

I think I found the issue: when using with STT, it actually committed the transcripts to the model and therefore triggered a response. Without STT, it will not respond as expected.

@chenghao-mou chenghao-mou changed the title Revert: remove commit user turn for realtime models fix: commit user turn with STT and realtime Jan 30, 2026
@davidzhao
Member

> > is this true when manual turn mode is used?
> > I see that if it's not server VAD, we have to manually create them: platform.openai.com/docs/api-reference/realtime-client-events/input_audio_buffer/commit
>
> I think I found the issue: when using with STT, it actually committed the transcripts to the model and therefore triggered a response. Without STT, it will not respond as expected.

got it, so the bug is: when STT is used with manual turn detection mode and realtime model, it should not trigger a response.

@chenghao-mou
Member Author

> > > is this true when manual turn mode is used?
> > > I see that if it's not server VAD, we have to manually create them: platform.openai.com/docs/api-reference/realtime-client-events/input_audio_buffer/commit
> >
> > I think I found the issue: when using with STT, it actually committed the transcripts to the model and therefore triggered a response. Without STT, it will not respond as expected.
>
> got it, so the bug is: when STT is used with manual turn detection mode and realtime model, it should not trigger a response.

Yep. But it is also kinda ambiguous what users want if they have an STT configured: do they want it only for the transcripts, or as actual model text input?

@chenghao-mou chenghao-mou marked this pull request as draft January 30, 2026 19:13
@bml1g12
Contributor

bml1g12 commented Jan 30, 2026

In our use case we want STT with manual turn taking for two reasons: as a way to know when it's safe to commit audio before a user turn has ended (a user turn can be very long, so we cannot wait until the end of the turn to commit), and for its improved post-call transcripts compared with OpenAI's server-side transcription.

So I think ideally we'd have STT transcripts available in the local chat context (so if there is a connection error we can restore the conversation on a fresh session), but the remote OpenAI realtime chat context need not even be aware of them, as it can function perfectly fine without a transcript of user utterances. STT also provides a neat signal to the user that the system is working during a long user turn (by allowing them to view their own transcription as they talk).
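
The local-only history described above could look roughly like the following. This is a hedged sketch with hypothetical names (`LocalChatContext`, `restore_messages`), not the livekit-agents chat context API: transcripts are stored locally as they arrive, and the stored history is turned into seed messages only when a fresh session has to be started.

```python
from dataclasses import dataclass, field


@dataclass
class LocalChatContext:
    """Hypothetical local history, kept independently of the remote realtime session."""

    items: list[tuple[str, str]] = field(default_factory=list)

    def add(self, role: str, text: str) -> None:
        # Record a finished utterance (STT transcript or model reply) locally.
        self.items.append((role, text))


def restore_messages(ctx: LocalChatContext) -> list[dict[str, str]]:
    """Convert the local history into seed messages for a fresh session."""
    return [{"role": role, "content": text} for role, text in ctx.items]


ctx = LocalChatContext()
ctx.add("user", "transcript from STT")
ctx.add("assistant", "model reply")

# After a connection error, the new session would be seeded from this list.
seed = restore_messages(ctx)
print(seed)
```

The remote session never receives the transcripts during normal operation in this sketch; they are used only for display and for recovery.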

@chenghao-mou chenghao-mou marked this pull request as ready for review February 1, 2026 11:03
@chenghao-mou chenghao-mou requested a review from a team February 1, 2026 11:03

@devin-ai-integration devin-ai-integration bot left a comment

✅ Devin Review: No Issues Found

Devin Review analyzed this PR and found no potential bugs to report.

View in Devin Review to see 5 additional flags.

@chenghao-mou chenghao-mou force-pushed the fix/remove-commit-user-turn branch from 6099024 to a3a206e Compare February 1, 2026 11:07

4 participants