Closed
Labels
bug: Something isn't working
Description
Bug Description
When using min_interruption_words to prevent agent interruptions from short backchannel utterances, these transcripts are buffered and incorrectly included in the next user turn, causing confusion for the LLM.
Problem Details
- Agent is speaking and user says a backchannel (e.g., "uh-huh", "okay")
- If the utterance has fewer words than min_interruption_words, the agent continues speaking
- The transcript gets stored in AudioRecognition buffers (_audio_transcript, _audio_interim_transcript)
- When the agent finishes speaking, these buffered transcripts appear in the next user turn
- The LLM receives stale backchannel transcripts as if they were new user input
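The sequence above can be modeled with a toy buffer. This is a simplified stand-in, not the actual AudioRecognition code: the class `ToyRecognition` and its methods are hypothetical, and only `min_interruption_words` and the `_audio_transcript` name are taken from this report.

```python
class ToyRecognition:
    """Hypothetical stand-in for the buffering described in this report."""

    def __init__(self, min_interruption_words: int) -> None:
        self.min_interruption_words = min_interruption_words
        self._audio_transcript = ""  # accumulated final transcripts
        self.agent_speaking = False

    def on_final_transcript(self, text: str) -> bool:
        """Buffer the transcript; return True if the agent should be interrupted."""
        # The transcript is buffered regardless of the interruption decision.
        self._audio_transcript = f"{self._audio_transcript} {text}".strip()
        if not self.agent_speaking:
            return False
        return len(text.split()) >= self.min_interruption_words

    def commit_user_turn(self) -> str:
        """Hand the buffered text over as the next user turn and reset the buffer."""
        turn, self._audio_transcript = self._audio_transcript, ""
        return turn


rec = ToyRecognition(min_interruption_words=3)

rec.agent_speaking = True
interrupted = rec.on_final_transcript("uh-huh")  # below threshold, agent keeps talking

rec.agent_speaking = False
rec.on_final_transcript("what is the refund policy")

# The stale "uh-huh" ends up in front of the real question.
turn = rec.commit_user_turn()
```

In this model the interruption check and the buffering are independent, which is why the backchannel survives into the next turn.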
Expected Behavior
Backchannel utterances that don't meet the interruption threshold should be discarded and not included in subsequent user turns.
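A minimal sketch of that behavior, assuming a buffer that is told whether the agent was speaking when each transcript arrived. `FilteredTranscriptBuffer` and all of its methods are hypothetical names for illustration and are not part of livekit-agents. The `clear_stale` method follows the time-threshold idea from the Proposed Solution section.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class FilteredTranscriptBuffer:
    """Hypothetical buffer; not part of livekit-agents."""

    min_interruption_words: int
    _entries: list[tuple[float, str]] = field(default_factory=list)

    def add(self, text: str, *, agent_speaking: bool, now: float) -> bool:
        """Store the transcript unless it is a sub-threshold utterance
        spoken over the agent. Returns True if the transcript was kept."""
        if agent_speaking and len(text.split()) < self.min_interruption_words:
            return False  # backchannel: drop it instead of buffering
        self._entries.append((now, text))
        return True

    def clear_stale(self, *, max_age_s: float, now: float) -> int:
        """Drop entries older than max_age_s seconds; return how many were removed."""
        before = len(self._entries)
        self._entries = [(t, s) for t, s in self._entries if now - t <= max_age_s]
        return before - len(self._entries)

    def commit_user_turn(self) -> str:
        """Join the remaining transcripts into the next user turn and reset."""
        turn = " ".join(s for _, s in self._entries)
        self._entries.clear()
        return turn


buf = FilteredTranscriptBuffer(min_interruption_words=3)
kept = buf.add("uh-huh", agent_speaking=True, now=0.0)  # dropped
buf.add("what is the refund policy", agent_speaking=False, now=5.0)
next_turn = buf.commit_user_turn()  # contains only the real question
```

Filtering at insertion time removes the backchannel as soon as the interruption decision is made. Age-based clearing is a coarser fallback for transcripts that were buffered before that decision was known.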
Reproduction Steps
1.
2.
3.
...
- Sample code snippet, or a GitHub Gist link -
Operating System
Linux
Models Used
No response
Package Versions
livekit-agents==1.3.9
Session/Room/Call IDs
No response
Proposed Solution
Implement a method in the AudioRecognition class that clears stale transcripts older than a threshold time.
Additional Context
No response
Screenshots and Recordings
No response