Closed
Labels
bug: Something isn't working
Description
Bug Description
I have the following STT and turn detection setup:
stt=deepgram.STTv2(),
turn_detection="stt",
This worked fine in livekit-agents 1.2.8. After upgrading to 1.3, it no longer works reliably: it often produces phantom VAD activity that interrupts the agent, even though the user said nothing at all.
Expected Behavior
No phantom VAD activity
Reproduction Steps
- Run the following agent code in console mode:
import asyncio

from dotenv import load_dotenv
from livekit import agents, rtc
from livekit.agents import AgentServer, AgentSession, Agent, room_io
from livekit.plugins import deepgram
from livekit.plugins import noise_cancellation, silero

load_dotenv()


class Assistant(Agent):
    def __init__(self) -> None:
        super().__init__(
            instructions="""You are a helpful voice AI assistant.
            You eagerly assist users with their questions by providing information from your extensive knowledge.
            Your responses are concise, to the point, and without any complex formatting or punctuation including emojis, asterisks, or other symbols.
            You are curious, friendly, and have a sense of humor.""",
        )

    async def llm_node(self, chat_ctx, tools, model_settings):
        try:
            async for chunk in Agent.default.llm_node(
                self, chat_ctx, tools, model_settings
            ):
                yield chunk
        except asyncio.CancelledError:
            print("CancelledError in llm_node")
            raise


server = AgentServer()


@server.rtc_session()
async def my_agent(ctx: agents.JobContext):
    session = AgentSession(
        stt=deepgram.STTv2(),
        llm="openai/gpt-4.1-mini",
        tts="cartesia/sonic-3:9626c31c-bec5-4cca-baa8-f8ba9e84c8bc",
        vad=silero.VAD.load(min_silence_duration=0.3),
        turn_detection="stt",
        allow_interruptions=True,
        min_endpointing_delay=0.1,
        max_endpointing_delay=1.5,
        min_interruption_duration=0.3,
        user_away_timeout=4,
    )

    @session.on("user_state_changed")
    def _on_user_state_changed(ev) -> None:
        print(f"{'@' * 50} user_state_changed: {ev.new_state} {'@' * 50}")

    @session.on("agent_state_changed")
    def _on_agent_state_changed(ev) -> None:
        print(f"{'*' * 50} agent_state_changed: {ev.new_state} {'*' * 50}")

    await session.start(
        room=ctx.room,
        agent=Assistant(),
        room_options=room_io.RoomOptions(
            audio_input=room_io.AudioInputOptions(
                noise_cancellation=lambda params: noise_cancellation.BVCTelephony()
                if params.participant.kind == rtc.ParticipantKind.PARTICIPANT_KIND_SIP
                else noise_cancellation.BVC(),
            ),
        ),
    )


if __name__ == "__main__":
    agents.cli.run_app(server)
- When the agent session starts, say a simple "Hello".
- The bug should occur.
Note that these steps don't reproduce the bug every time, but I'm able to reproduce it about 30% of the time. When the bug is reproduced, the console logs look like this:
➜ daikon git:(repro-bug) uv run quickstart.py console
Agents Starting console mode 🚀
14:06:53.701 DEBUG asyncio Using selector: KqueueSelector
14:06:53.702 INFO livekit.agents starting worker {"version": "1.3.6", "rtc-version": "1.0.20"}
14:06:53.703 INFO livekit.agents HTTP server listening on :60009
14:06:53.717 INFO livekit.agents initializing job runner {"tid": 51230842}
DEBUG asyncio Using selector: KqueueSelector
INFO livekit.agents job runner initialized {"tid": 51230842, "elapsed_time": 0.0}
14:06:53.766 DEBUG livekit.agents http_session(): creating a new httpclient ctx
************************************************** agent_state_changed: listening **************************************************
14:06:53.767 DEBUG livekit.agents using audio io: `Console` -> `AgentSession` -> `Console`
WARNI… livekit.agents resume_false_interruption is enabled but audio output does not support pause, it will be ignored {"audio_output": "Console"}
14:06:53.768 DEBUG livekit.agents using transcript io: `AgentSession` -> (none)
14:06:53.991 DEBUG livekit.….deepgram Established new Deepgram STT WebSocket connection:
{"headers": {"dg-project-id": "9e619461-68cf-4601-b7f1-1349880caf93", "dg-request-id": "f518be90-739c-4ecf-8b1c-00cc6f7edf5f", "Date": "Fri, 12
Dec 2025 22:06:53 GMT"}}
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ user_state_changed: speaking @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
14:06:55.988 DEBUG livekit.agents received user transcript {"user_transcript": "Hello?", "language": "en", "transcript_delay": 0.045729875564575195}
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ user_state_changed: listening @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
************************************************** agent_state_changed: thinking **************************************************
CancelledError in llm_node
************************************************** agent_state_changed: listening **************************************************
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ user_state_changed: away @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
14:07:05.648 INFO livekit.agents shutting down worker {"id": "unregistered"}
14:07:05.652 DEBUG livekit.agents session closed {"reason": "user_initiated", "error": null}
14:07:05.654 DEBUG livekit.agents shutting down job task {"reason": "", "user_initiated": false}
14:07:05.655 DEBUG livekit.agents job exiting {"reason": "", "tid": 51230842, "job_id": "fake-job-33f171799214", "room_id": "FAKE_RM_b79cb0e0d1f9"}
14:07:05.657 DEBUG livekit.agents http_session(): closing the httpclient ctx
As these logs show, the agent state changed from thinking to listening without the user state changing to speaking in between. Yet llm_node was cancelled, as shown by the log line CancelledError in llm_node.
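Since the bug only shows up in some runs, it may help to scan saved console output for this pattern automatically. Below is a minimal sketch, not part of LiveKit, that relies only on the banner lines printed by the two handlers in the repro script above. The function name `find_phantom_interruptions` and the sample text are hypothetical. A thinking to listening transition can have other causes, so treat any hit as a candidate to inspect rather than proof of the bug.

```python
import re

# Matches the banner lines printed by the two handlers in the repro script,
# e.g. "***** agent_state_changed: thinking *****".
STATE_LINE = re.compile(r"(user|agent)_state_changed: (\w+)")


def find_phantom_interruptions(log_text: str) -> list[int]:
    """Return 1-based line numbers where the agent went from 'thinking'
    back to 'listening' with no user 'speaking' event in between."""
    hits = []
    agent_state = None
    user_spoke_since_thinking = False
    for lineno, line in enumerate(log_text.splitlines(), start=1):
        match = STATE_LINE.search(line)
        if match is None:
            continue
        who, state = match.groups()
        if who == "user":
            if state == "speaking":
                user_spoke_since_thinking = True
            continue
        # From here on, the line is an agent state change.
        if state == "thinking":
            # Start watching for user speech from this point.
            user_spoke_since_thinking = False
        elif (
            state == "listening"
            and agent_state == "thinking"
            and not user_spoke_since_thinking
        ):
            hits.append(lineno)
        agent_state = state
    return hits


# Hypothetical sample: the state lines from the log above, banners stripped.
sample = """\
agent_state_changed: listening
user_state_changed: speaking
user_state_changed: listening
agent_state_changed: thinking
CancelledError in llm_node
agent_state_changed: listening
user_state_changed: away
"""

if __name__ == "__main__":
    print(find_phantom_interruptions(sample))
```

Running several console sessions, saving their output, and passing each file's text to this function gives a quick way to count how many runs hit the suspicious transition.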
Operating System
macOS Sequoia
Models Used
No response
Package Versions
"livekit-agents[cartesia,deepgram,openai,silero,turn-detector,google,elevenlabs,hume,inworld,assemblyai,mistralai,speechmatics]>=1.3.6"
"livekit-plugins-noise-cancellation>=0.2.5"
Session/Room/Call IDs
No response
Proposed Solution
Additional Context
No response
Screenshots and Recordings
No response