Phantom resumed false interrupted speech activity that severely delays speech playback when using a STT model with endpointing #4615

@yiphei

Bug Description

I have the following STT and turn detection setup, which follows the official guidance (https://docs.livekit.io/agents/models/stt/plugins/deepgram/#turn-detection):

        stt=deepgram.STTv2(),
        turn_detection="stt",

This setup often produces phantom "resumed false interrupted speech" activity, which severely delays speech playback when using an STT model with endpointing.

Expected Behavior

No phantom resumed false interrupted speech activity and no delayed speech playback

Reproduction Steps

  1. Run the following agent code in any mode (console, dev, or start):
import logging

from dotenv import load_dotenv
from livekit import agents, rtc
from livekit.agents import Agent, AgentServer, AgentSession, room_io
from livekit.plugins import deepgram, noise_cancellation, silero

load_dotenv()

logger = logging.getLogger("livekit.agents")


class Assistant(Agent):
    def __init__(self) -> None:
        super().__init__(
            instructions="""You are a helpful voice AI assistant.
            You eagerly assist users with their questions by providing information from your extensive knowledge.
            Your responses are concise, to the point, and without any complex formatting or punctuation including emojis, asterisks, or other symbols.
            You are curious, friendly, and have a sense of humor.""",
        )


server = AgentServer()


@server.rtc_session()
async def my_agent(ctx: agents.JobContext):
    session = AgentSession(
        stt=deepgram.STTv2(),
        llm="openai/gpt-4.1-mini",
        tts="cartesia/sonic-3:9626c31c-bec5-4cca-baa8-f8ba9e84c8bc",
        vad=silero.VAD.load(),
        turn_detection="stt",
    )

    await session.start(
        room=ctx.room,
        agent=Assistant(),
        room_options=room_io.RoomOptions(
            audio_input=room_io.AudioInputOptions(
                noise_cancellation=lambda params: noise_cancellation.BVCTelephony()
                if params.participant.kind == rtc.ParticipantKind.PARTICIPANT_KIND_SIP
                else noise_cancellation.BVC(),
            ),
        ),
    )

    @session.on("agent_false_interruption")
    def _on_agent_false_interruption(ev) -> None:
        logger.error(f"Agent false interruption: {ev}")

    await session.generate_reply(
        instructions="Greet the user and offer your assistance."
    )


if __name__ == "__main__":
    agents.cli.run_app(server)
  2. When the agent session starts, converse with the agent. Stay as silent as possible after finishing your turn, to rule out false positives.
  3. The bug should occur.

Here are the logs from a reproduction of the issue:

    Agents   Starting console mode 🚀

    12:49:36.090 DEBUG  asyncio            Using selector: KqueueSelector
    12:49:36.091 INFO   livekit.agents     starting worker {"version": "1.3.12", "rtc-version": "1.0.23"}
    12:49:36.092 INFO   livekit.agents     HTTP server listening on :60007
    12:49:36.105 INFO   livekit.agents     initializing job runner {"tid": 32515694}
                 DEBUG  asyncio            Using selector: KqueueSelector
    12:49:36.106 INFO   livekit.agents     job runner initialized {"tid": 32515694, "elapsed_time": 0.0}
    12:49:36.153 DEBUG  livekit.agents     http_session(): creating a new httpclient ctx
    12:49:36.156 DEBUG  livekit.agents     using audio io: `Console` -> `AgentSession` -> `TranscriptSynchronizer` -> `Console`
    12:49:36.157 DEBUG  livekit.agents     using transcript io: `AgentSession` -> `TranscriptSynchronizer`
    12:49:36.281 DEBUG  livekit.….deepgram Established new Deepgram STT WebSocket connection:
                                           {"headers": {"dg-project-id": "9e619461-68cf-4601-b7f1-1349880caf93", "dg-request-id": "ed940073-0a0b-42ae-a298-da53c14e3404", "Date": "Sun, 25 Jan 2026 20:49:36 GMT"}}
    12:49:41.084 DEBUG  livekit.agents     received user transcript {"user_transcript": "Hello.", "language": "en", "transcript_delay": 0.3493931293487549}
    12:49:43.240 DEBUG  livekit.agents     resumed false interrupted speech {"timeout": 2.0}
    12:49:43.242 ERROR  livekit.agents     Agent false interruption: type='agent_false_interruption' resumed=True created_at=1769374183.241974 message=None extra_instructions=None
    12:49:45.898 DEBUG  livekit.agents     received user transcript {"user_transcript": "How are you?", "language": "en", "transcript_delay": 0.06121706962585449}
    12:49:48.341 DEBUG  livekit.agents     resumed false interrupted speech {"timeout": 2.0}
    12:49:48.343 ERROR  livekit.agents     Agent false interruption: type='agent_false_interruption' resumed=True created_at=1769374188.343526 message=None extra_instructions=None
    12:49:52.762 DEBUG  livekit.agents     received user transcript {"user_transcript": "What's your name?", "language": "en", "transcript_delay": 0.20167779922485352}
    12:49:55.066 DEBUG  livekit.agents     resumed false interrupted speech {"timeout": 2.0}
    12:49:55.068 ERROR  livekit.agents     Agent false interruption: type='agent_false_interruption' resumed=True created_at=1769374195.0682151 message=None extra_instructions=None
    12:49:58.210 INFO   livekit.agents     shutting down worker {"id": "unregistered"}
    12:50:00.216 DEBUG  livekit.agents     session closed {"reason": "user_initiated", "error": null}
    12:50:00.220 DEBUG  livekit.agents     shutting down job task {"reason": "", "user_initiated": false}
    12:50:00.222 DEBUG  livekit.agents     job exiting {"reason": "", "tid": 32515694, "job_id": "fake-job-50bd583d5b46", "room_id": "FAKE_RM_de5521d0f6b1"}
    12:50:00.224 DEBUG  livekit.agents     http_session(): closing the httpclient ctx

These two log lines appear after every single user turn:

    12:49:48.341 DEBUG  livekit.agents     resumed false interrupted speech {"timeout": 2.0}
    12:49:48.343 ERROR  livekit.agents     Agent false interruption: type='agent_false_interruption' resumed=True created_at=1769374188.343526 message=None extra_instructions=None
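
To put a number on the delay, here is a minimal sketch that pairs each "received user transcript" line with the next "resumed false interrupted speech" line and prints the gap in seconds. The log excerpt is copied from the run above with the JSON payloads trimmed for brevity. The names `LOG`, `LINE_RE`, and `resume_delays` are hypothetical helpers written for this illustration and are not part of livekit-agents.

```python
import re
from datetime import datetime

# Excerpt of the console log above (payloads trimmed for brevity).
LOG = """\
12:49:41.084 DEBUG  livekit.agents     received user transcript {"user_transcript": "Hello."}
12:49:43.240 DEBUG  livekit.agents     resumed false interrupted speech {"timeout": 2.0}
12:49:45.898 DEBUG  livekit.agents     received user transcript {"user_transcript": "How are you?"}
12:49:48.341 DEBUG  livekit.agents     resumed false interrupted speech {"timeout": 2.0}
12:49:52.762 DEBUG  livekit.agents     received user transcript {"user_transcript": "What's your name?"}
12:49:55.066 DEBUG  livekit.agents     resumed false interrupted speech {"timeout": 2.0}
"""

# timestamp, log level, logger name, then the message text
LINE_RE = re.compile(r"^\s*(\d{2}:\d{2}:\d{2}\.\d{3})\s+\w+\s+\S+\s+(.*)$")


def resume_delays(log: str) -> list[float]:
    """Return seconds between each user transcript and the following resume."""
    delays: list[float] = []
    last_transcript = None
    for line in log.splitlines():
        match = LINE_RE.match(line)
        if match is None:
            continue  # skip continuation lines that carry no timestamp
        ts = datetime.strptime(match.group(1), "%H:%M:%S.%f")
        message = match.group(2)
        if message.startswith("received user transcript"):
            last_transcript = ts
        elif message.startswith("resumed false interrupted speech"):
            if last_transcript is not None:
                delays.append(round((ts - last_transcript).total_seconds(), 3))
                last_transcript = None
    return delays


if __name__ == "__main__":
    print(resume_delays(LOG))
```

For the three turns logged above, every gap comes out slightly longer than the 2.0 second `timeout` value reported in the "resumed false interrupted speech" lines.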

Operating System

macOS Sequoia

Models Used

No response

Package Versions

    "livekit-agents[cartesia,deepgram,openai,silero,turn-detector,elevenlabs]>=1.3.12",
    "livekit-plugins-noise-cancellation>=0.2.5",

Session/Room/Call IDs

No response

Proposed Solution

Additional Context

No response

Screenshots and Recordings

No response

Labels: bug
