feat: answering machine detection#4906

Merged
chenghao-mou merged 34 commits into main from
feat/amd
Apr 3, 2026
Conversation

@chenghao-mou
Member

@chenghao-mou chenghao-mou commented Feb 20, 2026

  • AMDResult with categories: human, machine-ivr, machine-vm, machine-unavailable, and uncertain
  • amd.execute() API for agents to await detection results
  • Example in examples/telephony/amd.py
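
To make the result categories above concrete, here is a minimal sketch of how such a result type could be modeled. It is not the definition shipped in this PR: the names AMDCategory, category, transcript and is_machine appear elsewhere in this thread, but the field layout, the string-valued enum, and the rule that every "machine-*" value counts as a machine are assumptions.

```python
from dataclasses import dataclass
from enum import Enum


class AMDCategory(str, Enum):
    """Illustrative enum; values mirror the category strings in the PR description."""

    HUMAN = "human"
    MACHINE_IVR = "machine-ivr"
    MACHINE_VM = "machine-vm"
    MACHINE_UNAVAILABLE = "machine-unavailable"
    UNCERTAIN = "uncertain"


@dataclass(frozen=True)
class AMDResult:
    """Illustrative result container (assumed layout, not the library's)."""

    category: AMDCategory
    transcript: str = ""

    @property
    def is_machine(self) -> bool:
        # Assumption: every "machine-*" category is treated as a machine.
        return self.category.value.startswith("machine-")
```

Because the enum subclasses str, a comparison such as result.category == "human" would still work with this sketch.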

Usage

await session.start(
    agent=MyAgent(),
    room=ctx.room,
)

async with AMD(session, llm="openai/gpt-5-mini") as detector:
    result = await detector.execute()

    if result.category == "human":
        logger.info("human answered the call, proceeding with normal conversation")
    elif result.category == "machine-ivr":
        logger.info("ivr menu detected, starting navigation")
    elif result.category == "machine-vm":
        logger.info("voicemail detected, leaving a message")
        speech_handle = session.generate_reply(
            instructions=(
                "You've reached voicemail. Leave a brief message asking "
                "the customer to call back."
            ),
        )
        await speech_handle.wait_for_playout()
        session.shutdown()
    elif result.category == "machine-unavailable":
        logger.info("mailbox unavailable, ending call")
        session.shutdown()
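
The usage snippet above does not branch on the "uncertain" category. One possible way to make the fallback explicit is a small dispatch helper. This is only a sketch: next_action and the action names are hypothetical, and treating an uncertain result like a human is a design assumption, not behavior confirmed by this PR.

```python
def next_action(category: str) -> str:
    """Map a detection category to a coarse action name (names are illustrative)."""
    actions = {
        "human": "converse",
        "machine-ivr": "navigate_ivr",
        "machine-vm": "leave_message",
        "machine-unavailable": "hang_up",
    }
    # Assumption: "uncertain" (and any unrecognized value) is handled like a human,
    # so a real caller is never hung up on by mistake.
    return actions.get(category, "converse")
```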

@chenghao-mou chenghao-mou marked this pull request as ready for review March 3, 2026 17:37
@chenghao-mou chenghao-mou requested a review from a team March 3, 2026 17:37
devin-ai-integration[bot]

This comment was marked as resolved.

devin-ai-integration[bot]

This comment was marked as resolved.

devin-ai-integration[bot]

This comment was marked as resolved.

@chenghao-mou chenghao-mou marked this pull request as draft March 9, 2026 22:18
Comment thread livekit-agents/livekit/agents/voice/amd/base.py Outdated
chenghao-mou and others added 4 commits March 15, 2026 16:41
Co-authored-by: devin-ai-integration[bot] <158243242+devin-ai-integration[bot]@users.noreply.github.com>
@chenghao-mou chenghao-mou marked this pull request as ready for review March 15, 2026 20:02
@chenghao-mou chenghao-mou requested a review from a team March 15, 2026 20:02
devin-ai-integration[bot]

This comment was marked as resolved.

devin-ai-integration[bot]

This comment was marked as resolved.

devin-ai-integration[bot]

This comment was marked as resolved.

Comment thread examples/telephony/amd.py
Comment thread livekit-agents/livekit/agents/voice/amd/detector.py
devin-ai-integration[bot]

This comment was marked as resolved.

devin-ai-integration[bot]

This comment was marked as resolved.

devin-ai-integration[bot]

This comment was marked as resolved.

@chenghao-mou chenghao-mou changed the title from "feat: automatic machine detection" to "feat: answering machine detection" Apr 3, 2026
Contributor

@devin-ai-integration devin-ai-integration Bot left a comment

Devin Review found 2 new potential issues.

View 11 additional findings in Devin Review.

Comment thread livekit-agents/livekit/agents/voice/amd/classifier.py Outdated
Comment thread livekit-agents/livekit/agents/voice/amd/detector.py Outdated
Contributor

@devin-ai-integration devin-ai-integration Bot left a comment

Devin Review found 2 new potential issues.

View 15 additional findings in Devin Review.

Comment on lines +137 to +143
def _on_first_audio(self) -> None:
    """Start AMD on the first audio frame and pause speech authorization."""
    if self._classifier is None or self._classifier.started:
        return
    self._classifier.start()
    if self._session is not None and self._session._activity is not None:
        self._session._activity._pause_authorization()
Contributor

🟡 AMD authorization pause not applied to new AgentActivity created during agent handoff

When AMD pauses authorization via _pause_authorization() on the current AgentActivity, and then an agent handoff occurs (e.g., via update_agent), a new AgentActivity is created with _authorization_allowed initialized as set (livekit-agents/livekit/agents/voice/agent_activity.py:155-156). The AMD's _on_first_audio only fires once (it checks self._classifier.started at detector.py:139 and returns), so _pause_authorization() is never called on the new activity. This means the new activity's speech will bypass AMD's authorization gate, defeating the purpose of holding speech until AMD resolves.

Scenario
  1. AMD starts, calls _pause_authorization() on current activity
  2. Agent handoff occurs (e.g. user calls session.update_agent()) while AMD is still pending
  3. New AgentActivity is created with _authorization_allowed already set
  4. Speech on the new activity proceeds without waiting for AMD result
Prompt for agents
The AMD detector pauses authorization on the current AgentActivity, but if an agent handoff creates a new AgentActivity while AMD is still pending, the new activity won't have authorization paused. To fix this, either: (1) propagate the AMD authorization pause state to new AgentActivity instances when they are created in _update_activity (in agent_session.py), e.g. by checking if session._amd is pending and calling _pause_authorization() on the new activity; or (2) have the AMD store a reference to the session rather than the activity and apply the pause on whatever is the current activity at any given time, checking this in the activity's authorization wait path.
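
Option (1) from the prompt above can be illustrated with a toy model. The classes below are stand-ins, not the real AgentSession or AgentActivity: the amd_pending flag and the pause_for_amd/resolve_amd helpers are hypothetical, and only the authorization gate is modeled.

```python
import asyncio


class Activity:
    """Toy stand-in for AgentActivity; only models the authorization gate."""

    def __init__(self) -> None:
        self._authorization_allowed = asyncio.Event()
        self._authorization_allowed.set()  # new activities start authorized

    def _pause_authorization(self) -> None:
        self._authorization_allowed.clear()

    def _resume_authorization(self) -> None:
        self._authorization_allowed.set()

    @property
    def authorized(self) -> bool:
        return self._authorization_allowed.is_set()


class Session:
    """Toy stand-in for AgentSession with a hypothetical amd_pending flag."""

    def __init__(self) -> None:
        self.amd_pending = False
        self._activity = Activity()

    def pause_for_amd(self) -> None:
        self.amd_pending = True
        self._activity._pause_authorization()

    def resolve_amd(self) -> None:
        self.amd_pending = False
        self._activity._resume_authorization()

    def update_agent(self) -> None:
        # Handoff: a fresh activity replaces the current one.
        new_activity = Activity()
        if self.amd_pending:
            # Option (1): carry the pending AMD pause over to the new activity.
            new_activity._pause_authorization()
        self._activity = new_activity
```

In this model, a handoff while detection is pending yields an activity that is still paused, and it becomes authorized only once resolve_amd() runs.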
Comment on lines +102 to +114
result = self._result

if result.is_machine and self._interrupt_on_machine:
    await self._session.interrupt(force=True)

if result.category == AMDCategory.MACHINE_IVR and self._ivr_detection:
    await self._session._start_ivr_detection(transcript=result.transcript)

# eagerly resume so agent can speak immediately to a human
if self._session._activity is not None:
    self._session._activity._resume_authorization()

return result
Contributor

🟡 execute() does not resume authorization if an exception occurs before _resume_authorization()

In execute(), if self._session.interrupt(force=True) (line 105) or self._session._start_ivr_detection(...) (line 108) raises an exception, the _resume_authorization() call at line 112 is skipped. When execute() is used inside the async with AMD(...) context manager, __aexit__/aclose() will resume authorization as a fallback. However, if execute() is called directly (without the context manager), authorization remains permanently paused and all subsequent speech generation deadlocks.

Suggested change
- result = self._result
- if result.is_machine and self._interrupt_on_machine:
-     await self._session.interrupt(force=True)
- if result.category == AMDCategory.MACHINE_IVR and self._ivr_detection:
-     await self._session._start_ivr_detection(transcript=result.transcript)
- # eagerly resume so agent can speak immediately to a human
- if self._session._activity is not None:
-     self._session._activity._resume_authorization()
- return result
+ result = self._result
+ try:
+     if result.is_machine and self._interrupt_on_machine:
+         await self._session.interrupt(force=True)
+     if result.category == AMDCategory.MACHINE_IVR and self._ivr_detection:
+         await self._session._start_ivr_detection(transcript=result.transcript)
+ finally:
+     # eagerly resume so agent can speak immediately to a human
+     if self._session._activity is not None:
+         self._session._activity._resume_authorization()
+ return result
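
The effect of the try/finally in the suggested change can be reproduced in isolation. The sketch below is a toy, not the detector code: Gate and failing_interrupt are hypothetical stand-ins for the activity's pause/resume pair and for a session call that raises.

```python
import asyncio


class Gate:
    """Toy authorization gate; stands in for the activity's pause/resume pair."""

    def __init__(self) -> None:
        self.allowed = asyncio.Event()
        self.allowed.set()

    def pause(self) -> None:
        self.allowed.clear()

    def resume(self) -> None:
        self.allowed.set()


async def failing_interrupt() -> None:
    # Hypothetical stand-in for a session call that raises mid-execute.
    raise RuntimeError("interrupt failed")


async def execute(gate: Gate) -> str:
    gate.pause()
    try:
        await failing_interrupt()
    finally:
        # Runs on success and on failure, so the gate is never left paused.
        gate.resume()
    return "done"


async def main() -> bool:
    gate = Gate()
    try:
        await execute(gate)
    except RuntimeError:
        pass  # the error still propagates to the caller; only the gate is restored
    return gate.allowed.is_set()
```

Running asyncio.run(main()) in this toy reports the gate as open again even though execute() raised, which is the property the suggested change is after.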
@chenghao-mou chenghao-mou merged commit 2f604bc into main Apr 3, 2026
20 of 22 checks passed
@chenghao-mou chenghao-mou deleted the feat/amd branch April 3, 2026 22:41
osimhi213 added a commit to de-id/livekit-agents that referenced this pull request Apr 5, 2026
* upstream/main:
  fix: add PARTICIPANT_KIND_CONNECTOR to default participant kinds (livekit#5339)
  feat: expose service_tier in CompletionUsage from OpenAI Responses API (livekit#5341)
  feat: answering machine detection (livekit#4906)
  fix: wait_for_participant waits until participant is fully active (livekit#5271)
  (gemini realtime): add warnings in update_chat_ctx and update_instructions (livekit#5332)
  fix: convert oneOf to anyOf in strict schema for discriminated unions (livekit#5324)
  fix(voice): make function call history preservation configurable in AgentTask (livekit#5288)
osimhi213 added a commit to de-id/livekit-agents that referenced this pull request Apr 5, 2026
* fix(voice): make function call history preservation configurable in AgentTask (livekit#5288)

* fix: convert oneOf to anyOf in strict schema for discriminated unions (livekit#5324)

* (gemini realtime): add warnings in update_chat_ctx and update_instructions (livekit#5332)

* fix: wait_for_participant waits until participant is fully active (livekit#5271)

* feat: answering machine detection (livekit#4906)

Co-authored-by: devin-ai-integration[bot] <158243242+devin-ai-integration[bot]@users.noreply.github.com>

* feat: expose service_tier in CompletionUsage from OpenAI Responses API (livekit#5341)

* fix: add PARTICIPANT_KIND_CONNECTOR to default participant kinds (livekit#5339)

---------

Co-authored-by: Gopal Bagaswar <67310594+GopalGB@users.noreply.github.com>
Co-authored-by: Long Chen <longch1024@gmail.com>
Co-authored-by: Tina Nguyen <72938484+tinalenguyen@users.noreply.github.com>
Co-authored-by: David Zhao <dz@livekit.io>
Co-authored-by: Chenghao Mou <chenghao.mou@livekit.io>
Co-authored-by: devin-ai-integration[bot] <158243242+devin-ai-integration[bot]@users.noreply.github.com>
Co-authored-by: Piyush Gambhir <90608533+piyush-gambhir@users.noreply.github.com>
Co-authored-by: Anunay Maheshwari <anunaym14@gmail.com>
russellmartin-livekit pushed a commit that referenced this pull request Apr 13, 2026
Co-authored-by: devin-ai-integration[bot] <158243242+devin-ai-integration[bot]@users.noreply.github.com>
3 participants