fix(tts): unblock FallbackAdapter when primary provider fails silently #1218

Merged
toubatbrian merged 4 commits into livekit:main from lottiehq-oss:emdash/fix-tts-fallback-450 on Apr 9, 2026

Conversation

Contributor

@mrniket mrniket commented Apr 9, 2026

Description

tts.FallbackAdapter silently hangs when the primary provider fails without emitting any audio (e.g. ElevenLabs closing its WebSocket with 1008 Invalid API key). markUnAvailable(0) is never called, the secondary provider is never tried, and the call stays silent until the caller disconnects. This PR fixes three bugs that combined to produce that symptom, plus adds regression tests.

Changes Made

  • agents/src/tts/tts.ts — Base SynthesizeStream now closes this.output when mainTask settles, not just this.queue. Previously this.output was only closed inside #monitorMetricsTask.finally(...), which is only attached on the first pushText() call. FallbackSynthesizeStream drives the inner stream by calling pushText from its own scheduler, so if the inner plugin's run() threw before any pushText had been scheduled, this.output stayed open forever and the consumer's Promise.allSettled never resolved. The fire-and-forget task also swallows the rejection explicitly (the error is already emitted via emitError) to avoid unhandled rejections.

  • agents/src/tts/fallback_adapter.ts — FallbackSynthesizeStream.run() now tracks sawRawAudio separately from audioPushed. When the primary's sample rate differs from the adapter's output rate, a resampler is created, and on @livekit/rtc-node@0.13.25 an unused AudioResampler.flush() can return a phantom frame — enough to flip audioPushed to true and make the adapter treat a completely silent failure as a success. resampler.flush() is now only called when real audio actually went in, and the silent-failure check is gated on sawRawAudio.

  • plugins/elevenlabs/src/tts.ts — TTS.stream() now accepts { connOptions } and threads it through SynthesizeStream's constructor to the base class. Previously the plugin silently dropped the argument, so FallbackAdapter.maxRetryPerTTS was ignored and fallback always took ~6 s (3 inner retries × 2 s backoff) regardless of configuration.

  • agents/src/tts/fallback_adapter.test.ts (new) — Two regression tests. The first uses matched sample rates and exercises the deadlock path; without the tts.ts fix it hangs and a 3 s timeout fires. The second uses mismatched sample rates (22 050 → 24 000) so a resampler is created, exercising the phantom-flush path; without the fallback_adapter.ts fix the adapter never falls back to the secondary.
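For reviewers who want the shape of the first fix without reading the diff, here is a minimal sketch of the idea: tie the closing of `output` to the settlement of the main task instead of to a side task that may never start. `Channel` and `SketchStream` are hypothetical stand-ins written for this illustration, not the real classes in agents/src/tts/tts.ts, and the `errors` array only approximates what `emitError` does.

```typescript
// Minimal single-consumer async queue; close() wakes any pending reader.
// Hypothetical helper for this sketch, not a type from @livekit/agents.
class Channel<T> {
  private items: T[] = [];
  private waiters: Array<(r: IteratorResult<T>) => void> = [];
  private closed = false;

  put(item: T): void {
    if (this.closed) return;
    const waiter = this.waiters.shift();
    if (waiter) waiter({ value: item, done: false });
    else this.items.push(item);
  }

  close(): void {
    this.closed = true;
    for (const waiter of this.waiters.splice(0)) {
      waiter({ value: undefined, done: true });
    }
  }

  async next(): Promise<IteratorResult<T>> {
    if (this.items.length > 0) {
      return { value: this.items.shift() as T, done: false };
    }
    if (this.closed) return { value: undefined, done: true };
    return new Promise<IteratorResult<T>>((resolve) => this.waiters.push(resolve));
  }
}

// Hypothetical stand-in for a base synthesize stream.
class SketchStream {
  readonly output = new Channel<string>();
  readonly errors: unknown[] = [];
  readonly mainTask: Promise<void>;

  constructor(run: () => Promise<void>) {
    this.mainTask = run()
      // Record the failure (stand-in for emitError) so the fire-and-forget
      // task never surfaces as an unhandled rejection.
      .catch((err) => {
        this.errors.push(err);
      })
      // Close output whenever run() settles, even if no text was ever pushed.
      .finally(() => {
        this.output.close();
      });
  }
}
```

With this shape, a consumer awaiting `stream.output.next()` observes `done: true` shortly after `run()` throws, so an outer fallback loop gets the chance to move on to the next provider.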

Pre-Review Checklist

  • Build passes: pnpm build, pnpm vitest run agents/src (596/596 pass), pnpm -F @livekit/agents lint, pnpm format:check all clean locally.
  • AI-generated code reviewed: comments trimmed to load-bearing why explanations.
  • Changes explained: each of the three bugs described above with its root cause.
  • Scope appropriate: all three changes are different angles on the same symptom (FallbackAdapter silently hanging on primary failure) and were discovered sequentially while investigating.

Testing

  • Automated tests added (agents/src/tts/fallback_adapter.test.ts).
  • All tests pass: pnpm vitest run agents/src → 38 test files, 596 passed, 2 skipped.
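For anyone reproducing the deadlock locally, a hang can be turned into a fast, explicit failure by racing the work against a timer. The sketch below is a generic helper and not code from fallback_adapter.test.ts; `withTimeout` is a hypothetical name, and the test runner's own per-test timeout may serve the same purpose.

```typescript
// Rejects if `work` has not settled within `ms`; always clears its timer.
// Hypothetical helper for illustration only.
async function withTimeout<T>(work: Promise<T>, ms: number): Promise<T> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<never>((_, reject) => {
    timer = setTimeout(() => reject(new Error(`timed out after ${ms} ms`)), ms);
  });
  try {
    return await Promise.race([work, timeout]);
  } finally {
    clearTimeout(timer);
  }
}

// A promise that never settles stands in for the pre-fix behaviour, where the
// consumer waited forever on an output that was never closed.
const neverSettles = new Promise<void>(() => {});
```

Wrapping the stream-draining call in `withTimeout(..., 3000)` would make the pre-fix hang surface as a rejection rather than a stalled test run.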

changeset-bot Bot commented Apr 9, 2026

🦋 Changeset detected

Latest commit: d31ccb9

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 22 packages

| Name | Type |
| --- | --- |
| @livekit/agents | Patch |
| @livekit/agents-plugin-elevenlabs | Patch |
| @livekit/agents-plugin-anam | Patch |
| @livekit/agents-plugin-baseten | Patch |
| @livekit/agents-plugin-bey | Patch |
| @livekit/agents-plugin-cartesia | Patch |
| @livekit/agents-plugin-deepgram | Patch |
| @livekit/agents-plugin-google | Patch |
| @livekit/agents-plugin-hedra | Patch |
| @livekit/agents-plugin-inworld | Patch |
| @livekit/agents-plugin-lemonslice | Patch |
| @livekit/agents-plugin-livekit | Patch |
| @livekit/agents-plugin-neuphonic | Patch |
| @livekit/agents-plugin-openai | Patch |
| @livekit/agents-plugin-phonic | Patch |
| @livekit/agents-plugin-resemble | Patch |
| @livekit/agents-plugin-rime | Patch |
| @livekit/agents-plugin-sarvam | Patch |
| @livekit/agents-plugin-silero | Patch |
| @livekit/agents-plugins-test | Patch |
| @livekit/agents-plugin-trugen | Patch |
| @livekit/agents-plugin-xai | Patch |

CLAassistant commented Apr 9, 2026

CLA assistant check
All committers have signed the CLA.

@mrniket mrniket force-pushed the emdash/fix-tts-fallback-450 branch from b677c63 to 79c864c Compare April 9, 2026 14:36
@mrniket mrniket marked this pull request as ready for review April 9, 2026 14:45
mrniket added 2 commits April 9, 2026 15:47
…pushText

The base SynthesizeStream only closed `this.output` via `monitorMetrics`,
which was only started on the first `pushText`. If a plugin's `run()` threw
before any text was pushed (e.g. invalid API key on a streaming provider),
`output` stayed open forever and FallbackSynthesizeStream's processOutput
hung on `stream.output.next()`. The fallback loop never reached its catch
block, never called `markUnAvailable`, and never moved to the next provider.

Also forward connOptions through the ElevenLabs plugin's stream() so that
FallbackAdapter.maxRetryPerTTS is actually honoured (previously silently
dropped, falling back to the default maxRetry of 3).

When the primary TTS in FallbackAdapter has a different sample rate from
the adapter's aggregated output rate, FallbackSynthesizeStream.run() wraps
each inner frame in a resampler before forwarding it. Previously, if the
inner stream yielded no real audio at all (e.g. invalid API key on a
streaming provider), the code would still call `resampler.flush()` on a
resampler that had nothing pushed into it.

On @livekit/rtc-node@0.13.25 an unused resampler's `flush()` returns a
ghost frame, which flipped `this.audioPushed = true` and made the
fallback adapter treat a completely silent primary as a success — so it
never marked the provider unavailable and never tried the secondary.

Track `sawRawAudio` separately from `audioPushed` and:
- only call `resampler.flush()` if real audio was actually iterated
- gate the "no audio received" check on `sawRawAudio` instead of
  `audioPushed`, so phantom resampler output can never mask a silent
  failure

Add a regression test with mismatched sample rates to exercise the
resampler branch.
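
The gating described in this commit message can be illustrated with a small, synchronous sketch. `FakeResampler` is a hypothetical stand-in that deliberately mimics the reported behaviour (a phantom frame from `flush()` on an unused instance); it is not the AudioResampler class from @livekit/rtc-node, and `forwardPrimary` is an illustrative function, not the actual body of FallbackSynthesizeStream.run().

```typescript
type Frame = { samples: number; sampleRate: number };

// Hypothetical resampler stand-in: flush() on an instance that never received
// input returns one phantom frame, imitating the behaviour described above.
class FakeResampler {
  private pushed = 0;
  private outRate: number;

  constructor(outRate: number) {
    this.outRate = outRate;
  }

  push(frame: Frame): Frame[] {
    this.pushed += 1;
    return [{ samples: frame.samples, sampleRate: this.outRate }];
  }

  flush(): Frame[] {
    return this.pushed === 0 ? [{ samples: 0, sampleRate: this.outRate }] : [];
  }
}

// gateOnRawAudio = false models the old logic, true models the fixed logic.
function forwardPrimary(
  rawFrames: Frame[],
  resampler: FakeResampler,
  gateOnRawAudio: boolean,
): { forwarded: Frame[]; treatedAsSuccess: boolean } {
  const forwarded: Frame[] = [];
  let sawRawAudio = false;

  for (const frame of rawFrames) {
    sawRawAudio = true; // real audio arrived from the inner stream
    forwarded.push(...resampler.push(frame));
  }

  // Old logic always flushed; fixed logic flushes only after real input.
  if (!gateOnRawAudio || sawRawAudio) {
    forwarded.push(...resampler.flush());
  }

  const audioPushed = forwarded.length > 0;
  // Old logic trusted audioPushed; fixed logic trusts sawRawAudio.
  const treatedAsSuccess = gateOnRawAudio ? sawRawAudio : audioPushed;
  return { forwarded, treatedAsSuccess };
}
```

In this sketch, a silent primary (no raw frames) counts as a success under the old logic because of the phantom frame, while the gated version forwards nothing and reports failure, which is the condition that lets the adapter try the secondary.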
@mrniket mrniket force-pushed the emdash/fix-tts-fallback-450 branch from 79c864c to 8829174 Compare April 9, 2026 14:47
devin-ai-integration[bot]

This comment was marked as resolved.

mrniket and others added 2 commits April 9, 2026 18:22
FallbackChunkedStream.run() had the identical phantom AudioResampler.flush()
vulnerability that was fixed in FallbackSynthesizeStream. When the primary
TTS has a different sample rate from the adapter's output rate and emits
no real audio (e.g. invalid API key on a streaming provider), resampler.flush()
on the unused resampler can return a ghost frame on @livekit/rtc-node@0.13.25,
flipping 'audioReceived' to true and making adapter.synthesize() incorrectly
treat a silent failure as a success — so the non-streaming path never falls
back to the secondary provider.

Apply the same sawRawAudio tracking pattern and drop the now-redundant
audioReceived bookkeeping. Add a regression test covering the synthesize()
path with mismatched sample rates.
@toubatbrian toubatbrian merged commit f7d11de into livekit:main Apr 9, 2026
6 checks passed
@github-actions github-actions Bot mentioned this pull request Apr 9, 2026
@mrniket mrniket deleted the emdash/fix-tts-fallback-450 branch April 14, 2026 08:22