feat(voice): add PreemptiveGenerationOptions for fine-grained control #1265

Merged
toubatbrian merged 5 commits into main from claude/practical-archimedes-mqsTa
Apr 17, 2026

@toubatbrian toubatbrian commented Apr 16, 2026

Summary

Port of livekit/agents#5428 — adds PreemptiveGenerationOptions with configurable options to reduce wasted compute during preemptive generation.

Changes

  • New PreemptiveGenerationOptions interface (turn_config/preemptive_generation.ts) with:

    • enabled (default true): whether preemptive generation is active
    • preemptiveTts (default false): when false, only LLM runs preemptively; TTS starts once the turn is confirmed and speech is scheduled
    • maxSpeechDuration (default 10000 ms): skip preemptive generation when user has been speaking too long, since long utterances are more likely to change
    • maxRetries (default 3): cap preemptive LLM requests per user turn; counter resets on turn completion
  • Moved preemptiveGeneration into TurnHandlingOptions: The option is now configured via turnHandling.preemptiveGeneration alongside endpointing and interruption. The old top-level preemptiveGeneration: boolean on AgentSessionOptions is deprecated with a backward-compatible migration path.

  • Pipeline TTS deferral: In _pipelineReplyTaskImpl, TTS inference is now deferred until after waitForScheduled by default. When preemptiveTts: true, TTS starts immediately alongside the LLM (previous behavior).

  • Speech duration & retry guards in onPreemptiveGeneration: Preemptive generation is skipped if the user has been speaking longer than maxSpeechDuration, or if maxRetries attempts have already been made for the current turn.

  • PreemptiveGenerationInfo extended with startedSpeakingAt to enable the speech duration check.
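As a rough illustration of the options and guards listed above, here is a minimal sketch. The field names and default values come from this description; the function name `preemptiveSkipReason`, its parameters, the reason strings, and the exact comparison operators are assumptions made for the example and may not match the merged code.

```typescript
// Hypothetical sketch: field names and defaults follow the PR description,
// everything else is assumed for illustration.
interface PreemptiveGenerationOptions {
  enabled: boolean;
  preemptiveTts: boolean;
  maxSpeechDuration: number; // milliseconds
  maxRetries: number;
}

const DEFAULT_PREEMPTIVE_GENERATION: PreemptiveGenerationOptions = {
  enabled: true,
  preemptiveTts: false,
  maxSpeechDuration: 10_000,
  maxRetries: 3,
};

// Returns a reason when preemptive generation should be skipped,
// or undefined when a new preemptive request may be started.
function preemptiveSkipReason(
  opts: PreemptiveGenerationOptions,
  startedSpeakingAt: number | undefined, // assumed to be an epoch timestamp in ms
  attemptsThisTurn: number,
  now: number,
): string | undefined {
  if (!opts.enabled) return 'disabled';
  if (startedSpeakingAt !== undefined && now - startedSpeakingAt > opts.maxSpeechDuration) {
    return 'speech too long';
  }
  if (attemptsThisTurn >= opts.maxRetries) return 'retry cap reached';
  return undefined;
}
```

With the defaults, a user who has been speaking for 5 seconds with no prior attempts would pass both guards, while an 11-second utterance or a fourth attempt in the same turn would be skipped.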

Usage

const session = new AgentSession({
  turnHandling: {
    endpointing: { minDelay: 300 },
    interruption: { enabled: false },
    preemptiveGeneration: { preemptiveTts: true, maxRetries: 5 },
  },
});
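For callers still passing the deprecated top-level boolean, the migration could look roughly like the sketch below. The helper name `migratePreemptiveGeneration`, the `LegacySessionOptions` type, and the rule that the structured `turnHandling` value takes precedence over the boolean are all assumptions for illustration; the actual logic lives in turn_config/utils.ts and may differ.

```typescript
// Hypothetical sketch of the backward-compatible migration path.
interface PreemptiveGenerationOptions {
  enabled?: boolean;
  preemptiveTts?: boolean;
  maxSpeechDuration?: number;
  maxRetries?: number;
}

interface LegacySessionOptions {
  /** Deprecated top-level flag. */
  preemptiveGeneration?: boolean;
  turnHandling?: { preemptiveGeneration?: PreemptiveGenerationOptions };
}

function migratePreemptiveGeneration(options: LegacySessionOptions): PreemptiveGenerationOptions {
  const structured = options.turnHandling?.preemptiveGeneration;
  // Assumption: the new structured form wins when both are supplied.
  if (structured !== undefined) return structured;
  // Map the old boolean onto the new `enabled` field.
  if (options.preemptiveGeneration !== undefined) {
    return { enabled: options.preemptiveGeneration };
  }
  // Nothing configured: leave everything to the defaults.
  return {};
}
```

Under these assumptions, `new AgentSession({ preemptiveGeneration: false })` would behave like `turnHandling: { preemptiveGeneration: { enabled: false } }`.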

Implementation nuances (JS vs Python)

| Aspect | Python | JS/TS |
| --- | --- | --- |
| Time units | Seconds (`max_speech_duration: 10.0`) | Milliseconds (`maxSpeechDuration: 10_000`) |
| TypedDict vs Interface | `PreemptiveGenerationOptions(TypedDict, total=False)` | `interface PreemptiveGenerationOptions` with `Partial<>` where needed |
| TTS stream teeing | Python uses `text_tee` with lazy branch creation | JS always tees upfront via `ReadableStream.tee()` but defers `performTTSInference()` call |
| Default resolution | `{**defaults, **config}` dict merge | `{ ...defaults, ...stripUndefined(config) }` spread with undefined filtering |
| Counter field | `_preemptive_generation_count: int` on class | `private _preemptiveGenerationCount = 0` |
| Naming | snake_case (`preemptive_tts`, `max_speech_duration`) | camelCase (`preemptiveTts`, `maxSpeechDuration`) |
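The "Default resolution" row matters because an object spread copies explicit `undefined` values, which would overwrite a default. The sketch below shows one way the filtering could work. `stripUndefined` is named in the table above, but this body, the `resolvePreemptiveGeneration` wrapper, and the `DEFAULTS` constant are assumptions for illustration rather than the code in utils.ts.

```typescript
// Hypothetical sketch of default resolution with undefined filtering.
interface PreemptiveGenerationOptions {
  enabled: boolean;
  preemptiveTts: boolean;
  maxSpeechDuration: number; // milliseconds
  maxRetries: number;
}

const DEFAULTS: PreemptiveGenerationOptions = {
  enabled: true,
  preemptiveTts: false,
  maxSpeechDuration: 10_000,
  maxRetries: 3,
};

// Drop keys whose value is undefined so they cannot shadow a default.
function stripUndefined<T extends object>(obj: T): Partial<T> {
  const result: Record<string, unknown> = {};
  for (const key in obj) {
    if (obj[key] !== undefined) result[key] = obj[key];
  }
  return result as Partial<T>;
}

function resolvePreemptiveGeneration(
  config: Partial<PreemptiveGenerationOptions> = {},
): PreemptiveGenerationOptions {
  return { ...DEFAULTS, ...stripUndefined(config) };
}
```

For example, `resolvePreemptiveGeneration({ maxRetries: 5, preemptiveTts: undefined })` keeps `preemptiveTts` at its default of `false`, whereas a plain `{ ...DEFAULTS, ...config }` would leave it `undefined`.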

Files changed

  • agents/src/voice/turn_config/preemptive_generation.ts (new): PreemptiveGenerationOptions interface and defaults
  • agents/src/voice/turn_config/turn_handling.ts — added preemptiveGeneration to TurnHandlingOptions and InternalTurnHandlingOptions
  • agents/src/voice/turn_config/utils.ts — updated migration logic, mergeWithDefaults, deprecated boolean migration
  • agents/src/voice/agent_session.ts — deprecated top-level preemptiveGeneration, removed from defaults
  • agents/src/voice/agent_activity.ts — added retry count, speech duration check, preemptive TTS deferral
  • agents/src/voice/audio_recognition.ts — added startedSpeakingAt to PreemptiveGenerationInfo, passed speechStartTime
  • agents/src/voice/remote_session.ts — updated serialization to use structured options from turnHandling
  • agents/src/voice/report.ts — added preemptive_generation to report output
  • agents/src/voice/report.test.ts — updated test defaults
  • agents/src/voice/turn_config/utils.test.ts — added preemptive generation migration tests

Test plan

  • All 645 tests in agents/src/ pass
  • Build succeeds (pnpm build:agents)
  • Lint passes (pnpm lint)
  • Formatting passes (pnpm format:check)
  • Verify restaurant_agent.ts works in Agent Playground with default options (preemptive generation enabled, TTS deferred)
  • Verify restaurant_agent.ts with preemptiveTts: true for full preemptive pipeline
  • Verify preemptiveGeneration: { enabled: false } correctly disables preemptive generation

cc @toubatbrian @livekit/agent-devs

Port of livekit/agents#5428. Adds PreemptiveGenerationOptions with
configurable options to reduce wasted compute during preemptive generation:
- preemptiveTts (default false): when false, only LLM runs preemptively
  and TTS starts after the turn is confirmed
- maxSpeechDuration (default 10s): skip preemptive generation when user
  has been speaking too long
- maxRetries (default 3): cap preemptive LLM requests per user turn,
  resets on turn completion

The preemptiveGeneration parameter now lives inside turnHandling options.
The old top-level boolean is deprecated with a migration path.

https://claude.ai/code/session_01C6K9wneUorBnm9eK2rZjpt

changeset-bot Bot commented Apr 16, 2026

🦋 Changeset detected

Latest commit: e414064

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 25 packages
Name Type
@livekit/agents Patch
@livekit/agents-plugin-anam Patch
@livekit/agents-plugin-assemblyai Patch
@livekit/agents-plugin-baseten Patch
@livekit/agents-plugin-bey Patch
@livekit/agents-plugin-cartesia Patch
@livekit/agents-plugin-cerebras Patch
@livekit/agents-plugin-deepgram Patch
@livekit/agents-plugin-elevenlabs Patch
@livekit/agents-plugin-google Patch
@livekit/agents-plugin-hedra Patch
@livekit/agents-plugin-inworld Patch
@livekit/agents-plugin-lemonslice Patch
@livekit/agents-plugin-livekit Patch
@livekit/agents-plugin-mistral Patch
@livekit/agents-plugin-neuphonic Patch
@livekit/agents-plugin-openai Patch
@livekit/agents-plugin-phonic Patch
@livekit/agents-plugin-resemble Patch
@livekit/agents-plugin-rime Patch
@livekit/agents-plugin-sarvam Patch
@livekit/agents-plugin-silero Patch
@livekit/agents-plugins-test Patch
@livekit/agents-plugin-trugen Patch
@livekit/agents-plugin-xai Patch

CLAassistant commented Apr 16, 2026

CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you all sign our Contributor License Agreement before we can accept your contribution.
1 out of 2 committers have signed the CLA.

✅ toubatbrian
❌ claude

@devin-ai-integration devin-ai-integration Bot left a comment

✅ Devin Review: No Issues Found

Devin Review analyzed this PR and found no potential bugs to report.

View in Devin Review to see 5 additional findings.

@toubatbrian toubatbrian requested a review from a team April 17, 2026 00:10

@devin-ai-integration devin-ai-integration Bot left a comment

Devin Review found 1 new potential issue.

View 11 additional findings in Devin Review.

@@ -1193,6 +1195,19 @@ export class AgentActivity implements RecognitionHooks {

this.cancelPreemptiveGeneration();

🔴 cancelPreemptiveGeneration() called before guard checks causes premature cancellation of valid speculative result

cancelPreemptiveGeneration() at line 1196 unconditionally cancels the existing preemptive generation before the new maxSpeechDuration (line 1198-1203) and maxRetries (line 1205-1207) guard checks. When either guard triggers an early return, the previous valid preemptive generation has already been cancelled and _preemptiveGeneration is set to undefined (via cancelPreemptiveGeneration at agents/src/voice/agent_activity.ts:1242-1246). When userTurnCompleted later runs, it checks this._preemptiveGeneration !== undefined at line 1705 and finds nothing — forcing a non-preemptive fallback via generateReply(). This defeats the purpose of the feature: the last successful speculative generation (e.g., the Nth attempt before hitting maxRetries) is thrown away, increasing response latency. The unit tests don't catch this because cancelPreemptiveGeneration is mocked as a no-op vi.fn() in the test harness (agent_activity.test.ts:269).

Prompt for agents
In agents/src/voice/agent_activity.ts, the onPreemptiveGeneration method calls this.cancelPreemptiveGeneration() on line 1196 before checking the maxSpeechDuration and maxRetries guards (lines 1198-1207). When either guard triggers an early return, the existing preemptive generation (which was the best speculative result so far) has already been cancelled with no replacement.

The fix is to move this.cancelPreemptiveGeneration() after the guard checks, right before this._preemptiveGenerationCount++ on line 1209. This way, the existing preemptive generation is only cancelled when a new one is about to be created.

Alternatively, move the guard checks (maxSpeechDuration and maxRetries) above the cancelPreemptiveGeneration() call.

The test in agent_activity.test.ts should also be updated: the cancelPreemptiveGeneration mock (line 269) doesn't test the real interaction between cancel and the guards. Consider verifying that _preemptiveGeneration is preserved (not set to undefined) when the guards trigger an early return.

Comment thread examples/src/basic_agent.ts Outdated
@toubatbrian toubatbrian requested a review from theomonnom April 17, 2026 00:28
@toubatbrian toubatbrian merged commit 216f763 into main Apr 17, 2026
8 of 9 checks passed
@toubatbrian toubatbrian deleted the claude/practical-archimedes-mqsTa branch April 17, 2026 01:50
@github-actions github-actions Bot mentioned this pull request Apr 17, 2026