feat(voice): add PreemptiveGenerationOptions for fine-grained control by toubatbrian · Pull Request #1265 · livekit/agents-js

toubatbrian · 2026-04-16T08:15:48Z

Summary

Port of livekit/agents#5428 — adds PreemptiveGenerationOptions with configurable options to reduce wasted compute during preemptive generation.

Changes

New PreemptiveGenerationOptions interface (turn_config/preemptive_generation.ts) with:
- enabled (default true): whether preemptive generation is active
- preemptiveTts (default false): when false, only LLM runs preemptively; TTS starts once the turn is confirmed and speech is scheduled
- maxSpeechDuration (default 10000 ms): skip preemptive generation once the user has been speaking longer than this, since long utterances are more likely to change
- maxRetries (default 3): cap preemptive LLM requests per user turn; counter resets on turn completion
Moved preemptiveGeneration into TurnHandlingOptions: The option is now configured via turnHandling.preemptiveGeneration alongside endpointing and interruption. The old top-level preemptiveGeneration: boolean on AgentSessionOptions is deprecated with a backward-compatible migration path.
Pipeline TTS deferral: In _pipelineReplyTaskImpl, TTS inference is now deferred until after waitForScheduled by default. When preemptiveTts: true, TTS starts immediately alongside the LLM (previous behavior).
Speech duration & retry guards in onPreemptiveGeneration: Preemptive generation is skipped if the user has been speaking longer than maxSpeechDuration, or if maxRetries attempts have already been made for the current turn.
PreemptiveGenerationInfo extended with startedSpeakingAt to enable the speech duration check.
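
As a rough illustration of the options listed above, the shape and defaults could look like the following. This is a minimal sketch based only on the field names and default values in this description; the constant name `PREEMPTIVE_GENERATION_DEFAULTS` is an assumption, and the real interface in turn_config/preemptive_generation.ts may mark fields optional or differ in other details.

```typescript
// Sketch only: field names and defaults are taken from the PR description.
// The constant name below is hypothetical.
interface PreemptiveGenerationOptions {
  /** Whether preemptive generation is active. */
  enabled: boolean;
  /** When false, only the LLM runs preemptively; TTS waits for the confirmed turn. */
  preemptiveTts: boolean;
  /** Skip preemptive generation once the user has spoken longer than this (milliseconds). */
  maxSpeechDuration: number;
  /** Maximum number of preemptive LLM requests per user turn. */
  maxRetries: number;
}

const PREEMPTIVE_GENERATION_DEFAULTS: PreemptiveGenerationOptions = {
  enabled: true,
  preemptiveTts: false,
  maxSpeechDuration: 10_000,
  maxRetries: 3,
};
```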

Usage

const session = new AgentSession({
  turnHandling: {
    endpointing: { minDelay: 300 },
    interruption: { enabled: false },
    preemptiveGeneration: { preemptiveTts: true, maxRetries: 5 },
  },
});
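
For the deprecated top-level boolean, the backward-compatible migration might be pictured roughly as below. This is a hypothetical sketch, not the code in turn_config/utils.ts: the type names, the function name `migratePreemptiveGeneration`, and the rule that the structured option takes precedence over the legacy boolean are all assumptions made for illustration.

```typescript
// Hypothetical sketch of mapping the deprecated boolean onto the structured option.
// All names here are illustrative; the precedence rule is an assumption.
interface PreemptiveGenerationConfig {
  enabled?: boolean;
  preemptiveTts?: boolean;
  maxSpeechDuration?: number;
  maxRetries?: number;
}

interface SessionOptionsSketch {
  /** Deprecated top-level flag. */
  preemptiveGeneration?: boolean;
  turnHandling?: { preemptiveGeneration?: PreemptiveGenerationConfig };
}

function migratePreemptiveGeneration(options: SessionOptionsSketch): PreemptiveGenerationConfig {
  const structured = options.turnHandling?.preemptiveGeneration;
  // Assumed rule: an explicit structured option wins over the legacy boolean.
  if (structured !== undefined) return structured;
  // Legacy boolean only toggles `enabled`; other fields fall back to defaults later.
  if (options.preemptiveGeneration !== undefined) {
    return { enabled: options.preemptiveGeneration };
  }
  return {};
}

// Old-style configuration still disables the feature after migration.
const migrated = migratePreemptiveGeneration({ preemptiveGeneration: false });
```

Under these assumptions, callers that still pass `preemptiveGeneration: false` at the top level keep their behavior, while new code configures everything under `turnHandling`.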

Implementation nuances (JS vs Python)

Aspect	Python	JS/TS
Time units	Seconds (`max_speech_duration: 10.0`)	Milliseconds (`maxSpeechDuration: 10_000`)
TypedDict vs Interface	`PreemptiveGenerationOptions(TypedDict, total=False)`	`interface PreemptiveGenerationOptions` with `Partial<>` where needed
TTS stream teeing	Python uses `text_tee` with lazy branch creation	JS always tees upfront via `ReadableStream.tee()` but defers `performTTSInference()` call
Default resolution	`{**defaults, **config}` dict merge	`{ ...defaults, ...stripUndefined(config) }` spread with undefined filtering
Counter field	`_preemptive_generation_count: int` on class	`private _preemptiveGenerationCount = 0`
Naming	`snake_case` (`preemptive_tts`, `max_speech_duration`)	`camelCase` (`preemptiveTts`, `maxSpeechDuration`)
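
The "Default resolution" row is worth a small illustration: a plain object spread lets an explicit `undefined` in the user config overwrite a default, which is why the JS side filters undefined values first. The sketch below assumes a simple implementation of `stripUndefined` and a hypothetical `resolveOptions` helper; the actual helpers in turn_config/utils.ts may differ.

```typescript
// Sketch only: `stripUndefined` is named in the table above, but this body
// and the `resolveOptions` helper are assumptions made for illustration.
interface PreemptiveGenerationOptions {
  enabled: boolean;
  preemptiveTts: boolean;
  maxSpeechDuration: number; // milliseconds
  maxRetries: number;
}

const defaults: PreemptiveGenerationOptions = {
  enabled: true,
  preemptiveTts: false,
  maxSpeechDuration: 10_000,
  maxRetries: 3,
};

function stripUndefined(
  config: Partial<PreemptiveGenerationOptions>,
): Partial<PreemptiveGenerationOptions> {
  // Keep only keys whose value was actually provided.
  const defined = Object.entries(config).filter(([, value]) => value !== undefined);
  return Object.fromEntries(defined) as Partial<PreemptiveGenerationOptions>;
}

function resolveOptions(
  config: Partial<PreemptiveGenerationOptions> = {},
): PreemptiveGenerationOptions {
  return { ...defaults, ...stripUndefined(config) };
}

// An explicit `undefined` no longer clobbers the default.
const resolved = resolveOptions({ maxRetries: 5, preemptiveTts: undefined });
// A naive spread keeps the undefined value instead.
const naive = { ...defaults, ...{ maxRetries: 5, preemptiveTts: undefined } };
```

With the filtering step, `resolved.preemptiveTts` stays at the default `false`, whereas `naive.preemptiveTts` ends up `undefined`.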

Files changed

agents/src/voice/turn_config/preemptive_generation.ts — new: PreemptiveGenerationOptions interface and defaults
agents/src/voice/turn_config/turn_handling.ts — added preemptiveGeneration to TurnHandlingOptions and InternalTurnHandlingOptions
agents/src/voice/turn_config/utils.ts — updated migration logic, mergeWithDefaults, deprecated boolean migration
agents/src/voice/agent_session.ts — deprecated top-level preemptiveGeneration, removed from defaults
agents/src/voice/agent_activity.ts — added retry count, speech duration check, preemptive TTS deferral
agents/src/voice/audio_recognition.ts — added startedSpeakingAt to PreemptiveGenerationInfo, passed speechStartTime
agents/src/voice/remote_session.ts — updated serialization to use structured options from turnHandling
agents/src/voice/report.ts — added preemptive_generation to report output
agents/src/voice/report.test.ts — updated test defaults
agents/src/voice/turn_config/utils.test.ts — added preemptive generation migration tests

Test plan

All 645 tests in agents/src/ pass
Build succeeds (pnpm build:agents)
Lint passes (pnpm lint)
Formatting passes (pnpm format:check)
Verify restaurant_agent.ts works in Agent Playground with default options (preemptive generation enabled, TTS deferred)
Verify restaurant_agent.ts with preemptiveTts: true for full preemptive pipeline
Verify preemptiveGeneration: { enabled: false } correctly disables preemptive generation

cc @toubatbrian @livekit/agent-devs

Port of livekit/agents#5428.

Adds PreemptiveGenerationOptions with configurable options to reduce wasted compute during preemptive generation:
- preemptiveTts (default false): when false, only LLM runs preemptively and TTS starts after the turn is confirmed
- maxSpeechDuration (default 10s): skip preemptive generation when user has been speaking too long
- maxRetries (default 3): cap preemptive LLM requests per user turn, resets on turn completion

The preemptiveGeneration parameter now lives inside turnHandling options. The old top-level boolean is deprecated with a migration path.

https://claude.ai/code/session_01C6K9wneUorBnm9eK2rZjpt

changeset-bot · 2026-04-16T08:15:57Z

🦋 Changeset detected

Latest commit: e414064

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 25 packages

Name	Type
@livekit/agents	Patch
@livekit/agents-plugin-anam	Patch
@livekit/agents-plugin-assemblyai	Patch
@livekit/agents-plugin-baseten	Patch
@livekit/agents-plugin-bey	Patch
@livekit/agents-plugin-cartesia	Patch
@livekit/agents-plugin-cerebras	Patch
@livekit/agents-plugin-deepgram	Patch
@livekit/agents-plugin-elevenlabs	Patch
@livekit/agents-plugin-google	Patch
@livekit/agents-plugin-hedra	Patch
@livekit/agents-plugin-inworld	Patch
@livekit/agents-plugin-lemonslice	Patch
@livekit/agents-plugin-livekit	Patch
@livekit/agents-plugin-mistral	Patch
@livekit/agents-plugin-neuphonic	Patch
@livekit/agents-plugin-openai	Patch
@livekit/agents-plugin-phonic	Patch
@livekit/agents-plugin-resemble	Patch
@livekit/agents-plugin-rime	Patch
@livekit/agents-plugin-sarvam	Patch
@livekit/agents-plugin-silero	Patch
@livekit/agents-plugins-test	Patch
@livekit/agents-plugin-trugen	Patch
@livekit/agents-plugin-xai	Patch

CLAassistant · 2026-04-16T08:15:57Z

Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you all sign our Contributor License Agreement before we can accept your contribution.
1 out of 2 committers have signed the CLA.

✅ toubatbrian
❌ claude

devin-ai-integration

✅ Devin Review: No Issues Found

Devin Review analyzed this PR and found no potential bugs to report.

View in Devin Review to see 5 additional findings.

devin-ai-integration

Devin Review found 1 new potential issue.

View 11 additional findings in Devin Review.

devin-ai-integration · 2026-04-17T00:13:49Z

@@ -1193,6 +1195,19 @@ export class AgentActivity implements RecognitionHooks {

    this.cancelPreemptiveGeneration();


🔴 cancelPreemptiveGeneration() called before guard checks causes premature cancellation of valid speculative result

cancelPreemptiveGeneration() at line 1196 unconditionally cancels the existing preemptive generation before the new maxSpeechDuration (line 1198-1203) and maxRetries (line 1205-1207) guard checks. When either guard triggers an early return, the previous valid preemptive generation has already been cancelled and _preemptiveGeneration is set to undefined (via cancelPreemptiveGeneration at agents/src/voice/agent_activity.ts:1242-1246). When userTurnCompleted later runs, it checks this._preemptiveGeneration !== undefined at line 1705 and finds nothing — forcing a non-preemptive fallback via generateReply(). This defeats the purpose of the feature: the last successful speculative generation (e.g., the Nth attempt before hitting maxRetries) is thrown away, increasing response latency. The unit tests don't catch this because cancelPreemptiveGeneration is mocked as a no-op vi.fn() in the test harness (agent_activity.test.ts:269).

Prompt for agents

In agents/src/voice/agent_activity.ts, the onPreemptiveGeneration method calls this.cancelPreemptiveGeneration() on line 1196 before checking the maxSpeechDuration and maxRetries guards (lines 1198-1207). When either guard triggers an early return, the existing preemptive generation (which was the best speculative result so far) has already been cancelled with no replacement. The fix is to move this.cancelPreemptiveGeneration() after the guard checks, right before this._preemptiveGenerationCount++ on line 1209. This way, the existing preemptive generation is only cancelled when a new one is about to be created. Alternatively, move the guard checks (maxSpeechDuration and maxRetries) above the cancelPreemptiveGeneration() call. The test in agent_activity.test.ts should also be updated: the cancelPreemptiveGeneration mock (line 269) doesn't test the real interaction between cancel and the guards. Consider verifying that _preemptiveGeneration is preserved (not set to undefined) when the guards trigger an early return.
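
The reordering suggested above can be sketched in isolation. The class below is a simplified stand-in, not the real AgentActivity: the class name, the string used to represent a pending generation, and the explicit `now` parameter are assumptions, and `startedSpeakingAt` is treated as a millisecond timestamp. It only shows the guards running before the cancel, so that an early return leaves the previous speculative result in place.

```typescript
// Simplified stand-in for the guard ordering discussed in the review.
// Names and types are hypothetical; only the ordering is the point.
interface PreemptiveGenerationInfo {
  startedSpeakingAt?: number; // assumed to be a millisecond timestamp
}

class PreemptiveGuardSketch {
  current: string | undefined = undefined; // stands in for _preemptiveGeneration
  count = 0; // stands in for _preemptiveGenerationCount
  maxSpeechDuration: number;
  maxRetries: number;

  constructor(maxSpeechDuration: number, maxRetries: number) {
    this.maxSpeechDuration = maxSpeechDuration;
    this.maxRetries = maxRetries;
  }

  cancelPreemptiveGeneration(): void {
    this.current = undefined;
  }

  /** Returns true when a new preemptive generation was started. */
  onPreemptiveGeneration(info: PreemptiveGenerationInfo, now: number): boolean {
    const startedAt = info.startedSpeakingAt;
    // Guard 1: the user has been speaking too long; keep the existing result.
    if (startedAt !== undefined && now - startedAt > this.maxSpeechDuration) return false;
    // Guard 2: retry budget for this turn is spent; keep the existing result.
    if (this.count >= this.maxRetries) return false;

    // Only cancel once a replacement is about to be created.
    this.cancelPreemptiveGeneration();
    this.count++;
    this.current = `generation-${this.count}`;
    return true;
  }
}

const guard = new PreemptiveGuardSketch(10_000, 2);
const first = guard.onPreemptiveGeneration({ startedSpeakingAt: 0 }, 1_000);
const second = guard.onPreemptiveGeneration({ startedSpeakingAt: 0 }, 2_000);
const third = guard.onPreemptiveGeneration({ startedSpeakingAt: 0 }, 3_000); // retry cap hit
const late = guard.onPreemptiveGeneration({ startedSpeakingAt: 0 }, 20_000); // speech too long
```

In this sketch the third and fourth calls are skipped by the guards, and the second generation is still available when the turn completes.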

devin-ai-integration Bot reviewed Apr 16, 2026

toubatbrian added 2 commits April 16, 2026 15:56

remove refs

beb469e

minor fix to make pipeline work

eedaa8b

toubatbrian requested a review from a team April 17, 2026 00:10

devin-ai-integration Bot reviewed Apr 17, 2026

theomonnom reviewed Apr 17, 2026

Comment thread examples/src/basic_agent.ts Outdated

update example config

e8739d6

toubatbrian requested a review from theomonnom April 17, 2026 00:28

Create strong-boxes-cross.md

e414064

theomonnom approved these changes Apr 17, 2026

toubatbrian merged commit 216f763 into main Apr 17, 2026
8 of 9 checks passed

toubatbrian deleted the claude/practical-archimedes-mqsTa branch April 17, 2026 01:50

github-actions Bot mentioned this pull request Apr 17, 2026

Version Packages #1271

Merged

toubatbrian merged 5 commits into main from claude/practical-archimedes-mqsTa

4 participants
