
fix(agents): release initMutex after warming to restore pool concurrency #1214

Merged: toubatbrian merged 2 commits into livekit:main from lottiehq-oss:fix/proc-pool-init-mutex-serialization on Apr 8, 2026

drain-zine (Contributor) commented Apr 8, 2026

Description

ProcPool holds initMutex across proc.join(), which serialises the pool to effective concurrency 1 regardless of numIdleProcesses. Child procs spawned by JobProcExecutor are strictly one-shot:

  • JobProcExecutor.launchJob throws "process already has a running job" if called twice (agents/src/ipc/job_proc_executor.ts:111).
  • The child's message handler throws "job task already running" on a second startJobRequest (agents/src/ipc/job_proc_lazy_main.ts:264).
  • After the single job completes, the child calls process.exit(0) (agents/src/ipc/job_proc_lazy_main.ts:303).
  • The parent's SupervisedProc.#join Future only resolves on the child's 'exit' event (agents/src/ipc/supervised_proc.ts:163), so join() returns only after the child has exited.

Because initMutex was released in the outer finally after await proc.join(), the mutex was held for the entire lifetime of every running job. Since warming a replacement proc also requires initMutex, no replacement could initialise until the current job finished — making the pool effectively single-threaded.
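The serialisation described above can be reproduced in isolation. The sketch below is not the code from proc_pool.ts: SimpleMutex, watchProcHoldingLock, runTwoWatchers, and the sleep durations are hypothetical stand-ins, chosen only to show what happens when the lock taken for initialisation is held until the job has finished.

```typescript
// Minimal FIFO mutex, a stand-in for the pool's real lock (assumed shape:
// lock() resolves with an unlock function).
class SimpleMutex {
  private tail: Promise<void> = Promise.resolve();

  async lock(): Promise<() => void> {
    let release!: () => void;
    const held = new Promise<void>((resolve) => {
      release = resolve;
    });
    const prev = this.tail;
    this.tail = prev.then(() => held);
    await prev; // wait for every earlier holder to release
    return release;
  }
}

const sleep = (ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms));

// Hypothetical watcher with the pre-fix shape: the lock taken for
// initialisation is only released after the one-shot job has finished.
async function watchProcHoldingLock(
  mutex: SimpleMutex,
  id: number,
  log: string[],
): Promise<void> {
  const unlock = await mutex.lock();
  try {
    await sleep(5); // stands in for initialize()
    log.push(`warm:${id}`);
    await sleep(20); // stands in for join(), which resolves on child exit
    log.push(`exit:${id}`);
  } finally {
    unlock();
  }
}

// Starts two watchers at the same time and records the order of events.
async function runTwoWatchers(): Promise<string[]> {
  const mutex = new SimpleMutex();
  const log: string[] = [];
  await Promise.all([1, 2].map((id) => watchProcHoldingLock(mutex, id, log)));
  return log;
}
```

Calling runTwoWatchers() resolves to ["warm:1", "exit:1", "warm:2", "exit:2"]: the second watcher cannot even begin warming until the first job has exited, which is the effective concurrency of 1 that this PR reports.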

We've been running this patch in production (via pnpm patch against @livekit/agents@1.2.3) for our voice agent. Before the patch, a second concurrent session reliably timed out upstream regardless of numIdleProcesses, because the agent could not dispatch a worker to join the room. After the patch, concurrent sessions warm in parallel as expected with numIdleProcesses: 3.

Changes Made

  • Release initMutex immediately after warmedProcQueue.put() succeeds, before await proc.join().
  • Track release state with initReleased / procUnlockTransferred flags so the finally block unlocks only what is still held: no double-unlock after the early release, and the locks are still released on the error path (e.g. if initialize() throws before enqueue).
  • Add two regression tests in agents/src/ipc/proc_pool.test.ts:
    • releases initMutex after warming so concurrent procWatchTasks can initialise — asserts initMutex is released while proc.join() is still pending, and that the finally doesn't double-release.
    • releases initMutex in finally when initialization fails before enqueue — asserts both locks are still released on the error path.
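The early release and the guarded finally listed above can be sketched roughly as follows. This is an illustration under assumptions, not the code from proc_pool.ts: FakeProc, the procWatchTask signature, the enqueue callback, and countingLock are hypothetical stand-ins, and only the init lock is modelled (the procUnlockTransferred handling is left out). The initReleased flag name is taken from the description.

```typescript
// Hypothetical stand-in for a supervised child process.
interface FakeProc {
  initialize(): Promise<void>;
  join(): Promise<void>; // resolves only once the one-shot job has exited
}

async function procWatchTask(
  lockInit: () => Promise<() => void>,
  proc: FakeProc,
  enqueue: (proc: FakeProc) => void,
): Promise<void> {
  const unlockInit = await lockInit();
  let initReleased = false;
  try {
    await proc.initialize();
    enqueue(proc);
    // Warm and queued: release now so the next proc can start initialising
    // while this one runs its job.
    unlockInit();
    initReleased = true;
    await proc.join();
  } finally {
    // Only reached with the lock still held if initialize() or enqueue threw.
    if (!initReleased) unlockInit();
  }
}

// Test helper: a lock that never blocks and counts unlock calls, so the
// exactly-once property can be checked.
function countingLock() {
  const state = { unlocks: 0 };
  const lock = async () => () => {
    state.unlocks += 1;
  };
  return { state, lock };
}
```

With countingLock, the two regression scenarios reduce to simple checks: on the success path the counter reaches 1 while join() is still pending and stays at 1 after it resolves, and on the failure path (initialize() rejects) the counter is 1 once the task has rejected.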

Pre-Review Checklist

  • Build passes: pnpm -w format:write, pnpm -w lint:fix, and pnpm --filter @livekit/agents exec vitest run src/ipc/proc_pool.test.ts all pass locally
  • AI-generated code reviewed: Comments tightened, no dead code
  • Changes explained: See above
  • Scope appropriate: Only proc_pool.ts and proc_pool.test.ts touched
  • Video demo: N/A — concurrency bug is not visually reproducible in Playground; covered by unit tests instead

Testing

  • Automated tests added/updated — two new regression tests in proc_pool.test.ts
  • All tests pass (5/5 in proc_pool.test.ts)
  • Make sure both restaurant_agent.ts and realtime_agent.ts work properly — N/A, change is internal to ProcPool and doesn't affect agent APIs

Additional Notes

We've been running this patch in production (via pnpm patch against @livekit/agents@1.2.3) for our LiveKit agent handling Twilio inbound calls. Before the patch, a second concurrent call reliably failed with no-answer. After the patch, concurrent calls work as expected with numIdleProcesses: 3.

Happy to adjust the flag-based double-unlock guard if you'd prefer a different structure (e.g. nulling out the unlock reference instead).
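For comparison, the alternative structure mentioned above (nulling out the unlock reference instead of tracking a flag) might look like the sketch below. The once helper, the Unlock type, and warmThenJoin are hypothetical names used for illustration only and are not part of this PR.

```typescript
type Unlock = () => void;

// Wraps an unlock function so that only the first call has any effect.
function once(unlock: Unlock): Unlock {
  let pending: Unlock | null = unlock;
  return () => {
    const fn = pending;
    pending = null; // null out the reference so later calls are no-ops
    fn?.();
  };
}

// Hypothetical watcher shape using the wrapper instead of a boolean flag.
async function warmThenJoin(
  lockInit: () => Promise<Unlock>,
  warm: () => Promise<void>,
  join: () => Promise<void>,
): Promise<void> {
  const unlockInit = once(await lockInit());
  try {
    await warm();
    unlockInit(); // early release once the proc is warm
    await join();
  } finally {
    unlockInit(); // no-op on the success path, real unlock on the error path
  }
}
```

The trade-off is mostly one of readability: the finally block becomes unconditional, at the cost of hiding the "already released" state inside the wrapper.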

changeset-bot (Bot) commented Apr 8, 2026

🦋 Changeset detected

Latest commit: 1ffb239

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 22 packages
| Name | Type |
| --- | --- |
| @livekit/agents | Patch |
| @livekit/agents-plugin-anam | Patch |
| @livekit/agents-plugin-baseten | Patch |
| @livekit/agents-plugin-bey | Patch |
| @livekit/agents-plugin-cartesia | Patch |
| @livekit/agents-plugin-deepgram | Patch |
| @livekit/agents-plugin-elevenlabs | Patch |
| @livekit/agents-plugin-google | Patch |
| @livekit/agents-plugin-hedra | Patch |
| @livekit/agents-plugin-inworld | Patch |
| @livekit/agents-plugin-lemonslice | Patch |
| @livekit/agents-plugin-livekit | Patch |
| @livekit/agents-plugin-neuphonic | Patch |
| @livekit/agents-plugin-openai | Patch |
| @livekit/agents-plugin-phonic | Patch |
| @livekit/agents-plugin-resemble | Patch |
| @livekit/agents-plugin-rime | Patch |
| @livekit/agents-plugin-sarvam | Patch |
| @livekit/agents-plugin-silero | Patch |
| @livekit/agents-plugins-test | Patch |
| @livekit/agents-plugin-trugen | Patch |
| @livekit/agents-plugin-xai | Patch |


CLAassistant commented Apr 8, 2026

CLA assistant check
All committers have signed the CLA.

devin-ai-integration (Bot, Contributor) left a comment

✅ Devin Review: No Issues Found

Devin Review analyzed this PR and found no potential bugs to report.

View in Devin Review to see 3 additional findings.


toubatbrian (Contributor) commented

@drain-zine Can you update the changeset plz?

drain-zine (Contributor, Author) commented Apr 8, 2026

> @drain-zine Can you update the changeset plz?

@toubatbrian have done so! let me know how it looks -> 1ffb239

toubatbrian merged commit 3e65344 into livekit:main on Apr 8, 2026
6 checks passed
github-actions (Bot) mentioned this pull request Apr 7, 2026