fix(agents): release initMutex after warming to restore pool concurrency #1214
Merged
toubatbrian merged 2 commits into livekit:main on Apr 8, 2026
🦋 Changeset detected
Latest commit: 1ffb239
The changes in this PR will be included in the next version bump. This PR includes changesets to release 22 packages.
Contributor:
@drain-zine Can you update the changeset plz?

Contributor, Author:
@toubatbrian have done so! let me know how it looks -> 1ffb239
toubatbrian approved these changes on Apr 8, 2026
Description
`ProcPool` holds `initMutex` across `proc.join()`, which serialises the pool to an effective concurrency of 1 regardless of `numIdleProcesses`. Child procs spawned by `JobProcExecutor` are strictly one-shot:
- `JobProcExecutor.launchJob` throws `"process already has a running job"` if called twice (`agents/src/ipc/job_proc_executor.ts:111`).
- A second `startJobRequest` fails with `"job task already running"` (`agents/src/ipc/job_proc_lazy_main.ts:264`).
- The child ends with `process.exit(0)` (`agents/src/ipc/job_proc_lazy_main.ts:303`).

`SupervisedProc.#joinFuture` only resolves on the child's `'exit'` event (`agents/src/ipc/supervised_proc.ts:163`), so `join()` returns only after the child has exited.

Because `initMutex` was released in the outer `finally` after `await proc.join()`, the mutex was held for the entire lifetime of every running job. Since warming a replacement proc also requires `initMutex`, no replacement could initialise until the current job finished, making the pool effectively single-threaded.

We've been running this patch in production (via `pnpm patch` against `@livekit/agents@1.2.3`) for our voice agent. Before the patch: second concurrent sessions reliably timed out upstream regardless of `numIdleProcesses`, as the agent was blocked from dispatching workers to join the room. After the patch: concurrent sessions warm in parallel as expected at `numIdleProcesses: 3`.

Changes Made
- `initMutex` is now released immediately after `warmedProcQueue.put()` succeeds, before `await proc.join()`.
- `initReleased` / `procUnlockTransferred` flags ensure the `finally` block doesn't double-unlock on the error path (e.g. if `initialize()` throws before enqueue).
- `agents/src/ipc/proc_pool.test.ts`:
  - `releases initMutex after warming so concurrent procWatchTasks can initialise`: asserts `initMutex` is released while `proc.join()` is still pending, and that the `finally` doesn't double-release.
  - `releases initMutex in finally when initialization fails before enqueue`: asserts both locks are still released on the error path.

Pre-Review Checklist
- `pnpm -w format:write`, `pnpm -w lint:fix`, and `pnpm --filter @livekit/agents exec vitest run src/ipc/proc_pool.test.ts` all pass locally
- `proc_pool.ts` and `proc_pool.test.ts` touched

Testing
- `proc_pool.test.ts`
- (`proc_pool.test.ts`)
- `restaurant_agent.ts` and `realtime_agent.ts` work properly: N/A, change is internal to `ProcPool` and doesn't affect agent APIs

Additional Notes
We've been running this patch in production (via `pnpm patch` against `@livekit/agents@1.2.3`) for our LiveKit agent handling Twilio inbound calls. Before the patch: second concurrent call reliably failed with `no-answer`. After the patch: concurrent calls work as expected with `numIdleProcesses: 3`.

Happy to adjust the flag-based double-unlock guard if you'd prefer a different structure (e.g. nulling out the `unlock` reference instead).
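
For readers skimming the thread, the release-before-join pattern described above can be sketched roughly as follows. This is an illustrative stand-in, not the code in `proc_pool.ts`: `SketchMutex`, `SketchProc`, `procWatchTask`'s signature, the plain array used in place of `warmedProcQueue`, and `demo` are all assumptions made for the example, and only the `initReleased` flag is modelled (the `procUnlockTransferred` flag is left out).

```typescript
// Minimal promise-based mutex written for this sketch only (hypothetical,
// not the Mutex used by @livekit/agents).
class SketchMutex {
  private tail: Promise<void> = Promise.resolve();
  unlockCount = 0; // lets the demo check that no double-unlock happens

  // Resolves with an unlock function once every earlier holder has released.
  async lock(): Promise<() => void> {
    let release!: () => void;
    const next = new Promise<void>((resolve) => {
      release = () => resolve();
    });
    const prev = this.tail;
    this.tail = prev.then(() => next);
    await prev;
    return () => {
      this.unlockCount++;
      release();
    };
  }
}

// Hypothetical stand-in for a supervised child process.
interface SketchProc {
  initialize(): Promise<void>;
  join(): Promise<void>; // resolves only when the child has exited
}

async function procWatchTask(
  initMutex: SketchMutex,
  warmed: SketchProc[],
  proc: SketchProc,
): Promise<void> {
  const unlock = await initMutex.lock();
  let initReleased = false;
  try {
    await proc.initialize();
    warmed.push(proc); // stands in for warmedProcQueue.put()
    // Release before join() so the next proc can start warming right away.
    initReleased = true;
    unlock();
    await proc.join();
  } finally {
    // Reached with initReleased === false only if initialize() threw.
    if (!initReleased) unlock();
  }
}

// Starts three watch tasks and counts how many procs are warmed
// while none of the children has exited yet.
async function demo(): Promise<{ warmedWhileJoinPending: number; unlocks: number }> {
  const initMutex = new SketchMutex();
  const warmed: SketchProc[] = [];
  const exits: Array<() => void> = [];
  const makeProc = (): SketchProc => {
    const exited = new Promise<void>((resolve) => {
      exits.push(() => resolve());
    });
    return { initialize: async () => {}, join: () => exited };
  };
  const tasks = [makeProc(), makeProc(), makeProc()].map((p) =>
    procWatchTask(initMutex, warmed, p),
  );
  // Give the tasks a chance to run; every join() is still pending here.
  await new Promise<void>((resolve) => setTimeout(resolve, 0));
  const warmedWhileJoinPending = warmed.length;
  exits.forEach((exit) => exit());
  await Promise.all(tasks);
  return { warmedWhileJoinPending, unlocks: initMutex.unlockCount };
}
```

If the `unlock()` call were moved back into the `finally` after `await proc.join()`, the second and third tasks in `demo` would stay parked in `lock()` until the first child exited, which is the serialisation the description reports. Nulling out the `unlock` reference after the early release would be an equivalent guard to the boolean flag used here.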