Summary
- Context: The Inworld TTS plugin uses a custom connection pool (_ConnectionPool) that manages multiple _InworldConnection instances. Each connection tracks pending acquisitions via _pending_acquisitions to reserve capacity before context creation completes.
- Bug: _ConnectionPool.acquire_context() catches Exception but not BaseException, so asyncio.CancelledError bypasses cleanup. This leaks the _pending_acquisitions counter AND, for newly created connections, the connection object itself.
- Actual vs. expected: When cancellation occurs during context acquisition, cleanup code is never executed. _pending_acquisitions is never decremented, and newly created connections are never removed from the pool or closed. All cleanup should happen even during cancellation.
- Impact: For existing connections: phantom reservations accumulate, eventually causing capacity exhaustion. For newly created connections: the connection is leaked (never closed, remains in pool with leaked counter).
Impact Details
Primary Impact: Reservation Leak Causes Capacity Exhaustion
When cancellation occurs during pool.acquire_context():
- _pending_acquisitions leaks - Each leaked reservation permanently reduces available capacity
- has_capacity returns False prematurely - At line 704, the pool skips connections with phantom reservations
- is_idle returns False - Connection appears busy even with no active contexts
The pool uses has_capacity at line 704 to route requests:
# Line 702-706: Pool routing logic
for existing in self._connections:
    if not existing._closed and existing.has_capacity:  # <-- Uses has_capacity
        conn = existing
        break
With leaked reservations:
has_capacity = (context_count + _pending_acquisitions) < MAX_CONTEXTS
- If _pending_acquisitions = 5 (after 5 cancellations): has_capacity = (0 + 5) < 5 = False
After 5 cancellations on a connection, that connection becomes permanently unusable despite having no actual contexts.
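The arithmetic is easy to reproduce with a stand-alone sketch. FakeConnection below is a simplified, hypothetical stand-in (not the real _InworldConnection); it only assumes the has_capacity and is_idle formulas quoted in this report:
# Minimal sketch of the reservation leak (hypothetical FakeConnection, not the plugin class)
MAX_CONTEXTS = 5

class FakeConnection:
    def __init__(self) -> None:
        self.context_count = 0
        self._pending_acquisitions = 0

    @property
    def has_capacity(self) -> bool:
        return (self.context_count + self._pending_acquisitions) < MAX_CONTEXTS

    @property
    def is_idle(self) -> bool:
        return self.context_count == 0 and self._pending_acquisitions == 0

conn = FakeConnection()
for i in range(1, 6):
    conn._pending_acquisitions += 1  # reservation leaked by a cancelled acquisition
    print(f"Cancellation #{i}: pending={conn._pending_acquisitions}, has_capacity={conn.has_capacity}")
print(f"context_count={conn.context_count}, is_idle={conn.is_idle}")  # 0 contexts, yet never idle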
Secondary Impact: Idle Cleanup Is BLOCKED (Not Enabled)
# Line 777-782: Idle connection cleanup
if (
    conn.is_idle  # <-- FALSE when _pending_acquisitions > 0
    and now - conn.last_activity > self._idle_timeout
    and len(self._connections) - len(connections_to_close) > 1
):
    connections_to_close.append(conn)
A connection with leaked reservations has is_idle = False (line 239: return self.context_count == 0 and self._pending_acquisitions == 0). The leaked reservations PREVENT cleanup, not enable it.
Tertiary Impact: Connection Leak for created_new=True
When a NEW connection is created and then cancelled during acquire_context():
- Connection is added to pool at line 718
- reserve_capacity() called at line 726
- CancelledError during await conn.acquire_context() at line 731
- Connection cleanup (lines 736-740) is SKIPPED because except Exception: doesn't catch CancelledError
- Connection remains in pool with leaked _pending_acquisitions
- Connection is never closed (WebSocket connection leaked)
Code with bug
# In _ConnectionPool.acquire_context (lines 729-741)
if conn:
    try:
        ctx_id, waiter = await conn.acquire_context(emitter, opts, remaining_timeout)
    except Exception:  # <-- BUG: CancelledError (BaseException) is NOT caught
        # Release reservation since we didn't get a context
        conn.release_reservation()
        # Remove failed new connection from pool
        if created_new:
            async with self._pool_lock:
                if conn in self._connections:
                    self._connections.remove(conn)
            await conn.aclose()
        raise
Evidence
Evidence 1: Caller's handler at line 1256 CANNOT help (CRITICAL)
Reviewer Claim: "The caller already handles CancelledError at line 1256"
FACT: The caller's handler is AFTER the pool call. When cancellation occurs INSIDE pool.acquire_context(), the handler is never reached.
The code flow is:
# Line 1221-1258 in SynthesizeStream._run()
pool = await self._tts._get_pool()
context_id, waiter, connection = await pool.acquire_context(...) # Line 1222 - CANCELLATION CAN OCCUR HERE
# ... code ...
try:
    await asyncio.wait_for(waiter, timeout=self._conn_options.timeout + 60)  # Line 1251
except asyncio.TimeoutError:
    connection.close_context(context_id)
    raise APITimeoutError() from None
except asyncio.CancelledError:  # Line 1256 - HANDLER IS AFTER pool.acquire_context()
    connection.close_context(context_id)
    raise
When cancellation occurs INSIDE pool.acquire_context() (at line 1222), the try block at line 1251 hasn't been entered yet, so the handler at line 1256 NEVER EXECUTES.
Test at /home/user/agents/test_caller_handler_prove.py proves this definitively.
The caller's handler can only help if cancellation occurs AFTER pool.acquire_context() returns, not during.
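This control flow can be reproduced with a minimal asyncio sketch (the names below are placeholders, not the plugin's code): when the task is cancelled during the first await, the later except asyncio.CancelledError handler never runs:
# Hypothetical sketch: cancellation inside the first await skips the caller's later handler
import asyncio

async def acquire_context() -> str:
    await asyncio.sleep(10)  # stand-in for pool.acquire_context(); cancellation lands here
    return "ctx"

async def caller() -> None:
    context_id = await acquire_context()  # <- cancelled here, before the try below is entered
    try:
        await asyncio.sleep(10)  # stand-in for awaiting the synthesis waiter
    except asyncio.CancelledError:
        print(f"caller handler ran for {context_id}")  # never printed in this scenario
        raise

async def main() -> None:
    task = asyncio.create_task(caller())
    await asyncio.sleep(0.1)
    task.cancel()
    try:
        await task
    except asyncio.CancelledError:
        print("task cancelled; the caller's except block never executed")

asyncio.run(main())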
Evidence 2: Integration test using ACTUAL _ConnectionPool and _InworldConnection classes
Test at /home/user/agents/test_real_pool_integration.py:
- Uses real _ConnectionPool and _InworldConnection classes (not mocks)
- Demonstrates leak when cancelled DURING connect() at line 286
Evidence 3: Capacity exhaustion demonstrated
Same test shows accumulation:
MAX_CONTEXTS = 5
Initial: _pending_acquisitions=0, has_capacity=True
Cancellation #1: _pending_acquisitions=1, has_capacity=True
Cancellation #2: _pending_acquisitions=2, has_capacity=True
Cancellation #3: _pending_acquisitions=3, has_capacity=True
Cancellation #4: _pending_acquisitions=4, has_capacity=True
Cancellation #5: _pending_acquisitions=5, has_capacity=False
After 5 cancellations:
_pending_acquisitions: 5
context_count: 0
has_capacity: False
is_idle: False
Evidence 4: Leaked reservations BLOCK cleanup, not enable it
# Line 239: is_idle definition
@property
def is_idle(self) -> bool:
    return self.context_count == 0 and self._pending_acquisitions == 0
# Line 779: Cleanup condition
if conn.is_idle # <-- FALSE when _pending_acquisitions > 0
A connection with leaked reservations has is_idle = False, so it will NEVER be cleaned up by the idle connection pruner.
Evidence 5: Cancellation timing analysis of REAL code
The cancellation window in the REAL _InworldConnection.acquire_context() (lines 274-337):
# Line 286: await self.connect() <- CANCELLATION POINT #1 (network I/O)
# Line 300: async with self._acquire_lock: <- CANCELLATION POINT #2 (waiting for lock)
# Line 317: self.release_reservation() <- BUG WINDOW ENDS HERE
# Line 319: await self._outbound_queue.put() <- SAFE (reservation already released)
Evidence 6: Cancellation during connect() IS realistic in production
Network operations are the MOST COMMON cancellation points in production:
- Agent session termination - User closes browser/app while connection is being established
- Timeout race conditions - asyncio.wait_for() timeout expires during connect()
- Graceful shutdown - Server receives SIGTERM while connections are being created
- Health check failures - Upstream monitoring cancels unhealthy connection attempts
Evidence 7: Framework's ConnectionPool uses except BaseException:
From livekit/agents/utils/connection_pool.py:88-92:
async def connection(self, *, timeout: float) -> AsyncGenerator[T, None]:
    conn = await self.get(timeout=timeout)
    try:
        yield conn
    except BaseException:  # <-- CORRECT: catches CancelledError
        self.remove(conn)
        raise
    else:
        self.put(conn)
Cartesia plugin uses this framework pool (line 168 in cartesia/tts.py). Inworld's custom pool should follow the same pattern.
Evidence 8: Plugin already catches CancelledError elsewhere
Lines 620, 804, 1256 correctly catch asyncio.CancelledError. Line 732 is clearly an oversight.
Evidence 9: except Exception: vs except BaseException: is a known Python pitfall
Since Python 3.8, asyncio.CancelledError has inherited from BaseException, not Exception. This is documented behavior that all async code must handle correctly.
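A self-contained demonstration of the pitfall (simplified local counter, not the plugin's code) - the except Exception: cleanup is skipped when the awaited call is cancelled:
# Hypothetical sketch: except Exception does not catch asyncio.CancelledError (Python 3.8+)
import asyncio

async def acquire() -> None:
    pending = 1  # reservation taken before the await, as in the plugin
    try:
        await asyncio.sleep(10)  # cancellation lands here
        pending = 0
    except Exception:
        pending = 0  # intended "any failure" cleanup; never runs on CancelledError
        raise
    finally:
        print(f"pending on exit: {pending}")  # prints 1 - the reservation leaked

async def main() -> None:
    task = asyncio.create_task(acquire())
    await asyncio.sleep(0.1)
    task.cancel()
    try:
        await task
    except asyncio.CancelledError:
        pass

asyncio.run(main())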
Evidence 10: _handle_connection_error does NOT help with leaked reservations
Reviewer Claim: "When ANY error occurs on a connection, _handle_connection_error is called and sets _closed = True"
FACT: _handle_connection_error is ONLY called when the connection actually encounters an error. A connection with leaked reservations but no active contexts:
- Still has a valid WebSocket connection
- Still has _send_task and _recv_task running normally
- Will NEVER call _handle_connection_error because nothing is wrong with the connection itself
_handle_connection_error is called at lines 430 and 580 in the send/recv loops - but ONLY when those loops encounter errors. A connection with leaked reservations doesn't encounter errors; it just reports has_capacity=False and gets skipped by the pool router.
The connection appears "healthy" but unusable.
Evidence 11: Pool elasticity is NOT a fix - it's a workaround that compounds the problem
Reviewer Claim: "The pool creates new connections when existing ones appear at capacity"
FACT: This is a WORKAROUND, not a fix. Each new connection can also accumulate leaked reservations:
- Connection A: 5 cancellations → has_capacity=False
- Pool creates Connection B
- Connection B: 5 cancellations → has_capacity=False
- Pool creates Connection C
- ... continues until max_connections is reached
With max_connections = 20 and MAX_CONTEXTS = 5, after 100 cancellations distributed across connections, the entire pool is exhausted.
The pool's "elasticity" just delays the inevitable and consumes more resources.
Evidence 12: The cancellation window is reachable in real scenarios
Reviewer Claim: "The cancellation window is extremely narrow - during first WebSocket handshake on a fresh connection"
FACT: The window is NOT just during initial connection:
- Line 286 (await self.connect()) - Protected by _connect_lock, but the FIRST caller to each new connection WILL wait here. This is NOT a no-op for new connections.
- Line 300 (async with self._acquire_lock) - Multiple concurrent requests to the same connection wait here. If one request is inside the lock and a cancellation wave arrives, other waiters are cancelled WHILE WAITING for the lock.
- Line 329 (await asyncio.wait_for(self._context_available.wait(), timeout=remaining)) - If connection is at capacity, callers wait here. Cancellation during this wait also leaks the reservation.
The cancellation window exists EVERY TIME a new context is acquired, not just during initial connection.
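The lock-wait window in particular is straightforward to reproduce with plain asyncio primitives (this sketch uses a bare asyncio.Lock and a local counter, not the plugin's classes): a task cancelled while blocked on the lock skips its except Exception: cleanup:
# Hypothetical sketch: cancellation while waiting on a lock also bypasses except Exception
import asyncio

async def waiter(lock: asyncio.Lock, name: str) -> None:
    pending = 1  # capacity reserved before the lock, mirroring the plugin's ordering
    try:
        async with lock:  # the second task is cancelled while blocked here
            await asyncio.sleep(0.5)
        pending = 0  # normal path: reservation converted into a real context
    except Exception:
        pending = 0  # intended cleanup; skipped on CancelledError
        raise
    finally:
        print(f"{name}: pending on exit = {pending}")

async def main() -> None:
    lock = asyncio.Lock()
    holder = asyncio.create_task(waiter(lock, "holder"))
    blocked = asyncio.create_task(waiter(lock, "blocked"))
    await asyncio.sleep(0.1)
    blocked.cancel()  # cancelled while waiting for the lock -> prints pending = 1
    await asyncio.gather(holder, blocked, return_exceptions=True)

asyncio.run(main())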
Evidence 13: Framework-level cleanup via aclose() doesn't help for in-flight cancellations
Reviewer Claim: "When an AgentSession or TTS instance is closed, aclose() is called, which cancels all background tasks and closes connections"
FACT: aclose() is NOT called when a single request is cancelled. It's called when the ENTIRE TTS instance or session is shut down.
The bug occurs during normal operation when individual requests are cancelled (e.g., user interrupts, timeout expires). The TTS instance remains alive and continues using the pool with leaked reservations.
Evidence 14: The complete call stack shows NO ancestor catches CancelledError for the pool path
From SynthesizeStream._run() to pool.acquire_context():
SynthesizeStream._main_task() [tts.py:464]
└── for retry loop [line 473]
    └── try: [line 475]
        └── await self._run(output_emitter) [line 479]
            └── pool.acquire_context() [inworld/tts.py:1222]
                └── await conn.acquire_context() [line 731]
                    └── await self.connect() [line 286] <- CANCELLATION POINT
The framework's _main_task at line 480 catches except Exception, NOT BaseException:
try:
    await self._run(output_emitter)
except Exception as e:  # <-- Does NOT catch CancelledError
    telemetry_utils.record_exception(attempt_span, e)
    raise
And the retry loop at line 500 catches except APIError, not CancelledError.
NO ancestor in the call stack catches CancelledError for this path. The CancelledError propagates all the way up and terminates the task without cleanup.
Why has this bug gone undetected?
- Cancellation is rare in tests - Unit tests complete normally. Edge cases require explicit design.
- Impact is gradual - Each cancellation leaks one reservation. With MAX_CONTEXTS = 5 and max_connections = 20, the pool has 100 total capacity. The issue compounds slowly.
- Pool creates new connections as fallback - When existing connections appear at capacity, new connections are created, masking the problem until max_connections is hit.
- Symptom looks like timeout - Users see "Timed out waiting for available connection capacity" (line 760) with no obvious root cause.
- Plugin already catches CancelledError elsewhere - Lines 620, 804, and 1256 correctly catch CancelledError. This is an oversight, not a misunderstanding.
- except Exception: looks correct - Code review often misses that CancelledError is a BaseException.
Recommended fix
Change except Exception: to except BaseException: at line 732:
if conn:
    try:
        ctx_id, waiter = await conn.acquire_context(emitter, opts, remaining_timeout)
    except BaseException:  # <-- FIX: catches CancelledError
        conn.release_reservation()
        if created_new:
            async with self._pool_lock:
                if conn in self._connections:
                    self._connections.remove(conn)
            await conn.aclose()
        raise
This matches the framework's ConnectionPool pattern and the plugin's own pattern at lines 620, 804, 1256.
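A hypothetical regression sketch (FakeConnection and pool_acquire are simplified stand-ins, not the plugin's real classes) shows the behavior the fix restores - with except BaseException:, a cancellation during acquisition leaves no leaked reservation:
# Hypothetical sketch verifying the fix: the reservation is released even on cancellation
import asyncio

class FakeConnection:
    def __init__(self) -> None:
        self.pending = 0

    def reserve_capacity(self) -> None:
        self.pending += 1

    def release_reservation(self) -> None:
        self.pending -= 1

    async def acquire_context(self) -> str:
        await asyncio.sleep(10)  # simulates the handshake / lock / capacity wait
        return "ctx"

async def pool_acquire(conn: FakeConnection) -> str:
    conn.reserve_capacity()
    try:
        return await conn.acquire_context()
    except BaseException:  # the proposed fix; except Exception would leak here
        conn.release_reservation()
        raise

async def main() -> None:
    conn = FakeConnection()
    task = asyncio.create_task(pool_acquire(conn))
    await asyncio.sleep(0.1)
    task.cancel()
    try:
        await task
    except asyncio.CancelledError:
        pass
    assert conn.pending == 0, f"leaked reservation: {conn.pending}"
    print("reservation released despite cancellation")

asyncio.run(main())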
Response to Round 7 Reviewer Objections
Objection 1: "Self-healing via _handle_connection_error"
Response: _handle_connection_error is ONLY called when the connection encounters an actual error (WebSocket failure, network error). A connection with leaked reservations:
- Has a valid, healthy WebSocket
- Has _send_task and _recv_task running normally
- Will NEVER trigger _handle_connection_error because nothing is wrong with it
The connection is "healthy but unusable" - it reports has_capacity=False and gets skipped by the pool router, but never gets cleaned up because it never errors.
Objection 2: "Pool mitigation through connection creation"
Response: This is a workaround that COMPOUNDS the problem. Each new connection can also accumulate leaked reservations. After enough cancellations, ALL connections in the pool are affected and max_connections is reached.
The pool's elasticity doesn't fix the leak - it just delays exhaustion and wastes resources (more WebSocket connections, more memory).
Objection 3: "Cancellation window is extremely narrow"
Response: The window exists EVERY TIME a context is acquired:
- Line 286: await self.connect() - First caller to each connection
- Line 300: async with self._acquire_lock - Waiters for lock
- Line 329: await asyncio.wait_for(...) - Waiters for capacity
The window is NOT just "during initial WebSocket handshake." Any network wait or lock acquisition is a cancellation point.
Objection 4: "No actual production evidence"
Response: This is a code inspection task. The bug exists regardless of whether users have reported symptoms. The "Timed out waiting for available connection capacity" error (line 760) is the symptom users would see, but they wouldn't know the root cause.
Production evidence would require:
- Instrumentation to track _pending_acquisitions
- Logging cancellations during connection acquisition
- Monitoring capacity degradation over time
The code defect is clear from inspection.
Objection 5: "Framework-level cleanup via aclose()"
Response: aclose() is called when the ENTIRE TTS instance is shut down, NOT when individual requests are cancelled. Normal cancellation (user interrupt, timeout) does NOT trigger aclose().
The bug occurs during normal operation with individual request cancellations.
Objection 6: "Zero other plugins use except BaseException:"
Response: The framework's ConnectionPool at livekit/agents/utils/connection_pool.py:90 uses except BaseException:. This is the pattern that Cartesia and other plugins use via utils.ConnectionPool.
Inworld has a CUSTOM pool implementation that should follow the same pattern. The framework's pattern IS the correct precedent.
Objection 7: "Other CancelledError catches are at different abstraction levels"
Response: IRRELEVANT. The presence of correct CancelledError handling at lines 620, 804, and 1256 proves the developers know the pattern. Line 732 is an oversight.
History
This bug was introduced in commit dcc9c2f (@cshape, 2026-01-21, PR #4533). The commit added a new _ConnectionPool class with connection pooling infrastructure for high-concurrency TTS scenarios. The developer used except Exception: at line 732 to handle failures during conn.acquire_context(), but this doesn't catch asyncio.CancelledError in Python 3.8+ (where CancelledError inherits from BaseException, not Exception). The bug slipped in because the developer correctly handled CancelledError in other parts of the same commit (lines 620, 804), but missed this case in the pool's exception handler.