Summary
The Copilot CLI's processQueuedItems() method lacks a try-catch/try-finally wrapper. If any unhandled exception occurs during message processing (or in the post-loop code that collects background task info), session.idle is never emitted. This causes sendAndWait() in all SDKs to hang until the 60-second timeout.
This may be a contributing factor behind #558 ("TimeoutError: Timeout after XXXs waiting for session.idle"), though the exact root cause has not been confirmed through a real CLI reproduction.
Note: This issue was filed on copilot-sdk for visibility, but the vulnerabilities described are in the Copilot CLI (@github/copilot). The correct tracker for a CLI fix would be github/copilot-cli.
Confidence Level
Medium confidence. The code analysis below is based on reading the minified CLI source (@github/copilot v1.0.3-0). The vulnerabilities identified are real code patterns, but:
Identified Vulnerabilities
Vulnerability 1: Missing try-catch in processQueuedItems()
Package: @github/copilot v1.0.3-0
The de-minified processQueuedItems() looks like this:
async processQueuedItems() {
if (this.isProcessing || this.itemQueue.length === 0) return;
this.isProcessing = true;
await this.initializeMcpHost();
while (this.itemQueue.length > 0) {
let item = this.itemQueue.shift();
// ... process commands and messages ...
// For messages: await this.runAgenticLoop(...)
}
// ⚠️ ALL OF THIS IS OUTSIDE ANY TRY-CATCH:
this.isProcessing = false;
let agents = this.taskRegistry.list({ includeCompleted: false }).filter(o => o.type === "agent");
let shells = this.shellContext?.getRunningCommandsInfo() ?? [];
let backgroundTasks = (agents.length > 0 || shells.length > 0) ? { agents: ..., shells: ... } : undefined;
this.emitEphemeral("session.idle", { backgroundTasks }); // ← Never reached if anything above throws
}
Impact: If taskRegistry.list(), getRunningCommandsInfo(), or any post-loop code throws, session.idle is never emitted AND isProcessing remains true, blocking all future messages.
Likelihood: Low. These are simple accessor methods unlikely to throw in normal operation. runAgenticLoop() has its own thorough try-catch-finally and does NOT re-throw, so errors during agentic processing don't escape to this level.
Vulnerability 2: Silent JSON-RPC notification drops
The event forwarding layer catches and silently drops notification failures:
for (let c of this.connections)
c.sendNotification(SESSION_EVENT, payload).catch(err => {
debug(`Failed to send event notification: ${err}`);
});
Impact: Even if session.idle IS emitted internally by the CLI, it can be lost in transit if the JSON-RPC connection has issues. This is logged at debug level only — invisible unless debug logging is enabled.
Likelihood: Medium. This is the more plausible explanation for intermittent failures in GitHub Actions, where process lifecycle and resource pressure could cause transport-level failures.
Suggested CLI Fix
Wrap processQueuedItems() in a try-finally that guarantees session.idle emission:
async processQueuedItems() {
if (this.isProcessing || this.itemQueue.length === 0) return;
this.isProcessing = true;
try {
await this.initializeMcpHost();
while (this.itemQueue.length > 0) {
// ... existing processing logic ...
}
let backgroundTasks;
try {
let agents = this.taskRegistry.list({ includeCompleted: false }).filter(o => o.type === "agent");
let shells = this.shellContext?.getRunningCommandsInfo() ?? [];
backgroundTasks = (agents.length > 0 || shells.length > 0)
? { agents: ..., shells: ... }
: undefined;
} catch (err) {
// Log but don't block idle emission
}
this.emitEphemeral("session.idle", { backgroundTasks });
} catch (err) {
// Ensure idle is emitted even on unexpected errors
this.emitEphemeral("session.idle", {});
} finally {
this.isProcessing = false;
}
}
What We Tested (SDK-Side Mock)
We built a mock CLI server that deliberately omits session.idle after assistant.turn_end. This does not reproduce the real CLI bug — it hard-codes the symptom to validate the SDK-side mitigation:
- Happy path (session.idle sent):
sendAndWait() resolves in ~130ms
- session.idle omitted:
sendAndWait() hangs until timeout (60s default), or resolves via SDK fallback timer (~5s)
The mock is useful for testing the SDK fix but is not evidence of a specific CLI bug.
SDK-Level Mitigation
We've implemented an idle fallback timer in all 4 SDKs (Node.js, Python, Go, .NET) as defense-in-depth. The timer fires a synthetic session.idle event if assistant.turn_end arrives but session.idle doesn't follow within a configurable grace period (default 5s). This prevents indefinite hangs regardless of root cause.
Related Issues
Summary
The Copilot CLI's
processQueuedItems()method lacks a try-catch/try-finally wrapper. If any unhandled exception occurs during message processing (or in the post-loop code that collects background task info),session.idleis never emitted. This causessendAndWait()in all SDKs to hang until the 60-second timeout.This may be a contributing factor behind #558 ("TimeoutError: Timeout after XXXs waiting for session.idle"), though the exact root cause has not been confirmed through a real CLI reproduction.
Confidence Level
Medium confidence. The code analysis below is based on reading the minified CLI source (
@github/copilotv1.0.3-0). The vulnerabilities identified are real code patterns, but:Identified Vulnerabilities
Vulnerability 1: Missing try-catch in
processQueuedItems()Package:
@github/copilotv1.0.3-0The de-minified
processQueuedItems()looks like this:Impact: If
taskRegistry.list(),getRunningCommandsInfo(), or any post-loop code throws,session.idleis never emitted ANDisProcessingremainstrue, blocking all future messages.Likelihood: Low. These are simple accessor methods unlikely to throw in normal operation.
runAgenticLoop()has its own thorough try-catch-finally and does NOT re-throw, so errors during agentic processing don't escape to this level.Vulnerability 2: Silent JSON-RPC notification drops
The event forwarding layer catches and silently drops notification failures:
Impact: Even if
session.idleIS emitted internally by the CLI, it can be lost in transit if the JSON-RPC connection has issues. This is logged atdebuglevel only — invisible unless debug logging is enabled.Likelihood: Medium. This is the more plausible explanation for intermittent failures in GitHub Actions, where process lifecycle and resource pressure could cause transport-level failures.
Suggested CLI Fix
Wrap
processQueuedItems()in a try-finally that guaranteessession.idleemission:What We Tested (SDK-Side Mock)
We built a mock CLI server that deliberately omits
session.idleafterassistant.turn_end. This does not reproduce the real CLI bug — it hard-codes the symptom to validate the SDK-side mitigation:sendAndWait()resolves in ~130mssendAndWait()hangs until timeout (60s default), or resolves via SDK fallback timer (~5s)The mock is useful for testing the SDK fix but is not evidence of a specific CLI bug.
SDK-Level Mitigation
We've implemented an idle fallback timer in all 4 SDKs (Node.js, Python, Go, .NET) as defense-in-depth. The timer fires a synthetic
session.idleevent ifassistant.turn_endarrives butsession.idledoesn't follow within a configurable grace period (default 5s). This prevents indefinite hangs regardless of root cause.Related Issues