CLI: `processQueuedItems()` lacks error handling — `session.idle` may never be emitted

## Summary

The Copilot CLI's `processQueuedItems()` method lacks a try-catch/try-finally wrapper. If any unhandled exception occurs during message processing (or in the post-loop code that collects background task info), `session.idle` is never emitted. This causes `sendAndWait()` in all SDKs to hang until the 60-second timeout.

This may be a contributing factor behind #558 ("TimeoutError: Timeout after XXXs waiting for session.idle"), though the exact root cause has not been confirmed through a real CLI reproduction.

> **Note:** This issue was filed on copilot-sdk for visibility, but the vulnerabilities described are in the **Copilot CLI** (`@github/copilot`). The correct tracker for a CLI fix would be [github/copilot-cli](https://github.com/github/copilot-cli/issues).

## Confidence Level

**Medium confidence.** The code analysis below is based on reading the minified CLI source (`@github/copilot` v1.0.3-0). The vulnerabilities identified are real code patterns, but:

- We have **not triggered the bug in the real CLI** — only in a mock server that hard-codes the symptom
- We **don't know for certain** which (if any) of these vulnerabilities is the actual cause of #558
- The intermittent nature of #558 (primarily in GitHub Actions) suggests environmental factors that are difficult to reproduce locally

## Identified Vulnerabilities

### Vulnerability 1: Missing try-catch in `processQueuedItems()`

**Package**: `@github/copilot` v1.0.3-0

The de-minified `processQueuedItems()` looks like this:

```js
async processQueuedItems() {
    if (this.isProcessing || this.itemQueue.length === 0) return;

    this.isProcessing = true;
    await this.initializeMcpHost();

    while (this.itemQueue.length > 0) {
        let item = this.itemQueue.shift();
        // ... process commands and messages ...
        // For messages: await this.runAgenticLoop(...)
    }

    // ⚠️ ALL OF THIS IS OUTSIDE ANY TRY-CATCH:
    this.isProcessing = false;
    let agents = this.taskRegistry.list({ includeCompleted: false }).filter(o => o.type === "agent");
    let shells = this.shellContext?.getRunningCommandsInfo() ?? [];
    let backgroundTasks = (agents.length > 0 || shells.length > 0) ? { agents: ..., shells: ... } : undefined;
    this.emitEphemeral("session.idle", { backgroundTasks });  // ← Never reached if anything above throws
}
```

**Impact:** If `taskRegistry.list()`, `getRunningCommandsInfo()`, or any post-loop code throws, `session.idle` is never emitted AND `isProcessing` remains `true`, blocking all future messages.

**Likelihood: Low.** These are simple accessor methods unlikely to throw in normal operation. `runAgenticLoop()` has its own thorough try-catch-finally and does NOT re-throw, so errors during agentic processing don't escape to this level.

### Vulnerability 2: Silent JSON-RPC notification drops

The event forwarding layer catches and silently drops notification failures:

```js
for (let c of this.connections)
    c.sendNotification(SESSION_EVENT, payload).catch(err => {
        debug(`Failed to send event notification: ${err}`);
    });
```

**Impact:** Even if `session.idle` IS emitted internally by the CLI, it can be lost in transit if the JSON-RPC connection has issues. This is logged at `debug` level only — invisible unless debug logging is enabled.

**Likelihood: Medium.** This is the more plausible explanation for intermittent failures in GitHub Actions, where process lifecycle and resource pressure could cause transport-level failures.

### Suggested CLI Fix

Wrap `processQueuedItems()` in a try-finally that guarantees `session.idle` emission:

```js
async processQueuedItems() {
    if (this.isProcessing || this.itemQueue.length === 0) return;

    this.isProcessing = true;
    try {
        await this.initializeMcpHost();
        while (this.itemQueue.length > 0) {
            // ... existing processing logic ...
        }

        let backgroundTasks;
        try {
            let agents = this.taskRegistry.list({ includeCompleted: false }).filter(o => o.type === "agent");
            let shells = this.shellContext?.getRunningCommandsInfo() ?? [];
            backgroundTasks = (agents.length > 0 || shells.length > 0)
                ? { agents: ..., shells: ... }
                : undefined;
        } catch (err) {
            // Log but don't block idle emission
        }

        this.emitEphemeral("session.idle", { backgroundTasks });
    } catch (err) {
        // Ensure idle is emitted even on unexpected errors
        this.emitEphemeral("session.idle", {});
    } finally {
        this.isProcessing = false;
    }
}
```

## What We Tested (SDK-Side Mock)

We built a mock CLI server that deliberately omits `session.idle` after `assistant.turn_end`. This does **not** reproduce the real CLI bug — it hard-codes the symptom to validate the SDK-side mitigation:

- **Happy path** (session.idle sent): `sendAndWait()` resolves in ~130ms
- **session.idle omitted**: `sendAndWait()` hangs until timeout (60s default), or resolves via SDK fallback timer (~5s)

The mock is useful for testing the SDK fix but is **not evidence of a specific CLI bug**.

## SDK-Level Mitigation

We've implemented an idle fallback timer in all 4 SDKs (Node.js, Python, Go, .NET) as defense-in-depth. The timer fires a synthetic `session.idle` event if `assistant.turn_end` arrives but `session.idle` doesn't follow within a configurable grace period (default 5s). This prevents indefinite hangs regardless of root cause.

## Related Issues

- #558 — "TimeoutError: Timeout after XXXs waiting for session.idle" (intermittent in GitHub Actions)
- #389 — "Python: send_and_wait() always runs into timeout" (self-resolved, may or may not be related)

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

CLI: `processQueuedItems()` lacks error handling — `session.idle` may never be emitted #794

Summary

Confidence Level

Identified Vulnerabilities

Vulnerability 1: Missing try-catch in `processQueuedItems()`

Vulnerability 2: Silent JSON-RPC notification drops

Suggested CLI Fix

What We Tested (SDK-Side Mock)

SDK-Level Mitigation

Related Issues

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

CLI: processQueuedItems() lacks error handling — session.idle may never be emitted #794

Description

Summary

Confidence Level

Identified Vulnerabilities

Vulnerability 1: Missing try-catch in processQueuedItems()

Vulnerability 2: Silent JSON-RPC notification drops

Suggested CLI Fix

What We Tested (SDK-Side Mock)

SDK-Level Mitigation

Related Issues

Metadata

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

Issue actions

CLI: `processQueuedItems()` lacks error handling — `session.idle` may never be emitted #794

Vulnerability 1: Missing try-catch in `processQueuedItems()`