feat(cli): add active session detection before destructive sandbox operations by jyaunches · Pull Request #2117 · NVIDIA/NemoClaw

jyaunches · 2026-04-20T21:24:46Z

Summary

Adds a shared sandbox-session-state.ts module that detects active SSH connections to OpenShell sandboxes before destructive operations proceed. This addresses the core issue in #992: no destructive operation in the CLI checks for active SSH sessions before proceeding, leading to abrupt Broken pipe errors with no graceful warning.

Architectural Context

This is part of the broader onboard.ts decomposition effort and JS→TS migration arc (see #2113 for the type-safety direction). The fix creates reusable infrastructure that improves destroy, rebuild, status, list, and connect.

The module follows the gateway-state.ts pattern: pure typed classifiers that parse CLI output are separated from the I/O layer with dependency injection for testability.
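The separation described above can be sketched roughly as follows. This is a minimal illustration, not the merged code: the names `SandboxSession`, `getSshProcesses`, `parseSshProcesses`, and `getActiveSandboxSessions` appear in this PR, but the `SessionDeps` interface name, the function bodies, and the meaning given to `detected` (here: the probe produced output) are assumptions.

```typescript
// Sketch only: bodies are illustrative assumptions, not the merged implementation.

interface SandboxSession {
  sandboxName: string;
  pid: number;
  sshHost: string;
}

// Hypothetical deps interface: all I/O is reached through injected functions.
interface SessionDeps {
  /** Returns raw process-list text, or null if the probe is unavailable. */
  getSshProcesses: () => string | null;
}

/** Pure classifier: text in, typed sessions out. Performs no I/O. */
function parseSshProcesses(output: string, sandboxName: string): SandboxSession[] {
  const sshHost = `openshell-${sandboxName}`;
  const sessions: SandboxSession[] = [];
  for (const line of output.split("\n")) {
    const match = /^\s*(\d+)\s+(.*)$/.exec(line);
    if (!match) continue;
    // Compare whole argv tokens so a longer sandbox name does not match by prefix.
    const argv = match[2].trim().split(/\s+/);
    if (argv.includes(sshHost)) {
      sessions.push({ sandboxName, pid: Number(match[1]), sshHost });
    }
  }
  return sessions;
}

/** I/O layer: obtains text through the injected deps, then delegates to the classifier. */
function getActiveSandboxSessions(sandboxName: string, deps: SessionDeps) {
  const output = deps.getSshProcesses();
  if (output === null) {
    return { detected: false, sessions: [] as SandboxSession[], sessionCount: 0 };
  }
  const sessions = parseSshProcesses(output, sandboxName);
  return { detected: true, sessions, sessionCount: sessions.length };
}

// A test can inject canned output instead of spawning a real process.
const fakeDeps: SessionDeps = {
  getSshProcesses: () => "  101 ssh openshell-dev\n  202 ssh openshell-dev-staging\n",
};
const result = getActiveSandboxSessions("dev", fakeDeps);
```

Because the classifier never touches the system, unit tests only need string fixtures, and the real command runner can be swapped in at the call site.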

Changes

New module: src/lib/sandbox-session-state.ts — ESM TypeScript, no @ts-nocheck, explicit types, dependency-injected I/O
- parseForwardList() — parses openshell forward list output
- parseSshProcesses() — parses pgrep -a ssh output
- classifySessionState() — combines both sources
- getActiveSandboxSessions() — high-level entry point
- createSystemDeps() — factory for real system deps
src/nemoclaw.ts — wires session detection into 6 consumers:
- sandboxDestroy(): ⚠️ "Active SSH session detected. Destroy anyway?"
- sandboxRebuild(): ⚠️ Same warning pattern
- cleanupGatewayAfterLastSandbox(): Note about active port forwards
- sandboxConnect(): "Note: existing session detected (another terminal)"
- sandboxStatus(): Shows "Connected: yes (N sessions)" / "Connected: no"
- nemoclaw list: ● indicator next to connected sandboxes
src/lib/inventory-commands.ts — adds optional getActiveSessionCount to ListSandboxesCommandDeps

Type of Change

Code change (feature, bug fix, or refactor)

Test plan

27 unit tests in src/lib/sandbox-session-state.test.ts covering all pure classifiers and the integration function
npx vitest run --project cli passes (existing tests unaffected)
npx tsc -p tsconfig.src.json --noEmit clean
All guards wrapped in try/catch — if detection fails, commands behave identically to before (zero regression risk)
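The non-blocking guard mentioned in the test plan might look something like the sketch below. The helper name `describeActiveSessions`, the `SessionProbe` type, and the exact message text are hypothetical; only the idea that a failed detection falls back to the previous behavior comes from the PR.

```typescript
// Hypothetical probe signature: anything that can report a session count.
type SessionProbe = (sandboxName: string) => { sessionCount: number };

/**
 * Returns a warning line when sessions are found, or null when there is
 * nothing to warn about or detection itself failed.
 */
function describeActiveSessions(sandboxName: string, probe: SessionProbe): string | null {
  try {
    const { sessionCount } = probe(sandboxName);
    if (sessionCount === 0) return null;
    return `Active SSH session detected (${sessionCount}) for ${sandboxName}. Destroy anyway?`;
  } catch {
    // Detection is best-effort: swallow the error so the command proceeds as before.
    return null;
  }
}

const warned = describeActiveSessions("dev", () => ({ sessionCount: 2 }));
const failed = describeActiveSessions("dev", () => {
  throw new Error("ps not found");
});
```

A caller would prompt for confirmation only when the helper returns a string, so a broken probe never blocks the destructive command.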

Verification

npm test passes (pre-existing failure in install-preflight.test.ts unrelated)
No secrets, API keys, or credentials committed
Tests added or updated for new or changed behavior ✅ (27 new tests)

AI Disclosure

AI-assisted — tool: pi coding agent

Closes #992

Summary by CodeRabbit

New Features
- Sandbox list shows a connection indicator when active SSH sessions are detected.
- Status command reports active session counts per sandbox.
- Connect alerts if a sandbox already has active sessions.
- Destroy and rebuild prompts warn about active sessions and require confirmation when present.
Tests
- Added comprehensive tests for session detection, parsing, and classification.

…erations

Adds a shared sandbox-session-state.ts module that detects active SSH connections to OpenShell sandboxes via pgrep and openshell forward list.

Consumers:
- sandboxDestroy(): warns before destroying sandbox with active sessions
- sandboxRebuild(): warns before rebuilding sandbox with active sessions
- sandboxConnect(): notes if already connected in another terminal
- sandboxStatus(): shows Connected: yes/no with session count
- nemoclaw list: shows ● indicator next to connected sandboxes
- cleanupGatewayAfterLastSandbox(): notes active port forwards

The module follows the gateway-state.ts pattern: pure typed classifiers separated from the I/O layer with dependency injection for testability. All guards are non-blocking (try/catch with graceful fallback).

Closes #992

coderabbitai · 2026-04-20T21:24:58Z

📝 Walkthrough

Walkthrough

Adds sandbox SSH-session detection: parses OpenShell forward lists and system SSH processes, classifies active sessions, and integrates detection into listing, status, connect, destroy, and rebuild flows to show indicators and warn before destructive operations.

Changes

Cohort / File(s)	Summary
Session detection module & tests `src/lib/sandbox-session-state.ts`, `src/lib/sandbox-session-state.test.ts`	New module that parses `openshell forward list` and `ps`/`pgrep` SSH output into `ForwardEntry` and `SandboxSession`, classifies session state, exposes `getActiveSandboxSessions` and `createSystemDeps`; comprehensive unit tests added.
Inventory/listing integration `src/lib/inventory-commands.ts`	Extended `ListSandboxesCommandDeps` with optional `getActiveSessionCount` and updated listing output to append a connection marker (`●`) when per-sandbox active session count > 0.
CLI command integration `src/nemoclaw.ts`	Wired session detection into `listSandboxes`, `sandboxConnect`, `sandboxStatus`, `sandboxDestroy`, and `sandboxRebuild`: caches probe output, derives per-sandbox counts, shows warnings/indicators, and makes detection non-fatal on errors.
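As a rough illustration of the listing change summarized in the table, the sketch below appends the `●` marker when an optional count callback reports at least one session. The interface name `ListSandboxesDeps`, the helper `formatSandboxRow`, and the callback's return type are assumptions; the PR only states that `getActiveSessionCount` is an optional dep.

```typescript
// Hypothetical shape: only the optional `getActiveSessionCount` dep comes from the PR summary.
interface ListSandboxesDeps {
  getActiveSessionCount?: (sandboxName: string) => number | null;
}

/** Formats one row of the list, adding a marker for sandboxes with active sessions. */
function formatSandboxRow(name: string, deps: ListSandboxesDeps): string {
  let count = 0;
  try {
    // The dep is optional, so a missing callback simply means "no indicator".
    count = deps.getActiveSessionCount?.(name) ?? 0;
  } catch {
    count = 0;
  }
  return count > 0 ? `${name} ●` : name;
}

const rows = ["dev", "qa"].map((name) =>
  formatSandboxRow(name, { getActiveSessionCount: (n) => (n === "dev" ? 2 : 0) }),
);
```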

Sequence Diagram

sequenceDiagram
    participant User as User/CLI
    participant CLI as nemoclaw
    participant SessionDetection as Session Detection
    participant OpenShell as OpenShell CLI
    participant System as System (ps/pgrep)

    User->>CLI: Run command (list/status/connect/destroy/rebuild)
    CLI->>SessionDetection: getActiveSandboxSessions(sandboxName) / probe once for list
    SessionDetection->>OpenShell: spawnSync("openshell forward list")
    OpenShell-->>SessionDetection: forward list output
    SessionDetection->>System: spawnSync("ps -axo pid,command") or pgrep
    System-->>SessionDetection: SSH processes output
    SessionDetection->>SessionDetection: parseForwardList()<br/>parseSshProcesses()<br/>classifySessionState()
    SessionDetection-->>CLI: { detected, sessions, sessionCount }
    alt Active Sessions Detected
        CLI->>User: Show indicator/warning and require confirmation for destructive ops
    else No Active Sessions
        CLI->>User: Proceed normally
    end

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Poem

🐰 I sniff the logs and lists with care,
I count the hops and tails of air.
Forwards, SSH, I tally right—
A tiny dot to show the light.
Beware before you smash the lair!

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 68.75% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (4 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The PR title accurately summarizes the primary change: adding active session detection before destructive sandbox operations, which aligns with the main objectives and code changes.
Linked Issues check	✅ Passed	The PR fully addresses issue `#992` by implementing session detection logic to warn users before destructive operations (destroy/rebuild) and provide session information in status commands.
Out of Scope Changes check	✅ Passed	All changes are directly scoped to the linked issue objectives: session-state detection module, inventory-command extension, and nemoclaw.ts consumer integration for warnings and indicators.


coderabbitai

Actionable comments posted: 3

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/lib/sandbox-session-state.ts`:
- Around line 113-117: The current check uses cmdline.includes(sshHost) which
matches substrings and can false-positively tie "openshell-my-sandbox-extended"
to "my-sandbox"; change the match to check argv tokens instead: parse the
command line string (cmdline) into argument tokens and test whether any token
=== sshHost (e.g., tokens.some(t => t === sshHost)) before pushing the session
into sessions (referencing cmdline, sshHost, sessions, sandboxName, pid). Ensure
splitting handles typical whitespace so only exact argument matches are
accepted.
- Around line 201-214: The code currently sets detected: true whenever
forwardOutput exists even if getSshProcesses() returned null; change the logic
so detection only succeeds when the SSH probe result (pgrepOutput from
getSshProcesses()) is available. Specifically, update the block around
forwardOutput and pgrepOutput to return {detected: false, sessions: []} if
pgrepOutput === null (even if forwardOutput is non-null), and only call
parseSshProcesses(pgrepOutput, sandboxName) and return detected: true when
pgrepOutput is non-null and parsing yields sessions.

In `@src/nemoclaw.ts`:
- Around line 1171-1187: The getActiveSessionCount callback currently shells out
via getActiveSandboxSessions for every sandbox row causing repeated probes;
instead, when sessionDeps is truthy (after resolveOpenshell/createSessionDeps),
run the probe once and cache its results (e.g., compute a map from sandbox name
to detected/session count by calling getActiveSandboxSessions over the full
inventory or by invoking the underlying openshell/pgrep probe once) and have
getActiveSessionCount simply return values from that cache; update the code
around resolveOpenshell, createSessionDeps, sessionDeps, and the
getActiveSessionCount passed into listSandboxesCommand to build and reuse the
cached probe results (and keep the existing try/catch behavior when reading the
cache).


ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro Plus

Run ID: fc67d604-0fc4-4f60-9a95-a3cae24f989c

📥 Commits

Reviewing files that changed from the base of the PR and between a7dca9c and e8cc275.

📒 Files selected for processing (4)

src/lib/inventory-commands.ts
src/lib/sandbox-session-state.test.ts
src/lib/sandbox-session-state.ts
src/nemoclaw.ts

- Remove unused getForwardList() call from getActiveSandboxSessions; only pgrep/ps is needed for SSH session detection (warning #1)
- Consolidate double-prompt in sandboxDestroy into single enriched confirmation prompt (warning #2)
- Remove noisy cleanupGatewayAfterLastSandbox forward check that would always fire due to dashboard forward (warning #3)
- Use word-boundary regex in parseSshProcesses to prevent false positives when sandbox names share prefixes (warning #4)
- Export SessionClassification as named interface (suggestion #1)
- Use cross-platform ps -axo instead of Linux-only pgrep -a for macOS compatibility (suggestion #2)
- Add forwardCount to SessionClassification for future consumers
- Add tests for word-boundary matching edge cases

Address CodeRabbit review: getActiveSessionCount was calling getActiveSandboxSessions (which spawns ps) once per sandbox row. Now the SSH process probe runs once and results are reused for all sandboxes via parseSshProcesses against cached output.
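The probe-once approach described in this commit message could be sketched as a small memoizing wrapper. The names `makeSessionCounter` and `SessionCountParser` are hypothetical, and the counting lambda is a stand-in for the real `parseSshProcesses`.

```typescript
// Hypothetical parser signature: counts sessions for one sandbox in cached probe output.
type SessionCountParser = (output: string, sandboxName: string) => number;

/** Wraps a probe so it runs at most once; later lookups reuse the cached text. */
function makeSessionCounter(probe: () => string | null, countSessions: SessionCountParser) {
  let cached: string | null | undefined; // undefined means "not probed yet"
  return (sandboxName: string): number => {
    if (cached === undefined) cached = probe();
    return cached === null ? 0 : countSessions(cached, sandboxName);
  };
}

let probeCalls = 0;
const counter = makeSessionCounter(
  () => {
    probeCalls += 1;
    return "101 ssh openshell-dev\n202 ssh openshell-qa\n303 ssh openshell-qa";
  },
  (output, name) =>
    output.split("\n").filter((line) => line.trim().split(/\s+/).includes(`openshell-${name}`))
      .length,
);

// Three sandbox rows, one probe.
const counts = ["dev", "qa", "idle"].map((name) => counter(name));
```

With this shape the cost of listing stays at one process spawn regardless of how many sandboxes are shown.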

coderabbitai

Actionable comments posted: 1

♻️ Duplicate comments (1)

src/lib/sandbox-session-state.ts (1)

104-121: ⚠️ Potential issue | 🟠 Major

Handle user@openshell-<sandbox> targets in SSH process matching.

The current host matcher misses valid SSH invocations like ssh user@openshell-my-sandbox, so active sessions can be undercounted and destructive warnings may be skipped.

Suggested fix

-  const sshHost = `openshell-${sandboxName}`;
-  // Match sshHost as a complete word — preceded by whitespace/start and followed
-  // by whitespace/end. This prevents `openshell-dev` from matching inside
-  // `openshell-dev-staging`.
-  const hostPattern = new RegExp(`(?:^|\\s)${escapeRegExp(sshHost)}(?:\\s|$)`);
+  const sshHost = `openshell-${sandboxName}`;
   const sessions: SandboxSession[] = [];
   const lines = pgrepOutput.split("\n").filter(Boolean);
@@
-    const cmdline = pidMatch[2];
-
-    if (hostPattern.test(cmdline)) {
+    const cmdline = pidMatch[2];
+    const argv = cmdline.trim().split(/\s+/);
+    const matchesHost = argv.some(
+      (token) => token === sshHost || token.endsWith(`@${sshHost}`),
+    );
+    if (matchesHost) {
       sessions.push({ sandboxName, pid, sshHost });
     }
   }
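The token-based matching in the suggested diff can be exercised on its own. The sketch below lifts the same comparison into a standalone helper; `matchesSandboxHost` is a hypothetical name used only for illustration.

```typescript
/** True when any argv token is the sandbox SSH host, with or without a `user@` prefix. */
function matchesSandboxHost(cmdline: string, sandboxName: string): boolean {
  const sshHost = `openshell-${sandboxName}`;
  const argv = cmdline.trim().split(/\s+/);
  return argv.some((token) => token === sshHost || token.endsWith(`@${sshHost}`));
}

const plain = matchesSandboxHost("ssh openshell-my-sandbox", "my-sandbox");
const withUser = matchesSandboxHost("ssh user@openshell-my-sandbox", "my-sandbox");
const longerName = matchesSandboxHost("ssh openshell-my-sandbox-extended", "my-sandbox");
```

Comparing whole tokens keeps the prefix-collision protection of the earlier regex while also accepting the `user@host` form.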

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@src/lib/sandbox-session-state.ts` around lines 104 - 121, The hostPattern
used to detect SSH processes misses targets of the form
user@openshell-<sandbox>; update the regex construction around sshHost (where
hostPattern and escapeRegExp are created) to allow an optional user@ prefix (and
optionally SSH URI prefixes like ssh://) when testing cmdline, so that lines
like "ssh user@openshell-my-sandbox" are matched and sessions.push({
sandboxName, pid, sshHost }) runs for those processes; adjust the RegExp used in
hostPattern accordingly and keep the rest of the pid parsing (pidMatch, pid,
cmdline) intact.

🧹 Nitpick comments (1)

src/lib/sandbox-session-state.test.ts (1)

70-131: Add a regression test for ssh user@openshell-<sandbox> format.

Please add a parser case for user@host targets so session detection stays correct for common SSH syntax and future refactors.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@src/lib/sandbox-session-state.test.ts` around lines 70 - 131, The tests lack
a case for SSH targets using the user@host form (e.g. "ssh
user@openshell-<sandbox>"); add a unit test in sandbox-session-state.test.ts
calling parseSshProcesses with output containing a line like "PID ssh
user@openshell-my-sandbox" and asserting it returns a session with pid and
sshHost "openshell-my-sandbox". If parseSshProcesses’ current regex doesn’t
handle the "user@" prefix, update its matching logic (in the parseSshProcesses
function) to allow an optional "user@" before the host (e.g. match optional \S+@
and then the openshell-<sandbox> host or split on '@' to extract the host) so
the parser correctly detects user@host targets.
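The regex alternative mentioned above (an optional `\S+@` before the host) might look like the following sketch. `buildHostPattern` is a hypothetical helper; `escapeRegExp` is referenced in the review, but its body here is an assumed standard implementation.

```typescript
/** Escapes regex metacharacters so the host name is matched literally. */
function escapeRegExp(text: string): string {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

/** Builds a pattern that accepts `openshell-<sandbox>` with an optional `user@` prefix. */
function buildHostPattern(sandboxName: string): RegExp {
  const sshHost = escapeRegExp(`openshell-${sandboxName}`);
  // Whole-token match: preceded by start or whitespace, followed by whitespace or end.
  return new RegExp(`(?:^|\\s)(?:\\S+@)?${sshHost}(?:\\s|$)`);
}

const pattern = buildHostPattern("my-sandbox");
const matchesUserForm = pattern.test("4242 ssh user@openshell-my-sandbox");
const matchesBareForm = pattern.test("4242 ssh openshell-my-sandbox");
const matchesLongerName = pattern.test("4242 ssh openshell-my-sandbox-extended");
```

The same three inputs would make reasonable fixtures for the regression test requested in this comment.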

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/lib/sandbox-session-state.ts`:
- Around line 204-215: The comment about detecting active SSH sessions
references `pgrep -a ssh` but the implementation uses `ps -axo pid,command`;
update the doc comment in the `getSshProcesses`/session detection block (the
high-level function that "Detect active SSH sessions for a named sandbox") to
accurately describe that we parse `ps -axo pid,command` output (and the matching
logic against the sandbox SSH host) instead of `pgrep`, so future readers and
maintainers see the correct system command and parsing behavior.

---

Duplicate comments:
In `@src/lib/sandbox-session-state.ts`:
- Around line 104-121: The hostPattern used to detect SSH processes misses
targets of the form user@openshell-<sandbox>; update the regex construction
around sshHost (where hostPattern and escapeRegExp are created) to allow an
optional user@ prefix (and optionally SSH URI prefixes like ssh://) when testing
cmdline, so that lines like "ssh user@openshell-my-sandbox" are matched and
sessions.push({ sandboxName, pid, sshHost }) runs for those processes; adjust
the RegExp used in hostPattern accordingly and keep the rest of the pid parsing
(pidMatch, pid, cmdline) intact.

---

Nitpick comments:
In `@src/lib/sandbox-session-state.test.ts`:
- Around line 70-131: The tests lack a case for SSH targets using the user@host
form (e.g. "ssh user@openshell-<sandbox>"); add a unit test in
sandbox-session-state.test.ts calling parseSshProcesses with output containing a
line like "PID ssh user@openshell-my-sandbox" and asserting it returns a session
with pid and sshHost "openshell-my-sandbox". If parseSshProcesses’ current regex
doesn’t handle the "user@" prefix, update its matching logic (in the
parseSshProcesses function) to allow an optional "user@" before the host (e.g.
match optional \S+@ and then the openshell-<sandbox> host or split on '@' to
extract the host) so the parser correctly detects user@host targets.


ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro Plus

Run ID: 0c430f29-add2-4eb7-975b-c73ce22f5c2d

📥 Commits

Reviewing files that changed from the base of the PR and between e8cc275 and e8978ce.

📒 Files selected for processing (3)

src/lib/sandbox-session-state.test.ts
src/lib/sandbox-session-state.ts
src/nemoclaw.ts

🚧 Files skipped from review as they are similar to previous changes (1)

src/nemoclaw.ts

coderabbitai · 2026-04-20T21:43:25Z

+  /** Run `pgrep -a ssh` and return stdout. Null if unavailable. */
+  getSshProcesses: () => string | null;
+}
+
+/**
+ * Detect active SSH sessions for a named sandbox.
+ *
+ * This is the high-level entry point used by consumers (destroy, rebuild, etc.).
+ * It invokes system commands through the deps interface for testability.
+ *
+ * Detection relies on `pgrep -a ssh` to find SSH processes targeting the
+ * sandbox's SSH host. The `getForwardList` dep is not used here (forward


⚠️ Potential issue | 🟡 Minor

Update stale probe documentation (pgrep vs ps).

Comments still say pgrep -a ssh, but the implementation uses ps -axo pid,command. This mismatch can mislead future maintenance/debugging.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@src/lib/sandbox-session-state.ts` around lines 204 - 215, The comment about detecting active SSH sessions references `pgrep -a ssh` but the implementation uses `ps -axo pid,command`; update the doc comment in the `getSshProcesses`/session detection block (the high-level function that "Detect active SSH sessions for a named sandbox") to accurately describe that we parse `ps -axo pid,command` output (and the matching logic against the sandbox SSH host) instead of `pgrep`, so future readers and maintainers see the correct system command and parsing behavior.
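For reference, a probe built on `ps -axo pid,command` could be sketched as below, with the doc comment naming the command actually run. This is an assumption-laden illustration using Node's `spawnSync`; the timeout value and the error handling are not taken from the PR.

```typescript
import { spawnSync } from "node:child_process";

/** Run `ps -axo pid,command` and return stdout. Null if unavailable. */
function getSshProcesses(): string | null {
  try {
    const result = spawnSync("ps", ["-axo", "pid,command"], {
      encoding: "utf8",
      timeout: 5000, // assumed limit so a hung probe cannot stall the CLI
    });
    if (result.error || result.status !== 0) return null;
    return result.stdout;
  } catch {
    return null;
  }
}

const processList = getSshProcesses();
```

The returned text covers every process, so filtering for the sandbox SSH host is left to the parser.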

## Summary

Catches up the user-facing reference and troubleshooting docs with the CLI and policy behavior changes that landed in v0.0.21. Drafted via the `nemoclaw-contributor-update-docs` skill against commits in `v0.0.20..v0.0.21`, filtered through `docs/.docs-skip`.

## Changes

- **`docs/reference/commands.md`**
  - `nemoclaw list`: session indicator (●) for connected sandboxes (#2117).
  - `nemoclaw <name> connect`: active-session note; auto-recovery from SSH identity drift after a host reboot (#2117, #2064).
  - `nemoclaw <name> status`: three-state Inference line (`healthy` / `unreachable` / `not probed`) covering both local and remote providers; new `Connected` line (#2002, #2117).
  - `nemoclaw <name> destroy` and `rebuild`: active-session warning with second confirm; rebuild reapplies policy presets to the recreated sandbox (#2117, #2026).
  - `nemoclaw <name> policy-add` and `policy-remove`: positional preset argument and non-interactive flow via `--yes`/`--force`/`NEMOCLAW_NON_INTERACTIVE=1` (#2070).
  - `nemoclaw <name> policy-list`: registry-vs-gateway desync detection (#2089).
- **`docs/reference/troubleshooting.md`**
  - `Reconnect after a host reboot`: now reflects automatic stale `known_hosts` pruning on `connect` (#2064).
  - `Running multiple sandboxes simultaneously`: onboard's forward-port collision guard (#2086).
  - New section: `openclaw config set` or `unset` is blocked inside the sandbox (#2081).
- **`docs/network-policy/customize-network-policy.md`**: non-interactive `policy-add`/`policy-remove` form; preset preservation across rebuild (#2070, #2026).
- **`docs/inference/use-local-inference.md`**: NIM section now covers the NGC API key prompt with masked input and `docker login nvcr.io --password-stdin` behavior (#2043).
- **Generated skills regenerated** to pick up the source changes (`.agents/skills/nemoclaw-user-reference/references/{commands,troubleshooting}.md`, plus minor heading-flow deltas elsewhere).

The pre-commit `Regenerate agent skills from docs` hook ran and confirmed source ↔ generated parity. Commits skipped per `docs/.docs-skip` or no doc impact: `bbbaa0fb` (skip-features), `7cb482cb` (skip-features), `8dee23fd` (skip-terms), plus the usual CI / test / refactor / install-plumbing churn.

## Type of Change

- [ ] Code change (feature, bug fix, or refactor)
- [ ] Code change with doc updates
- [ ] Doc only (prose changes, no code sample modifications)
- [x] Doc only (includes code sample changes)

## Verification

- [x] `npx prek run --all-files` passes for the modified files (the one failing test, `test/cli.test.ts > unknown command exits 1`, also fails on `origin/main` and is unrelated to these markdown-only changes)
- [ ] `npm test` passes — skipped; same pre-existing CLI-dispatch test failure unrelated to docs
- [ ] Tests added or updated for new or changed behavior — n/a, doc-only
- [x] No secrets, API keys, or credentials committed
- [x] Docs updated for user-facing behavior changes
- [ ] `make docs` builds without warnings (doc changes only) — not run locally
- [x] Doc pages follow the [style guide](https://github.com/NVIDIA/NemoClaw/blob/main/docs/CONTRIBUTING.md) (doc changes only)
- [ ] New doc pages include SPDX header and frontmatter (new pages only) — n/a, no new pages

## AI Disclosure

- [x] AI-assisted — tool: Claude Code

---

Signed-off-by: Miyoung Choi <miyoungc@nvidia.com>

## Summary by CodeRabbit

* **New Features**
  * Multi-session SSH connections with concurrent session support.
  * Three-state inference health reporting (healthy/unreachable/not probed) across all providers.
  * Automatic SSH host key rotation detection and recovery.
  * Non-interactive policy preset management via positional arguments.
  * Session indicators in sandbox list view.
* **Bug Fixes**
  * Protected destructive operations with active-session warnings.
  * Policy presets now preserved during sandbox rebuilds.
* **Documentation**
  * NGC registry authentication requirements for container images.
  * Multi-sandbox onboarding and reconnection guidance.
  * Troubleshooting updates for port conflicts and SSH issues.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

coderabbitai Bot reviewed Apr 20, 2026

Comment thread src/lib/sandbox-session-state.ts Outdated

Comment thread src/lib/sandbox-session-state.ts Outdated

Comment thread src/nemoclaw.ts

Test User added 2 commits April 20, 2026 17:37

coderabbitai Bot reviewed Apr 20, 2026

cv approved these changes Apr 20, 2026

merge: update branch with main

55734fc

Test User and others added 2 commits April 20, 2026 19:23

merge: update branch with main

b5f339f

Merge branch 'main' into fix/992-active-session-guard

58aafcd

ericksoa merged commit 7732f5b into main Apr 21, 2026
11 checks passed

miyoungc mentioned this pull request Apr 21, 2026

docs: catch up reference and troubleshooting docs for v0.0.21 #2193

Merged

13 tasks

feat(cli): add active session detection before destructive sandbox operations #2117
ericksoa merged 6 commits into main from fix/992-active-session-guard

4 participants
