feat: add sandbox export/import backup commands by vasanth53 · Pull Request #161 · NVIDIA/NemoClaw

vasanth53 · 2026-03-17T09:45:40Z

Summary

Add backup and restore functionality for sandboxes
New commands: nemoclaw backups, nemoclaw <name> export, nemoclaw import

New Commands

List Backups

nemoclaw backups

Export Sandbox

# Export to default location (~/.nemoclaw/backups)
nemoclaw my-sandbox export

# Export to custom path
nemoclaw my-sandbox export /path/to/backup.json

Import Sandbox

# Import with original name
nemoclaw import /path/to/backup.json

# Import with new name
nemoclaw import /path/to/backup.json new-sandbox-name

Features

Exports sandbox config, model, provider, policies
Exports current network policy from OpenShell
Stores backups in ~/.nemoclaw/backups/
Supports importing backups with new sandbox names
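
As a rough illustration of the export/import round trip described above, here is a minimal sketch. It is not the code in `bin/lib/backup.js`: the function names, the backup file layout (`exportedAt`, `sandbox`), and the per-sandbox file name are all assumptions made for this example. Only the default directory comes from the PR description.

```javascript
const fs = require("fs");
const os = require("os");
const path = require("path");

// Default backup location as stated in the PR description.
const DEFAULT_BACKUP_DIR = path.join(os.homedir(), ".nemoclaw", "backups");

// Write a sandbox record to a JSON backup file and return the file path.
// `target` may be a directory or an explicit *.json path.
// The record fields (name, model, provider, policies) are illustrative.
function exportSandbox(sandbox, target = DEFAULT_BACKUP_DIR) {
  const file = target.endsWith(".json")
    ? target
    : path.join(target, `${sandbox.name}.json`);
  fs.mkdirSync(path.dirname(file), { recursive: true });
  const backup = { exportedAt: new Date().toISOString(), sandbox };
  fs.writeFileSync(file, JSON.stringify(backup, null, 2));
  return file;
}

// Read a backup file and return the sandbox record, optionally renamed.
function importSandbox(file, newName) {
  const { sandbox } = JSON.parse(fs.readFileSync(file, "utf8"));
  return newName ? { ...sandbox, name: newName } : sandbox;
}
```

Passing a second argument to `importSandbox` mirrors the "import with new name" form of the command; omitting it keeps the original name.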

Testing

All 52 existing tests pass
Manually verified backup commands work correctly

Summary by CodeRabbit

Release Notes

New Features
- Added sandbox backup, export, and restore functionality for easier sandbox management.
- Introduced local Ollama inference support alongside NVIDIA Cloud API.
- Added uninstall command with Docker cleanup and artifact removal.
- Implemented sandbox name validation and improved onboarding workflow.
Documentation
- Added comprehensive troubleshooting guide for common installation and runtime issues.
- Updated Quick Start with prerequisites, Docker support matrix, and troubleshooting links.
- Marked product as "early preview" starting March 16, 2026.
Bug Fixes
- Improved Docker host detection for Colima and Docker Desktop on macOS.
- Fixed stale gateway cleanup and port conflict detection during onboarding.

Add completion command supporting bash, zsh, and fish shells. Completes issue NVIDIA#155.

Add backup and restore functionality for sandboxes:
- nemoclaw backups - list all backups
- nemoclaw <name> export - export sandbox to backup file
- nemoclaw import <path> - import sandbox from backup

install.sh installed Node 24 via nvm, but scripts/install.sh (called by the openshell sub-installer) sets nvm default to Node 22. New shells picked up Node 22, but nemoclaw was linked under Node 24's npm prefix, causing "command not found" after reopening the terminal. Closes NVIDIA#162 Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

…XPERIMENTAL=1 (NVIDIA#168) Local options (NIM, vLLM, Ollama) were shown in both the host CLI and plugin CLI onboard menus even without the env var set. This contradicts the docs which removed all local model references. Now only build and ncp appear by default; local options require NEMOCLAW_EXPERIMENTAL=1. Closes NVIDIA#163 Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

…IDIA#195) setup-spark.sh chained into the deprecated setup.sh after the cgroup fix, which is the wrong path. Now it prints a success message directing the user to run `nemoclaw onboard`. Closes NVIDIA#165 Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

setup-spark.sh crashed with "no such option: --break-system-packages" because it attempted to pip install vllm — a descoped feature. The cgroup fix succeeded but users thought it failed because the script continued into the vLLM section and crashed. Closes NVIDIA#164 Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com> Co-authored-by: KJ <kejones@nvidia.com>

* fix: validate and quote sandbox name in shell commands

  Sandbox name from user prompt was interpolated unquoted into openshell commands, causing failures when the name contained spaces or unexpected input. Added name validation (alphanumeric, hyphens, underscores only) and quoted all sandbox name interpolations in shell commands.

  Closes NVIDIA#166

  Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: also quote sandbox name in bin/nemoclaw.js shell commands

  The initial commit missed 5 unquoted ${sandboxName} interpolations in bin/nemoclaw.js (sandboxConnect, sandboxStatus, sandboxLogs, sandboxDestroy). These have the same shell injection / argument parsing vulnerability as the onboard.js ones fixed in the previous commit.

  Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: auto-lowercase sandbox names and enforce RFC 1123 rules

  Sandbox creation failed with capital letters because Kubernetes requires RFC 1123 subdomains (lowercase alphanumeric + hyphens, must start/end with alphanumeric). Now the name is auto-lowercased and the prompt shows the naming rules.

  Closes NVIDIA#202

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
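
The naming rules from the last commit above (lowercase alphanumerics and hyphens, alphanumeric at both ends) can be sketched as a single regular expression. This is an illustrative sketch, not the validation code from the PR; the function name `normalizeSandboxName` is hypothetical, and the 63-character cap is the RFC 1123 label limit.

```javascript
// RFC 1123 label: lowercase alphanumerics and hyphens, alphanumeric at both
// ends, at most 63 characters in total.
const RFC1123_LABEL = /^[a-z0-9]([a-z0-9-]{0,61}[a-z0-9])?$/;

// Lowercase the input, then reject anything that is not a valid label.
// (Hypothetical helper; the real code may report errors differently.)
function normalizeSandboxName(input) {
  const name = String(input).trim().toLowerCase();
  if (!RFC1123_LABEL.test(name)) {
    throw new Error(
      `Invalid sandbox name "${name}": use lowercase letters, digits, and ` +
        "hyphens, starting and ending with a letter or digit (max 63 chars)"
    );
  }
  return name;
}
```

Lowercasing before validating means `My-Assistant` is accepted as `my-assistant`, while names with spaces, underscores, or a leading hyphen are rejected before they reach any shell command.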

…ves (NVIDIA#69) * fix: detect sandbox context in status command to prevent false negatives When running inside an active OpenShell sandbox, `openshell` host commands are unavailable, causing `openclaw nemoclaw status` to silently report "not running" and "Not configured" even though the sandbox is active and inference is working. This adds sandbox detection so the status command reports accurate context instead of false negatives. Closes NVIDIA#25 Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * test: add vitest unit tests for status command sandbox detection Adds 22 unit tests covering all status command scenarios: - Host with no openshell (blank state) - Host with running sandbox and configured inference - Host with running sandbox but no inference - Inside sandbox (core bug fix — no false negatives) - Inside sandbox with prior plugin state and rollback - Edge cases (custom sandbox name, detection via either marker dir, missing uptime) Also sets up vitest as the project test framework. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

The sandbox name was interpolated unquoted into shell commands passed to openshell policy set/get. This caused argument parsing failures when the sandbox name or prior shell state introduced unexpected tokens (e.g. 'mkdir' from DGX Spark sudo context). Extracts buildPolicySetCommand/buildPolicyGetCommand helpers that properly quote all arguments, and adds tests to verify. Fixes NVIDIA#46 Signed-off-by: Aaron Erickson <aerickson@nvidia.com> Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
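
To make the quoting idea concrete, here is a small sketch of POSIX single-quote escaping applied to every argument of a command. It is not the `buildPolicySetCommand`/`buildPolicyGetCommand` helpers from the commit: `shellQuote` and `buildCommand` are hypothetical names, and the argument order in the usage note is illustrative only.

```javascript
// POSIX single-quote escaping: wrap the argument in '...' and rewrite each
// embedded single quote as '\'' (close quote, escaped quote, reopen quote).
function shellQuote(arg) {
  return "'" + String(arg).replace(/'/g, "'\\''") + "'";
}

// Join an argv array into one shell-safe command string.
function buildCommand(argv) {
  return argv.map(shellQuote).join(" ");
}
```

For example, `buildCommand(["openshell", "policy", "get", "my sandbox"])` keeps `my sandbox` as a single argument instead of letting the shell split it into two tokens.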

@dnandakumar-nv

…or (NVIDIA#232) currentPolicy was declared const but reassigned when prepending a version field to policies missing one. This throws TypeError at runtime for sandboxes with existing policies that lack a version field. Spotted by @dnandakumar-nv in NVIDIA#49.
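
The failure mode described above is easy to reproduce in isolation. The sketch below is not the code from the PR: the function names and the `version: 1` line format are assumptions made for illustration. It shows why the bug only appears for policies that lack a version field, since that is the only path that reaches the reassignment.

```javascript
// Buggy shape: reassigning a `const` binding throws TypeError at runtime,
// but only when the branch that performs the assignment actually runs.
function prependVersionBuggy(policyText) {
  const currentPolicy = policyText;
  if (!/^version:/m.test(currentPolicy)) {
    currentPolicy = `version: 1\n${currentPolicy}`; // TypeError here
  }
  return currentPolicy;
}

// Fixed shape: declare the binding with `let` so it may be reassigned.
// The "version: 1" line is a hypothetical format for this example.
function prependVersion(policyText) {
  let currentPolicy = policyText;
  if (!/^version:/m.test(currentPolicy)) {
    currentPolicy = `version: 1\n${currentPolicy}`;
  }
  return currentPolicy;
}
```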

Fixes NVIDIA#205.

Root cause: when the installer runs via `curl | bash`, nvm sources into the subshell and updates PATH. When the subshell exits, all PATH changes are lost: the user's interactive shell still has the old PATH, so `nemoclaw` and `node` are both "command not found". Additionally, `onboard.js` called the full `scripts/install.sh` just to install openshell, which unnecessarily reinstalled Node, Docker, and nemoclaw in a nested subprocess, compounding the PATH issue.

Changes:

scripts/install.sh:
- Add ensure_nvm_loaded() and refresh_path() to update PATH after npm installs nemoclaw
- Add nvm fallback detection for $HOME/.nvm when NVM_DIR is unset
- Soften verification: warn with instructions instead of hard-failing when the binary exists but isn't on PATH (exit 0, not exit 1)
- Add post-install message telling nvm/fnm users to `source ~/.bashrc`
- Use sudo for npm install -g when NODE_MGR=nodesource

install.sh (root):
- Soften verify_nemoclaw() to return 0 when binary exists but PATH is stale (was hard error via error())
- Add post_install_message() for shell reload instructions

bin/lib/onboard.js:
- Switch installOpenshell() to use dedicated install-openshell.sh instead of the full scripts/install.sh

scripts/install-openshell.sh (new):
- Dedicated openshell-only installer with macOS + Linux support and conditional sudo

Tested: 58/58 unit tests pass, 24/24 E2E tests pass (see PR NVIDIA#226).

…A#226) Adds a test that proves the complete user journey with real infrastructure — no mocks. Sends prompts through the sandbox and verifies responses from the NVIDIA Cloud API. Phases: prerequisites → install → onboard → sandbox verification → live inference (direct API + through sandbox) → cleanup Requires: Docker, openshell, NVIDIA_API_KEY, network access. Ref: NVIDIA#205

* fix: resolve GPU sandbox creation failures on DGX machines (NVIDIA#235)

  Three bugs caused onboarding to silently fail on DGX when GPU was detected:

  1. Pipe masked sandbox creation exit code: `openshell sandbox create ... | awk` exits with awk's status (0), not openshell's, so GPU errors were ignored and NemoClaw reported success while no sandbox existed, causing "sandbox not found" in Step 7. Fixed with `set -o pipefail`.
  2. GPU device plugin race on DGX: the k3s GPU device plugin needs extra time to register GPUs with the Kubernetes scheduler after the gateway HTTP endpoint becomes healthy. The 5-second DNS sleep was not enough. Added a polling loop (up to 2 minutes) for GPU allocatability before sandbox creation.
  3. Stale port 18789 forward from a previous onboard blocked the new sandbox's dashboard port. Added explicit `openshell forward stop 18789` cleanup before the new forward.

  Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: stop passing --gpu to gateway and sandbox on DGX Spark

  The official NVIDIA DGX Spark playbooks do not use --gpu on openshell gateway start or sandbox create. On Spark, inference is routed through a host-side provider (Ollama, vLLM, or cloud API) via openshell's inference routing layer; the sandbox does not need direct GPU access. Passing --gpu causes FailedPrecondition errors because the gateway's k3s GPU device plugin cannot allocate GPUs on Spark, which cascades into sandbox creation failures and broken policy application.

  Removes the GPU allocatability poll loop (no longer needed) while preserving the pipefail and stale port forward fixes from the parent commit.

  Refs: NVIDIA#239
  See: https://build.nvidia.com/spark/nemoclaw/instructions

---------

Co-authored-by: NemoClaw Dev <dev@nemoclaw.local>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-authored-by: Aaron Erickson <aerickson@nvidia.com>

Fixes the README.md's note block to render properly in github markdown.

…VIDIA#242) * update contributing and rename doc update skill * rewrite intro * add filler sentences * add CoC, small fix in README for docs * fix rendering of note * add a note

The cgroup v2 / cgroupns=host issue has been fixed on the OpenShell side, so NemoClaw no longer needs to check for it during onboarding. Removes bin/lib/preflight.js, its tests, and the check in onboard.js.

* add PR CI workflows Signed-off-by: Jaya Venkatesh <jjayabaskar@nvidia.com> * changed npm to install Signed-off-by: Jaya Venkatesh <jjayabaskar@nvidia.com> * merged both PR checks into one Signed-off-by: Jaya Venkatesh <jjayabaskar@nvidia.com> * moved variables from run to env Signed-off-by: Jaya Venkatesh <jjayabaskar@nvidia.com> * scoping read permissions Signed-off-by: Jaya Venkatesh <jjayabaskar@nvidia.com> * split pr closer into a separate file Signed-off-by: Jaya Venkatesh <jjayabaskar@nvidia.com> * changed action to pull_request_target Signed-off-by: Jaya Venkatesh <jjayabaskar@nvidia.com> --------- Signed-off-by: Jaya Venkatesh <jjayabaskar@nvidia.com>

…#251)

* preflight: detect gateway port 18789 conflicts during onboard

  Add checkPortAvailable() to preflight checks. Uses lsof (primary) to identify the blocking process name and PID, with a Node.js net probe fallback for cross-platform EADDRINUSE detection. Integrated into onboard step 1 before gateway operations so users get a clear error instead of a dead dashboard.

  Closes NVIDIA#18

* preflight: fall through to net probe when lsof returns empty

  Non-root lsof cannot see root-owned listeners (/proc/[pid]/fd returns EACCES). Empty lsof output is not authoritative, so fall through to the net probe whose bind() checks the kernel socket table directly. Added test covering the exact scenario: lsof empty + port occupied = net probe catches EADDRINUSE.

* test: await srv.close() in finally blocks to prevent handle leakage

* preflight: also check gateway port 8080 before onboard

  Check both required ports (8080 for OpenShell gateway, 18789 for dashboard) during preflight instead of only 18789. Uses the same lsof + net-probe detection chain with per-port error messaging.
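
The net-probe half of that detection chain can be sketched with Node's built-in `net` module: try to bind the port and treat `EADDRINUSE` as "occupied". This is a minimal sketch, not the `checkPortAvailable()` from the PR; the function name `isPortAvailable`, the default host, and the decision to omit the lsof step are assumptions for this example.

```javascript
const net = require("net");

// Resolve true if the port can be bound on `host`, false on EADDRINUSE.
// Any other bind error (for example EACCES) rejects the promise.
function isPortAvailable(port, host = "127.0.0.1") {
  return new Promise((resolve, reject) => {
    const srv = net.createServer();
    srv.once("error", (err) => {
      if (err.code === "EADDRINUSE") resolve(false);
      else reject(err);
    });
    // Close the probe server before reporting success so it does not
    // hold the port (or keep the event loop alive) itself.
    srv.once("listening", () => srv.close(() => resolve(true)));
    srv.listen(port, host);
  });
}
```

Because the probe asks the kernel directly, it also catches listeners owned by other users, which is the case the second commit above says a non-root lsof can miss.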

…path (NVIDIA#222) * fix: propagate sandbox name to bridge and resolve openshell path Signed-off-by: Brian Taylor <brian.taylor818@gmail.com> * fix: validate resolveOpenshell returns absolute path Reject non-absolute paths from command -v (e.g. aliases or functions) and fall through to the explicit candidate list. Signed-off-by: Brian Taylor <brian.taylor818@gmail.com> * fix: pass default sandbox name from registry to start-services.sh start-services.sh unconditionally set SANDBOX_NAME to "default", ignoring any exported value. nemoclaw start also never passed the sandbox name from the registry. Together these caused the telegram bridge to target the wrong sandbox. - Preserve existing SANDBOX_NAME in start-services.sh before defaulting - Pass the registry's default sandbox from nemoclaw start Signed-off-by: Brian Taylor <brian.taylor818@gmail.com> * test: add resolveOpenshell and SANDBOX_NAME defaulting tests Signed-off-by: Brian Taylor <brian.taylor818@gmail.com> * fix: address CodeRabbit review comments - Remove unused fs/path imports from test file - Extract resolveOpenshell to shared module with DI for direct testing - Validate sandbox name before shell interpolation * fix: harden HOME fallback and make SANDBOX_NAME tests hermetic Signed-off-by: Brian Taylor <brian.taylor818@gmail.com> --------- Signed-off-by: Brian Taylor <brian.taylor818@gmail.com>
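
The "reject non-absolute paths, then fall through to a candidate list" rule can be sketched with injected dependencies, which is also what makes it testable without a real shell. This is not the shared module from the PR: the option names `commandV`, `exists`, and `candidates` are hypothetical, as are the paths in the usage note.

```javascript
const path = require("path");

// Pick an absolute path to the openshell binary, or null if none is found.
// `commandV` returns the raw output of `command -v openshell`; `exists`
// checks a candidate path. Both are injected so tests need no real shell.
function resolveOpenshell({ commandV, exists, candidates }) {
  const found = (commandV() || "").trim();
  // `command -v` prints a name or definition for aliases and functions,
  // which is not an absolute path, so those results are ignored.
  if (found && path.isAbsolute(found)) return found;
  for (const candidate of candidates) {
    if (exists(candidate)) return candidate;
  }
  return null;
}
```

With `commandV` returning something like `alias openshell='...'`, the function skips that result and returns the first entry of `candidates` that exists.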

* feat: add non-interactive mode for CI/CD onboarding

  Closes NVIDIA#250. Adds --non-interactive flag and NEMOCLAW_NON_INTERACTIVE=1 env var to nemoclaw onboard and install.sh. When active, all interactive prompts use environment variable overrides or sensible defaults:

  - NEMOCLAW_SANDBOX_NAME: sandbox name (default: my-assistant)
  - NEMOCLAW_PROVIDER: inference provider: cloud, ollama, vllm, nim
  - NEMOCLAW_MODEL: model override
  - NVIDIA_API_KEY: required for cloud/nim, not needed for ollama/vllm

  Non-interactive mode auto-recreates existing sandboxes and applies suggested policy presets without prompting.

  Usage:

  NEMOCLAW_NON_INTERACTIVE=1 NVIDIA_API_KEY=nvapi-... ./install.sh
  nemoclaw onboard --non-interactive
  ./install.sh --non-interactive

* fix: handle NEMOCLAW_PROVIDER before options list in non-interactive mode

  When only cloud was in the options list (no GPU, not experimental), the options.length > 1 branch was skipped entirely, so the NEMOCLAW_PROVIDER env var was never checked. Ollama/vLLM provider selection now happens before the interactive options are built.

* remove stale cgroup preflight regression
* fix non-interactive onboarding regressions
* add standalone openshell installer script
* add env-controlled non-interactive policy selection
* avoid sudo prompt for non-interactive openshell install
* avoid sudo prompt during non-interactive uninstall
* harden non-interactive onboard validation
* simplify policy retry loop

---------

Co-authored-by: Kevin Jones <kejones@nvidia.com>
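
The environment-variable contract above can be sketched as a small resolver. This is an illustration rather than the onboarding code itself: the function name `resolveNonInteractiveConfig` and the returned object shape are hypothetical, and defaulting the provider to `cloud` when `NEMOCLAW_PROVIDER` is unset is an assumption. The variable names, the allowed providers, and the `my-assistant` default come from the commit message.

```javascript
const PROVIDERS = ["cloud", "ollama", "vllm", "nim"];

// Build the onboarding choices from environment variables only.
// Throws early so a CI job fails fast instead of hanging on a prompt.
function resolveNonInteractiveConfig(env) {
  const provider = env.NEMOCLAW_PROVIDER || "cloud"; // assumed default
  if (!PROVIDERS.includes(provider)) {
    throw new Error(`Unknown NEMOCLAW_PROVIDER "${provider}"`);
  }
  // Per the commit message, only cloud and nim require an API key.
  const needsKey = provider === "cloud" || provider === "nim";
  if (needsKey && !env.NVIDIA_API_KEY) {
    throw new Error(`NVIDIA_API_KEY is required for provider "${provider}"`);
  }
  return {
    sandboxName: env.NEMOCLAW_SANDBOX_NAME || "my-assistant",
    provider,
    model: env.NEMOCLAW_MODEL || null, // null means "use the provider default"
  };
}
```

Calling it with `process.env` at the start of a non-interactive run would surface a missing key or a misspelled provider before any gateway or sandbox work begins.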

…gular place (NVIDIA#331) * doc refresh and start troubleshooting page * refresh doc and start collecting troubleshooting * openshell command fix

…DIA#340) * fix: parameterize sandbox name in setup.sh instead of hardcoding Closes NVIDIA#197. setup.sh hardcoded --name nemoclaw for sandbox creation, ignoring the custom name chosen during onboard. This caused a split-brain: the NemoClaw registry stored the custom name but OpenShell had a sandbox named "nemoclaw", so connect/policy/bridge commands failed. setup.sh now accepts the sandbox name as $1 (default: nemoclaw). The legacy `nemoclaw setup` command passes the default sandbox name from the registry. Gateway name stays hardcoded to "nemoclaw" since gateway and sandbox names are independent. * test: add regression tests for setup.sh sandbox name parameterization 7 tests verifying sandbox operations use $SANDBOX_NAME while gateway stays hardcoded, plus bash arg parsing tests. Refs: NVIDIA#197

…ction (NVIDIA#295) * centralize runtime detection and installer fallbacks * fail fast on podman for macos * tighten local inference setup paths * add macos install smoke test * fix smoke macos install cleanup trap * add runtime selection to macos smoke test * improve colima coredns upstream detection * avoid consuming stdin in colima dns probe * align smoke answers with preset prompt * close stdin for onboarding child commands * stage smoke answers through a pipe * avoid smoke log hangs from inherited stdout * allow prompt-driven installs to exit under piped stdin * silence smoke answer feeder shutdown * tighten macos runtime selection and dns patching * clarify apple silicon macos support * make curl installer self-contained * fix macos installer preflight test on linux ci * clarify runtime prerequisite wording * preserve interactive prompts after first answer * remove openshell gateway volume on uninstall * restore tty for sandbox connect * sync sandbox inference config from onboard selection * sync sandbox config via connect stdin * register ollama models with explicit provider ids * add stable nemoclaw shim for nvm installs * register ollama provider config in sandbox * route sandbox inference through managed provider * fix sandbox provider config serialization * probe container reachability for local inference * require explicit local provider selection * align provider identity across managed inference UI * make validated ollama path first-class * add curated cloud model selector * add ollama model selector * warm selected ollama model and quiet sandbox sync output * fail onboarding when selected ollama model is unhealthy * fix: respect non-interactive mode in Ollama model selection The Ollama branch in setupNim() called promptOllamaModel() directly, ignoring non-interactive mode. This caused onboard --non-interactive with NEMOCLAW_PROVIDER=ollama to hang waiting for input. 
Now checks isNonInteractive() and uses NEMOCLAW_MODEL env var or getDefaultOllamaModel() fallback instead of prompting. --------- Aaron Erickson <aerickson@nvidia.com>

* Create bug_report.yml * Update bug_report.yml

* reattach curl installer onboarding to tty * update installer tests for non-interactive tty guard * open tty explicitly for piped onboarding * run stdin installer test in non-interactive mode

* docs: add common installation troubleshooting entries Add four new entries to the Installation section of the troubleshooting guide covering issues frequently reported by new users: - Node.js version below 20 (with nvm fix) - Docker daemon not running (Linux + macOS Docker Desktop) - npm EACCES permission errors (npm-global prefix workaround) - Port 18789 already in use (lsof/kill steps) Also add a Troubleshooting link at the end of the quickstart guide so users who hit problems during onboarding can find the guide immediately. Closes NVIDIA#364 Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * docs: improve port-conflict troubleshooting step Add a note to verify the process before terminating it, and mention kill -9 as a fallback for processes that do not exit gracefully. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * docs: disable MD014 for troubleshooting doc The entire file uses $ prompts in console blocks without output, which is the established style for this doc. Disable MD014 at the file level to suppress linting warnings consistently. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> --------- Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>

…and (NVIDIA#370) Closes NVIDIA#322, closes NVIDIA#323. - Replace "requires a fresh installation of OpenClaw" with "creates a fresh OpenClaw instance inside the sandbox during onboarding" — the old note implied a manual step that doesn't exist - Remove nemoclaw deploy from Key Commands table — remote deployment is not in the current supported scope

* docs: link DGX Spark setup guide from README Closes NVIDIA#321. Spark users following the main Quick Start hit opaque errors because the Spark-specific prerequisites (cgroup v2, Docker config) are only documented in spark-install.md. Add a tip box before the install section directing Spark users to the right guide. * Update README.md --------- Co-authored-by: Miyoung Choi <miyoungc@nvidia.com>

…me truncation (NVIDIA#229) * fix: gate sandbox registration on readiness and surface creation failures The sandbox create command was piped through awk to deduplicate log lines. In bash, the exit status of a pipeline is the status of the last command (awk, always 0), so creation failures were silently swallowed. NemoClaw then registered a phantom sandbox in ~/.nemoclaw/sandboxes.json that caused "sandbox not found" on every subsequent connect/status call. This is the root cause of the WSL2 + Docker Desktop failures reported in NVIDIA#140 and NVIDIA#152 — sandbox creation fails due to Docker networking issues, but onboarding completes as if it succeeded. Three changes: 1. Remove the awk pipe so the real exit code flows through to run() 2. Poll openshell sandbox list for Ready state before registering (matches the gateway health check pattern at lines 121-132) 3. Move build-context cleanup before the exit-code check so temp files are always cleaned up, even on failure Signed-off-by: Brian Taylor <brian.taylor818@gmail.com> * fix: use word-boundary match for Ready status and fix timeout includes("Ready") falsely matched "NotReady" because "Ready" is a substring. Use a word-boundary regex with a NotReady exclusion so sandboxes stuck in error states are not registered as healthy. Also remove the off-by-one break at i=29 so the loop sleeps the full 60s before timing out. Signed-off-by: Brian Taylor <brian.taylor818@gmail.com> * fix: exact-match sandbox name in readiness check Use column-based matching (split on whitespace, check cols[0]) instead of substring includes(). Prevents false positives when one sandbox name is a prefix of another (e.g. "my" matching "my-assistant"). 
Signed-off-by: Brian Taylor <brian.taylor818@gmail.com> * test: add readiness gate parsing tests for sandbox creation Signed-off-by: Brian Taylor <brian.taylor818@gmail.com> * fix: guard against truncated sandbox names on WSL (fixes NVIDIA#21) On WSL, hyphenated sandbox names like "my-assistant" can be truncated to "m" during shell argument parsing, causing "sandbox not found" failures when applying policy presets. - Add RFC 1123 validation in applyPreset() to catch truncated names early with a clear error message - Quote sandbox name in error output (was unquoted on line 356) - Add 6 WSL-specific regression tests covering hyphenated names, multi-hyphen names, truncation detection, and command quoting * fix: clean up orphaned sandbox on readiness timeout When the sandbox is created but never reaches Ready within 60s, delete it before exiting so the next onboard retry with the same name doesn't fail on "sandbox already exists". * chore: remove issue references from code comments * fix: enforce RFC 1123 63-char limit in sandbox name validation * fix: extract isSandboxReady as shared function and branch cleanup messaging - Extract readiness parser from inline code to exported isSandboxReady() so tests validate the production code, not a duplicated copy - Branch cleanup messaging on delete result — report manual cleanup command if the orphan delete fails * fix: clean up stale gateway and port forward before preflight check A previous onboard session may leave the OpenShell gateway container and port forward running, causing port 8080/18789 conflicts on the next invocation. Detect a NemoClaw-owned gateway before the port availability check and tear it down automatically. Closes NVIDIA#397 * test: add double-onboard e2e test for stale state recovery Runs `nemoclaw onboard` three times without cleanup between runs, verifying that each subsequent onboard recovers automatically from stale gateway, port forward, and registry entries left behind. 
Regression test for NVIDIA#397. --------- Signed-off-by: Brian Taylor <brian.taylor818@gmail.com>
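
The readiness-gate parsing described in these commits (exact match on the first column, word-boundary match on `Ready`, explicit `NotReady` exclusion) can be sketched as follows. This is not the exported `isSandboxReady()` from the PR: the body is a reconstruction from the commit text, and the assumption that `openshell sandbox list` prints whitespace-separated columns with the name first is inferred from the "check cols[0]" wording.

```javascript
// Return true only if a row whose first column equals `name` exactly
// reports Ready. `listOutput` is the text printed by the list command.
function isSandboxReady(listOutput, name) {
  return listOutput.split("\n").some((line) => {
    const cols = line.trim().split(/\s+/);
    // Exact match on the name column avoids "my" matching "my-assistant".
    if (cols[0] !== name) return false;
    const rest = cols.slice(1).join(" ");
    // Word-boundary match plus an explicit NotReady exclusion, so a
    // sandbox stuck in NotReady is never treated as healthy.
    return /\bReady\b/.test(rest) && !/\bNotReady\b/.test(rest);
  });
}
```

Keeping the parser as a standalone function lets the tests exercise the same code path that onboarding uses, which is the point of the "extract isSandboxReady" commit.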

…EADME to docs workflow (NVIDIA#392) * fix link and add README to docs workflow * workaround for rendering correctly in both raw md and myst

* add coderabbit instructions for docs * coderabbit suggestion

* add discord channel to docs * shorter preview text * add discord link to more places * resolve warning * use invite link * update url * nit

* add uninstall section to quickstart * add link to the script * remove $ from code blocks and explain which terminal in text

* ci: add nightly full E2E with install.sh, policy enforcement, and CLI tests * fix: pass GITHUB_TOKEN for openshell install via gh CLI * fix: avoid tee pipe hang from openshell background port-forward * fix: use SSH for sandbox command execution test * refactor: remove OpenShell-tested checks from NemoClaw E2E * fixed section header Signed-off-by: Jaya Venkatesh <jjayabaskar@nvidia.com> * ci: skip nightly E2E on forks without secrets * fix: fail-fast on cd and include status output in failure message * fix: validate command exit status before assertions and use fixed-string grep --------- Signed-off-by: Jaya Venkatesh <jjayabaskar@nvidia.com>

* fix uninstall script stdin entrypoint guard * fix uninstall entrypoint guard when sourced from bash * fix uninstall cleanup order and tty prompt * docs add curl uninstall command to readme * feat add nemoclaw uninstall wrapper * fix uninstall docs pin and cli exit handling * docs restore main uninstall url in readme * docs remove shell prompts from readme commands * docs match uninstall code fence style

* update messaging * bring the intro sentence back

…) (NVIDIA#436) `openshell sandbox logs` does not exist; the correct command is `openshell logs`. Also fixes the follow flag from --follow to --tail to match the openshell CLI. Signed-off-by: Paritosh Dixit <paritoshd@nvidia.com>

* docs: remove duplicated content in README Rephrase note about using CLI for long responses. * rm link

* add pr template * rm redundant item * improve

* improve * add issue templates and improve pr template

wscurran · 2026-03-19T22:55:07Z

I've taken a look at the changes and see that this PR introduces new commands for listing backups, exporting sandboxes, and importing them, which should improve the management and portability of sandboxes.

…dbox-backup # Conflicts: # bin/nemoclaw.js

coderabbitai · 2026-03-20T04:41:21Z

Caution

Review failed

Pull request was closed or merged during review

Walkthrough

A comprehensive update introducing multiple new subsystems: GitHub issue templates and workflows for documentation/E2E/PR limits, new CLI commands (backup/restore, uninstall, completion), modularized platform/runtime detection utilities, local inference provider support (Ollama, vLLM), non-interactive onboarding mode, sandbox context detection in status reporting, and extensive test coverage. Documentation updated to reflect early preview status and includes new troubleshooting guide.

Changes

| Cohort / File(s) | Summary |
|---|---|
| **GitHub Configuration**<br>`.github/ISSUE_TEMPLATE/*`, `.github/PULL_REQUEST_TEMPLATE.md`, `.coderabbit.yaml` | Added issue templates (bug report, documentation, feature request), PR template, and CodeRabbit review configuration with documentation-specific editorial rules and Markdown style enforcement. |
| **GitHub Workflows**<br>`.github/workflows/docs.yaml`, `.github/workflows/nightly-e2e.yaml`, `.github/workflows/pr-limit.yaml`, `.github/workflows/pr.yaml` | New CI/CD pipelines: documentation build with strict warnings, nightly E2E tests with artifact uploads, PR author limit enforcement, and PR-triggered unit/E2E sandbox tests. |
| **Platform & Runtime Detection**<br>`bin/lib/platform.js`, `scripts/lib/runtime.sh` | New modules providing WSL detection, container runtime inference, Docker socket discovery (Colima/Docker Desktop), macOS-specific runtime validation, and local provider health checking. |
| **Inference Configuration**<br>`bin/lib/inference-config.js`, `bin/lib/local-inference.js` | New inference routing configuration with support for local providers (Ollama, vLLM) alongside NVIDIA cloud API; includes model discovery, health validation, and provider-specific endpoint resolution. |
| **Backup & Restore**<br>`bin/lib/backup.js`, `uninstall.sh` | New sandbox backup/restore functionality with JSON metadata persistence, import/export support, and cleanup of related Docker volumes during uninstall. |
| **Credentials & Utilities**<br>`bin/lib/credentials.js`, `bin/lib/resolve-openshell.js`, `bin/lib/runner.js`, `bin/lib/policies.js`, `bin/lib/preflight.js` | Enhanced credential prompting with non-TTY support, openshell executable resolution with fallbacks, new `runInteractive` helper for user-facing commands, policy command builders with sandbox name validation, and port availability checking replacing cgroup validation. |
| **CLI Commands & Onboarding**<br>`bin/nemoclaw.js`, `bin/lib/onboard.js`, `nemoclaw/src/commands/onboard.ts`, `nemoclaw/src/onboard/config.ts` | Extended CLI with uninstall, backup/import, completion commands; added non-interactive onboarding mode with env-var driven configuration; local Ollama provider selection in TypeScript; new helper functions for describing endpoint/provider metadata. |
| **Status & Core App**<br>`nemoclaw/src/commands/status.ts`, `nemoclaw/src/index.ts`, `nemoclaw/src/commands/slash.ts` | Added sandbox context detection (distinguishing inside vs. outside sandbox), inference status reporting improvements, and provider registration refactoring with managed inference routes. |
| **Installation & Setup**<br>`install.sh`, `scripts/install.sh`, `scripts/install-openshell.sh`, `scripts/setup.sh`, `scripts/setup-spark.sh`, `scripts/start-services.sh` | Major refactoring: Git-based installation with local shim creation, runtime detection integration, non-interactive mode support, port availability enforcement, Docker host detection, sandbox name parameterization, and simplified DGX Spark setup flow. |
| **Service Scripts**<br>`scripts/nemoclaw-start.sh`, `scripts/telegram-bridge.js`, `scripts/fix-coredns.sh` | Updated model selection via env var, openshell executable resolution in Telegram bridge, and CoreDNS patching using runtime-aware upstream DNS resolution. |
| **Documentation**<br>`README.md`, `CONTRIBUTING.md`, `CODE_OF_CONDUCT.md`, `docs/*`, `.agents/skills/update-docs/SKILL.md`, `docs/conf.py` | Updated product messaging to "early preview" (March 16, 2026), expanded quick-start with hardware requirements and container runtime matrix, added uninstall/troubleshooting sections, updated contributing guide with conventional commits and structured development workflow, added new troubleshooting reference page, and Discord community link. |
| **Test Suite - Infrastructure**<br>`test/install-preflight.test.js`, `test/onboard-readiness.test.js`, `test/runtime-shell.test.js`, `test/setup-sandbox-name.test.js`, `test/smoke-macos-install.test.js` | Comprehensive bash script testing: installer preflight checks, sandbox readiness/policy command validation, runtime helper functions, shell integration, and macOS smoke install flow. |
| **Test Suite - Node/CLI**<br>`test/cli.test.js`, `test/credentials.test.js`, `test/onboard-selection.test.js`, `test/onboard.test.js`, `test/platform.test.js`, `test/policies.test.js`, `test/preflight.test.js`, `test/runner.test.js`, `test/service-env.test.js`, `test/uninstall.test.js`, `nemoclaw/src/commands/status.test.ts`, `nemoclaw/package.json` | Unit tests for new modules (platform detection, local inference, onboard sync scripts) and CLI behaviors (status sandbox detection, onboard provider selection, credential prompts, runner stdio handling); added Vitest configuration. |
| **Test Suite - E2E**<br>`test/e2e/Dockerfile.full-e2e`, `test/e2e/test-full-e2e.sh`, `test/e2e/test-double-onboard.sh` | E2E test infrastructure: Docker image for containerized full tests, comprehensive inference journey validation, and double-onboard regression testing for stale-state recovery. |
| **Configuration & Metadata**<br>`nemoclaw/tsconfig.json`, `nemoclaw/vitest.config.ts`, `Makefile` | Excluded test files from TypeScript compilation, added Vitest configuration, and introduced `docs-strict` Makefile target. |

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~75 minutes

Possibly related PRs

docs: refresh doc and start collecting troubleshooting items in a singular place #331: Updates overlapping documentation pages (troubleshooting guide, monitoring, and reference sections) with closely related content and test infrastructure.

Suggested reviewers

jacobtomlinson
ericksoa

Poem

🐰 From bugs to backups, the warren grows wide,
With inference routes and platforms as guide,
Non-interactive dreams and sandboxes clean,
Tests everywhere testing what's never been seen.
Polish and structure, a thorough complete,
NemoClaw's rebirth, so tidy and neat! 🎉


NVIDIA#158)

* feat(proxy): support plain HTTP forward proxy for private IP endpoints

Add forward proxy mode to the sandbox proxy so that standard HTTP libraries (httpx, requests, etc.) work with HTTP_PROXY for plain HTTP calls to private IP endpoints. Previously, non-CONNECT methods were unconditionally rejected with 403.

The forward proxy path requires all three conditions to be met:
- OPA policy explicitly allows the destination
- The matched endpoint has allowed_ips configured
- All resolved IPs are RFC 1918 private

This ensures plain HTTP never reaches the public internet while enabling seamless access to internal services without custom CONNECT tunnel code.

Implementation:
- parse_proxy_uri(): parses absolute-form URIs into components
- rewrite_forward_request(): rewrites to origin-form, strips hop-by-hop headers, adds Via and Connection: close
- handle_forward_proxy(): full handler with OPA eval, SSRF checks, private-IP gate, upstream connect, and bidirectional relay
- Updated dispatch in handle_tcp_connection to route non-CONNECT methods

Includes 14 unit tests and 6 E2E tests (FWD-1 through FWD-6). CONNECT path remains completely untouched.

Closes NVIDIA#155

* fix(proxy): remove InspectForInference match arm removed by NVIDIA#146

The inference routing simplification in NVIDIA#146 reduced NetworkAction to Allow/Deny, removing InspectForInference. Drop the dead match arm from handle_forward_proxy.

* fix(sandbox): restore BestEffort as default Landlock compatibility

The Landlock V2 upgrade in NVIDIA#151 changed the default from BestEffort to HardRequirement. This causes all proxy-mode sandboxes to crash with Permission denied when the policy omits the landlock field, because the child process gets locked to only /etc/navigator-tls and /sandbox. Restore BestEffort as the default so policies without an explicit landlock field degrade gracefully.

Fixes NVIDIA#161

* fix(sandbox): inject baseline filesystem paths for proxy-mode sandboxes

Proxy-mode sandboxes need baseline filesystem paths (/usr, /lib, /etc, /app, /var/log read-only; /sandbox, /tmp read-write) for the child process to function under Landlock. Without these, the child can't exec binaries, resolve DNS, or load shared libraries.

The supervisor now enriches the policy with these baseline paths at startup, covering both standalone (file) and gateway (gRPC) modes. For gateway mode, the enriched policy is synced back so users see the effective policy via 'nemoclaw sandbox get'.

The gateway validation is relaxed to allow additive filesystem changes (new paths can be added, existing paths cannot be removed) to support the supervisor's enrichment sync-back.

Includes 2 E2E tests: BFS-1 (missing filesystem_policy) and BFS-2 (incomplete filesystem_policy).

Fixes NVIDIA#161

* fix(e2e): update assertion for relaxed filesystem validation message

---------

Co-authored-by: John Myers <johntmyers@users.noreply.github.com>
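To make the forward-proxy gate in the commit message above concrete, here is a minimal Python sketch of the three conditions and of the origin-form rewrite. It is illustrative only and is not the project's implementation: the function names `is_rfc1918`, `forward_allowed`, and `rewrite_to_origin_form`, the boolean `policy_allows` input standing in for the OPA decision, and the `Via` pseudonym `sandbox-proxy` are all assumptions made for this example.

```python
import ipaddress
from urllib.parse import urlsplit

# The three private IPv4 ranges defined by RFC 1918.
RFC1918_NETWORKS = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("172.16.0.0/12"),
    ipaddress.ip_network("192.168.0.0/16"),
]

# Hop-by-hop headers a proxy should not forward (Proxy-Connection is a
# common non-standard addition).
HOP_BY_HOP = {
    "connection", "keep-alive", "proxy-authenticate", "proxy-authorization",
    "te", "trailer", "transfer-encoding", "upgrade", "proxy-connection",
}


def is_rfc1918(ip):
    """Return True only for IPv4 addresses inside an RFC 1918 range."""
    addr = ipaddress.ip_address(ip)
    return addr.version == 4 and any(addr in net for net in RFC1918_NETWORKS)


def forward_allowed(policy_allows, allowed_ips, resolved_ips):
    """Apply the three conditions from the commit message.

    policy_allows is a hypothetical stand-in for the OPA decision;
    allowed_ips is the matched endpoint's configured list.
    """
    return (
        bool(policy_allows)
        and bool(allowed_ips)
        and bool(resolved_ips)
        and all(is_rfc1918(ip) for ip in resolved_ips)
    )


def rewrite_to_origin_form(method, uri, headers):
    """Turn an absolute-form request target into origin-form.

    Drops hop-by-hop headers, then appends Via and Connection: close.
    """
    parts = urlsplit(uri)
    if parts.scheme != "http" or not parts.hostname:
        raise ValueError("expected an absolute-form http:// URI")
    path = parts.path or "/"
    if parts.query:
        path += "?" + parts.query
    kept = [(k, v) for k, v in headers if k.lower() not in HOP_BY_HOP]
    kept.append(("Via", "1.1 sandbox-proxy"))  # hypothetical pseudonym
    kept.append(("Connection", "close"))
    return "{} {} HTTP/1.1".format(method, path), kept
```

Requiring every resolved address to be private, rather than any one of them, means a hostname that resolves to a mix of private and public addresses is rejected, which matches the stated goal that plain HTTP never reaches the public internet.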

vasanth53 and others added 30 commits March 17, 2026 14:56

feat: add shell completion for nemoclaw CLI

bdf50b8

Add completion command supporting bash, zsh, and fish shells. Completes issue NVIDIA#155.

feat: add sandbox export/import backup commands

ad1af8b

Add backup and restore functionality for sandboxes:
- nemoclaw backups - list all backups
- nemoclaw <name> export - export sandbox to backup file
- nemoclaw import <path> - import sandbox from backup
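The export/import round trip described in this commit can be pictured with a short Python sketch. It is a rough model only, not the code in this PR: the JSON layout, the field names (`name`, `model`, `provider`, `policies`), and the helpers `export_sandbox`, `list_backups`, and `import_sandbox` are assumptions for illustration. Only the default directory `~/.nemoclaw/backups` and the rename-on-import behavior come from the PR description.

```python
import json
from pathlib import Path

# Default location named in the PR description; the file layout is assumed.
DEFAULT_BACKUP_DIR = Path.home() / ".nemoclaw" / "backups"

# Hypothetical set of fields carried in a backup file.
BACKUP_FIELDS = ("name", "model", "provider", "policies")


def export_sandbox(sandbox, dest=None, backup_dir=DEFAULT_BACKUP_DIR):
    """Write the sandbox settings to a JSON file and return its path.

    With no explicit dest, the file goes to <backup_dir>/<name>.json.
    """
    target = Path(dest) if dest else Path(backup_dir) / (sandbox["name"] + ".json")
    target.parent.mkdir(parents=True, exist_ok=True)
    payload = {key: sandbox.get(key) for key in BACKUP_FIELDS}
    target.write_text(json.dumps(payload, indent=2))
    return target


def list_backups(backup_dir=DEFAULT_BACKUP_DIR):
    """Return the backup file names in backup_dir, sorted by name."""
    directory = Path(backup_dir)
    if not directory.is_dir():
        return []
    return sorted(p.name for p in directory.glob("*.json"))


def import_sandbox(path, new_name=None):
    """Load a backup file, optionally restoring it under a new name."""
    data = json.loads(Path(path).read_text())
    if new_name:
        data["name"] = new_name
    return data
```

Keeping the rename as an argument to the import step, rather than editing the backup file, leaves the exported file untouched so the same backup can be restored under several names.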

fix(docs): render MD note block properly (NVIDIA#113)

60e8cd0

Fixes the README.md's note block to render properly in github markdown.

chore: update contributing, small doc fixes, rename doc update skill (N…

9673eaa

…VIDIA#242) * update contributing and rename doc update skill * rewrite intro * add filler sentences * add CoC, small fix in README for docs * fix rendering of note * add a note

Add minimal coderabbit config (NVIDIA#234)

2a9afbc

docs: add hardware requirement (NVIDIA#256)

a28d4b3

minor doc updates (NVIDIA#252)

3df5589

fix: remove cgroup v2 preflight check (NVIDIA#248)

0d1d2d8

The cgroup v2 / cgroupns=host issue has been fixed on the OpenShell side, so NemoClaw no longer needs to check for it during onboarding. Removes bin/lib/preflight.js, its tests, and the check in onboard.js.

docs: document long-output CLI workaround (NVIDIA#327)

241ffb2

docs: refresh doc and start collecting troubleshooting items in a sin…

417b4bf

…gular place (NVIDIA#331) * doc refresh and start troubleshooting page * refresh doc and start collecting troubleshooting * openshell command fix

ci: add docs build workflow for PRs (NVIDIA#334)

2b5febe

update installer url for curl (NVIDIA#346)

5dbf8bb

Create bug_report.yml (NVIDIA#371)

39112f9

* Create bug_report.yml * Update bug_report.yml

wscurran and others added 19 commits March 18, 2026 17:54

Create feature_request.yml (NVIDIA#372)

9513eca

Reattach curl installer onboarding to tty (NVIDIA#377)

f3430c6

* reattach curl installer onboarding to tty * update installer tests for non-interactive tty guard * open tty explicitly for piped onboarding * run stdin installer test in non-interactive mode

docs: fix the ref link to spark instruction, minor style fixes, add R…

20a63d0

…EADME to docs workflow (NVIDIA#392) * fix link and add README to docs workflow * workaround for rendering correctly in both raw md and myst

chore: add coderabbit instructions for docs (NVIDIA#421)

6cca75e

* add coderabbit instructions for docs * coderabbit suggestion

docs: add discord channel to docs site (NVIDIA#416)

60992be

* add discord channel to docs * shorter preview text * add discord link to more places * resolve warning * use invite link * update url * nit

docs: add uninstall section to quickstart in README (NVIDIA#423)

12fc0c0

* add uninstall section to quickstart * add link to the script * remove $ from code blocks and explain which terminal in text

docs move uninstall command to readme uninstall section (NVIDIA#435)

6ca7d37

docs add uninstall curl flags example to readme (NVIDIA#440)

b3ae862

docs: update messaging (NVIDIA#432)

3e7eb71

* update messaging * bring the intro sentence back

docs: remove duplicated content in README (NVIDIA#443)

16fb337

* docs: remove duplicated content in README Rephrase note about using CLI for long responses. * rm link

chore: add pr template (NVIDIA#449)

be92f71

* add pr template * rm redundant item * improve

chore: add issue templates and improve PR template (NVIDIA#451)

dbfd78c

* improve * add issue templates and improve pr template

wscurran added the enhancement: feature and NemoClaw CLI labels Mar 19, 2026

wscurran added the priority: medium label Mar 19, 2026

vasanth53 added 3 commits March 20, 2026 10:10

feat: add shell completion for nemoclaw CLI

c8b7fc3

Add completion command supporting bash, zsh, and fish shells. Completes issue NVIDIA#155.

feat: add sandbox export/import backup commands

c88aea8

Add backup and restore functionality for sandboxes:
- nemoclaw backups - list all backups
- nemoclaw <name> export - export sandbox to backup file
- nemoclaw import <path> - import sandbox from backup

Merge remote-tracking branch 'fork/feat/sandbox-backup' into feat/san…

4e34e5c

…dbox-backup # Conflicts: # bin/nemoclaw.js

vasanth53 closed this Mar 20, 2026


feat: add sandbox export/import backup commands #161
vasanth53 wants to merge 52 commits into NVIDIA:main from vasanth53:feat/sandbox-backup

14 participants
