Skip to content

docs: add host-side config and state file locations to README#903

Merged
cv merged 1 commit intoNVIDIA:mainfrom
WuKongAI-CMU:docs/494-config-locations
Mar 26, 2026
Merged

docs: add host-side config and state file locations to README#903
cv merged 1 commit intoNVIDIA:mainfrom
WuKongAI-CMU:docs/494-config-locations

Conversation

@WuKongAI-CMU
Copy link
Copy Markdown
Contributor

@WuKongAI-CMU WuKongAI-CMU commented Mar 25, 2026

Summary

  • add a README section that points users to the main host-side NemoClaw state files
  • call out the optional service environment variables users are most likely to need
  • steer reconfiguration back toward nemoclaw onboard instead of manual edits

Testing

  • not run (docs-only change)

Fixes #494

Summary by CodeRabbit

  • Documentation
    • Added "Host-Side State and Config" section explaining where NemoClaw stores operator configuration and state files on the host system.
    • Documented common environment variables for optional services and configuration.
    • Clarified recommended setup and reconfiguration procedures using the onboard command.

Signed-off-by: peteryuqin <peter.yuqin@gmail.com>
@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai bot commented Mar 25, 2026

📝 Walkthrough

Walkthrough

Adds a new documentation section to README.md explaining host-side state and configuration for NemoClaw. Documents three host file paths (~/.nemoclaw/credentials.json, ~/.nemoclaw/sandboxes.json, ~/.openclaw/openclaw.json), common environment variables, and recommends using nemoclaw onboard for setup rather than manual file editing.

Changes

Cohort / File(s) Summary
Documentation
README.md
Added "Host-Side State and Config" section documenting operator-facing state storage locations, host file paths with purposes, common environment variables, and configuration workflow recommendations.

Estimated code review effort

🎯 1 (Trivial) | ⏱️ ~3 minutes

Poem

🐰✨ A fluffy README update hops into view,
Config files documented, shiny and new!
No tangled secrets in sandbox walls—
Just home directory files that answer all calls!
Onboard with ease, the docs now say,
For operators, configuration's just a hop away! 🎉

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Linked Issues check ⚠️ Warning The PR addresses issue #494 but incompletely: it documents file locations and environment variables, but omits the JSON example snippet and port configuration instructions from the original requirements. Add JSON example showing how to configure channels.signal.httpPort and instructions for creating the file if it doesn't exist, as specified in issue #494.
✅ Passed checks (4 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title clearly and accurately describes the main change: adding documentation about host-side configuration and state file locations to the README.
Out of Scope Changes check ✅ Passed All changes are directly related to documenting host-side configuration and state file locations as required by issue #494; no unrelated changes detected.
Docstring Coverage ✅ Passed No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (1)
README.md (1)

205-218: Consider adding practical examples and workflow guidance.

The section successfully documents file locations, but issue #494 requested additional details that would help users actually work with these files:

  1. JSON example: Show a snippet of ~/.openclaw/openclaw.json structure with a practical example (e.g., setting channels.signal.httpPort to 9090)
  2. File creation: Note whether users need to create these files manually or if they're auto-generated
  3. Environment variables: Explain where to set the listed env vars (shell profile, systemd service, etc.)
  4. Post-change workflow: Clarify what happens after manual edits—does the user need to restart services, re-run nemoclaw onboard, or something else?

The current guidance "prefer nemoclaw onboard over editing these files by hand" is good advice, but users who do need manual edits would benefit from understanding the structure and consequences.

📝 Example enhancement

Consider adding after line 214:

Most of these files are created automatically during `nemoclaw onboard`. If you need to manually edit `~/.openclaw/openclaw.json`, it follows this structure:

\```json
{
  "channels": {
    "signal": {
      "httpPort": 9090
    }
  }
}
\```

After manual configuration changes, restart the relevant NemoClaw services with `nemoclaw stop` and `nemoclaw start`.

For environment variables, you might add after line 216:

Set these in your shell profile (`~/.bashrc` or `~/.zshrc`) or pass them when launching services:

\```bash
export TELEGRAM_BOT_TOKEN="your-token"
export ALLOWED_CHAT_IDS="123456,789012"
\```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@README.md` around lines 205 - 218, Add practical examples and workflow
guidance for the host-side files: include a short JSON example showing the
structure of ~/.openclaw/openclaw.json (e.g., channels.signal.httpPort set to
9090), state whether ~/.nemoclaw/credentials.json and ~/.nemoclaw/sandboxes.json
and ~/.openclaw/openclaw.json are auto-created by the onboarding flow or must be
created manually (mention that most are created by running the comando 'nemoclaw
onboard'), document where to set TELEGRAM_BOT_TOKEN, ALLOWED_CHAT_IDS and
CHAT_UI_URL (shell profiles like ~/.bashrc, ~/.zshrc or systemd unit
environment) and add a clear post-edit workflow (e.g., after manual edits run
'nemoclaw stop' and 'nemoclaw start' or re-run 'nemoclaw onboard' as
appropriate) so users know the required restart/reload steps.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Nitpick comments:
In `@README.md`:
- Around line 205-218: Add practical examples and workflow guidance for the
host-side files: include a short JSON example showing the structure of
~/.openclaw/openclaw.json (e.g., channels.signal.httpPort set to 9090), state
whether ~/.nemoclaw/credentials.json and ~/.nemoclaw/sandboxes.json and
~/.openclaw/openclaw.json are auto-created by the onboarding flow or must be
created manually (mention that most are created by running the comando 'nemoclaw
onboard'), document where to set TELEGRAM_BOT_TOKEN, ALLOWED_CHAT_IDS and
CHAT_UI_URL (shell profiles like ~/.bashrc, ~/.zshrc or systemd unit
environment) and add a clear post-edit workflow (e.g., after manual edits run
'nemoclaw stop' and 'nemoclaw start' or re-run 'nemoclaw onboard' as
appropriate) so users know the required restart/reload steps.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 7dc4e0c4-2118-4548-b132-d9c9622b4412

📥 Commits

Reviewing files that changed from the base of the PR and between 0ef5dd2 and fb4cb2a.

📒 Files selected for processing (1)
  • README.md

@WuKongAI-CMU
Copy link
Copy Markdown
Contributor Author

This is still approved on my side and I confirmed there are no follow-up changes pending from me. It is currently blocked by the base-branch merge policy for non-maintainers, so it looks ready for maintainer merge when convenient.

@WuKongAI-CMU
Copy link
Copy Markdown
Contributor Author

Maintainer follow-up: this branch is still current, remains mergeable, and the host-side config/state path documentation is already in shape for merge. If there is any remaining policy or wording concern, I can patch it quickly.

@cv cv merged commit 7970b7a into NVIDIA:main Mar 26, 2026
11 of 12 checks passed
TSavo pushed a commit to wopr-network/nemoclaw that referenced this pull request Mar 27, 2026
* fix: improve gateway lifecycle recovery (NVIDIA#953)

* fix: improve gateway lifecycle recovery

* docs: fix readme markdown list spacing

* fix: tighten gateway lifecycle review follow-ups

* fix: simplify tokenized control ui output

* fix: restore chat route in control ui urls

* refactor: simplify ansi stripping in onboard

* fix: shorten control ui url output

* fix: move control ui below cli next steps

* fix: swap hard/soft ulimit settings in start script (NVIDIA#951)

Fixes NVIDIA#949

Co-authored-by: KJ <kejones@nvidia.com>

* chore: add cyclomatic complexity lint rule (NVIDIA#875)

* chore: add cyclomatic complexity rule (ratchet from 95)

Add ESLint complexity rule to bin/ and scripts/ to prevent new
functions from accumulating excessive branching. Starting threshold
is 95 (current worst offender: setupNim in onboard.js). Ratchet
plan: 95 → 40 → 25 → 15.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* chore: ratchet complexity to 20, suppress existing violations

Suppress 6 functions that exceed the threshold with eslint-disable
comments so we can start enforcing at 20 instead of 95:

- setupNim (95), setupPolicies (41), setupInference (22) in onboard.js
- deploy (22), main IIFE (27) in nemoclaw.js
- applyPreset (24) in policies.js

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* chore: suppress complexity for 3 missed functions

preflight (23), getReconciledSandboxGatewayState (25), sandboxStatus (27)

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* docs: add host-side config and state file locations to README (NVIDIA#903)

Signed-off-by: peteryuqin <peter.yuqin@gmail.com>

* chore: add tsconfig.cli.json, root execa, TS coverage ratchet (NVIDIA#913)

* chore: add tsconfig.cli.json, root execa, TS coverage ratchet

Foundation for the CLI TypeScript migration (PR 0 of the shell
consolidation plan). No runtime changes — config, tooling, and
dependency only.

- tsconfig.cli.json: strict TS type-checking for bin/ and scripts/
  (noEmit, module: preserve — tsx handles the runtime)
- scripts/check-coverage-ratchet.ts: pure TS replacement for the
  bash+python coverage ratchet script (same logic, same tolerance)
- execa ^9.6.1 added to root devDependencies (used by PR 1+)
- pr.yaml: coverage ratchet step now runs the TS version via tsx
- .pre-commit-config.yaml: SPDX headers cover scripts/*.ts,
  new tsc-check-cli pre-push hook
- CONTRIBUTING.md: document typecheck:cli task and CLI pre-push hook
- Delete scripts/check-coverage-ratchet.sh

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* Apply suggestion from @brandonpelfrey

* chore: address PR feedback — use types_or, add tsx devDep

- Use `types_or: [ts, tsx]` instead of file glob for tsc-check-cli
  hook per @brandonpelfrey's suggestion.
- Add `tsx` to devDependencies so CI doesn't re-fetch it on every run
  per CodeRabbit's suggestion.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix(ci): ignore GitHub "Apply suggestion" commits in commitlint

* fix(ci): lint only PR title since repo is squash-merge only

Reverts the commitlint ignores rule from the previous commit and
instead removes the per-commit lint step entirely.

Individual commit messages are discarded at merge time — only the
squash-merged PR title lands in main and drives changelog generation.
Drop the per-commit lint, keep the PR title check, and remove the
now-unnecessary fetch-depth: 0.

* Revert "fix(ci): lint only PR title since repo is squash-merge only"

This reverts commit 1257a47.

* Revert "fix(ci): ignore GitHub "Apply suggestion" commits in commitlint"

This reverts commit c395657.

* docs: fix markdownlint MD032 in README (blank line before list)

* refactor: make coverage ratchet script idiomatic TypeScript

- Wrap in main() with process.exitCode instead of scattered process.exit()
- Replace mutable flags with .map()/.some() over typed MetricResult[]
- Separate pure logic (checkMetrics) from formatting (formatReport)
- Throw with { cause } chaining instead of exit-in-helpers
- Derive CoverageThresholds from METRICS tuple (single source of truth)
- Exhaustive switch on CheckStatus discriminated union

* refactor: remove duplication in coverage ratchet script

- Drop STATUS_LABELS map; inline labels in exhaustive switch
- Extract common 'metric coverage is N%' preamble in formatResult
- Simplify ratchetedThresholds: use results directly (already in
  METRICS order) instead of re-scanning with .find() per metric
- Compute 'failed' once in main, pass into formatReport to avoid
  duplicate .some() scan

* refactor: simplify coverage ratchet with FP patterns

- Extract classify() as a named pure function (replaces nested ternary)
- loadJSON takes repo-relative paths, eliminating THRESHOLD_PATH and
  SUMMARY_PATH constants (DRY the join-with-REPO_ROOT pattern)
- Drop CoverageMetric/CoverageSummary interfaces (only pct is read);
  use structural type at the call site instead
- Inline ratchetedThresholds (one-liner, used once)
- formatReport derives fail/improved from results instead of taking
  a pre-computed boolean (let functions derive from data, don't
  thread derived state)
- sections.join("\n\n") replaces manual empty-string pushing
- Shorter type names (Thresholds, Status, Result) — no ambiguity
  in a single-purpose script

* refactor: strip coverage ratchet to failure-only output

prek hides output from commands that exit 0, so ok/improved
reporting was dead code. Remove Status, Result, classify,
formatResult, formatReport, and the ratcheted-thresholds
suggestion block. The script now just filters for regressions
and prints actionable errors on failure.

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-authored-by: Brandon Pelfrey <bpelfrey@nvidia.com>

* fix: use CONNECT tunnel for WebSocket endpoints in Discord/Slack presets (NVIDIA#438)

* fix: use CONNECT tunnel for WebSocket endpoints in Discord/Slack presets

The egress proxy's HTTP idle timeout (~2 min) kills long-lived WebSocket
connections when endpoints are configured with protocol:rest + tls:terminate.
Switch WebSocket endpoints to access:full (CONNECT tunnel) which bypasses
HTTP-level timeouts entirely.

Discord:
- gateway.discord.gg → access:full (WebSocket gateway)
- Add PUT/PATCH/DELETE methods for discord.com (message editing, reactions)
- Add media.discordapp.net for attachment access

Slack:
- Add wss-primary.slack.com and wss-backup.slack.com → access:full
  (Socket Mode WebSocket endpoints)

Partially addresses NVIDIA#409 — the policy-level fix enables WebSocket
connections to survive. The hardcoded 2-min timeout in openshell-sandbox
still affects any protocol:rest endpoints with long-lived connections.

Related: NVIDIA#361 (WhatsApp Web, same root cause)

* fix: correct comment wording for media endpoint and YAML formatting

* fix: standardize Node.js minimum version to 22.16 (NVIDIA#840)

* fix: remove unused RECOMMENDED_NODE_MAJOR from scripts/install.sh

Shellcheck flagged it as unused after the min/recommended merge.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: enforce full semver >=22.16.0 in installer scripts

The runtime checks only compared the major Node.js version, allowing
22.0–22.15 to pass despite package.json requiring >=22.16.0. Use the
version_gte() helper for full semver comparison in both installers.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: harden version_gte and align fallback message

Guard version_gte() against prerelease suffixes (e.g. "22.16.0-rc.1")
that would crash bash arithmetic. Also update the manual-install
fallback message to reference MIN_NODE_VERSION instead of hardcoded "22".

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: update test stubs for Node.js 22.16 minimum and add Node 20 rejection test

- Bump node stub in 'succeeds with acceptable Node.js' from v20.0.0 to v22.16.0
- Bump node stub in buildCurlPipeEnv from v22.14.0 to v22.16.0
- Add new test asserting Node.js 20 is rejected by ensure_supported_runtime

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Signed-off-by: peteryuqin <peter.yuqin@gmail.com>
Co-authored-by: KJ <kejones@nvidia.com>
Co-authored-by: Emily Wilkins <80470879+epwilkins@users.noreply.github.com>
Co-authored-by: Carlos Villela <cvillela@nvidia.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-authored-by: Peter <peter.yuqin@gmail.com>
Co-authored-by: Brandon Pelfrey <bpelfrey@nvidia.com>
Co-authored-by: Benedikt Schackenberg <6381261+BenediktSchackenberg@users.noreply.github.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
TSavo pushed a commit to wopr-network/nemoclaw that referenced this pull request Mar 28, 2026
* fix: improve gateway lifecycle recovery (NVIDIA#953)

* fix: improve gateway lifecycle recovery

* docs: fix readme markdown list spacing

* fix: tighten gateway lifecycle review follow-ups

* fix: simplify tokenized control ui output

* fix: restore chat route in control ui urls

* refactor: simplify ansi stripping in onboard

* fix: shorten control ui url output

* fix: move control ui below cli next steps

* fix: swap hard/soft ulimit settings in start script (NVIDIA#951)

Fixes NVIDIA#949

Co-authored-by: KJ <kejones@nvidia.com>

* chore: add cyclomatic complexity lint rule (NVIDIA#875)

* chore: add cyclomatic complexity rule (ratchet from 95)

Add ESLint complexity rule to bin/ and scripts/ to prevent new
functions from accumulating excessive branching. Starting threshold
is 95 (current worst offender: setupNim in onboard.js). Ratchet
plan: 95 → 40 → 25 → 15.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* chore: ratchet complexity to 20, suppress existing violations

Suppress 6 functions that exceed the threshold with eslint-disable
comments so we can start enforcing at 20 instead of 95:

- setupNim (95), setupPolicies (41), setupInference (22) in onboard.js
- deploy (22), main IIFE (27) in nemoclaw.js
- applyPreset (24) in policies.js

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* chore: suppress complexity for 3 missed functions

preflight (23), getReconciledSandboxGatewayState (25), sandboxStatus (27)

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* docs: add host-side config and state file locations to README (NVIDIA#903)

Signed-off-by: peteryuqin <peter.yuqin@gmail.com>

* chore: add tsconfig.cli.json, root execa, TS coverage ratchet (NVIDIA#913)

* chore: add tsconfig.cli.json, root execa, TS coverage ratchet

Foundation for the CLI TypeScript migration (PR 0 of the shell
consolidation plan). No runtime changes — config, tooling, and
dependency only.

- tsconfig.cli.json: strict TS type-checking for bin/ and scripts/
  (noEmit, module: preserve — tsx handles the runtime)
- scripts/check-coverage-ratchet.ts: pure TS replacement for the
  bash+python coverage ratchet script (same logic, same tolerance)
- execa ^9.6.1 added to root devDependencies (used by PR 1+)
- pr.yaml: coverage ratchet step now runs the TS version via tsx
- .pre-commit-config.yaml: SPDX headers cover scripts/*.ts,
  new tsc-check-cli pre-push hook
- CONTRIBUTING.md: document typecheck:cli task and CLI pre-push hook
- Delete scripts/check-coverage-ratchet.sh

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* Apply suggestion from @brandonpelfrey

* chore: address PR feedback — use types_or, add tsx devDep

- Use `types_or: [ts, tsx]` instead of file glob for tsc-check-cli
  hook per @brandonpelfrey's suggestion.
- Add `tsx` to devDependencies so CI doesn't re-fetch it on every run
  per CodeRabbit's suggestion.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix(ci): ignore GitHub "Apply suggestion" commits in commitlint

* fix(ci): lint only PR title since repo is squash-merge only

Reverts the commitlint ignores rule from the previous commit and
instead removes the per-commit lint step entirely.

Individual commit messages are discarded at merge time — only the
squash-merged PR title lands in main and drives changelog generation.
Drop the per-commit lint, keep the PR title check, and remove the
now-unnecessary fetch-depth: 0.

* Revert "fix(ci): lint only PR title since repo is squash-merge only"

This reverts commit 1257a47.

* Revert "fix(ci): ignore GitHub "Apply suggestion" commits in commitlint"

This reverts commit c395657.

* docs: fix markdownlint MD032 in README (blank line before list)

* refactor: make coverage ratchet script idiomatic TypeScript

- Wrap in main() with process.exitCode instead of scattered process.exit()
- Replace mutable flags with .map()/.some() over typed MetricResult[]
- Separate pure logic (checkMetrics) from formatting (formatReport)
- Throw with { cause } chaining instead of exit-in-helpers
- Derive CoverageThresholds from METRICS tuple (single source of truth)
- Exhaustive switch on CheckStatus discriminated union

* refactor: remove duplication in coverage ratchet script

- Drop STATUS_LABELS map; inline labels in exhaustive switch
- Extract common 'metric coverage is N%' preamble in formatResult
- Simplify ratchetedThresholds: use results directly (already in
  METRICS order) instead of re-scanning with .find() per metric
- Compute 'failed' once in main, pass into formatReport to avoid
  duplicate .some() scan

* refactor: simplify coverage ratchet with FP patterns

- Extract classify() as a named pure function (replaces nested ternary)
- loadJSON takes repo-relative paths, eliminating THRESHOLD_PATH and
  SUMMARY_PATH constants (DRY the join-with-REPO_ROOT pattern)
- Drop CoverageMetric/CoverageSummary interfaces (only pct is read);
  use structural type at the call site instead
- Inline ratchetedThresholds (one-liner, used once)
- formatReport derives fail/improved from results instead of taking
  a pre-computed boolean (let functions derive from data, don't
  thread derived state)
- sections.join("\n\n") replaces manual empty-string pushing
- Shorter type names (Thresholds, Status, Result) — no ambiguity
  in a single-purpose script

* refactor: strip coverage ratchet to failure-only output

prek hides output from commands that exit 0, so ok/improved
reporting was dead code. Remove Status, Result, classify,
formatResult, formatReport, and the ratcheted-thresholds
suggestion block. The script now just filters for regressions
and prints actionable errors on failure.

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-authored-by: Brandon Pelfrey <bpelfrey@nvidia.com>

* fix: use CONNECT tunnel for WebSocket endpoints in Discord/Slack presets (NVIDIA#438)

* fix: use CONNECT tunnel for WebSocket endpoints in Discord/Slack presets

The egress proxy's HTTP idle timeout (~2 min) kills long-lived WebSocket
connections when endpoints are configured with protocol:rest + tls:terminate.
Switch WebSocket endpoints to access:full (CONNECT tunnel) which bypasses
HTTP-level timeouts entirely.

Discord:
- gateway.discord.gg → access:full (WebSocket gateway)
- Add PUT/PATCH/DELETE methods for discord.com (message editing, reactions)
- Add media.discordapp.net for attachment access

Slack:
- Add wss-primary.slack.com and wss-backup.slack.com → access:full
  (Socket Mode WebSocket endpoints)

Partially addresses NVIDIA#409 — the policy-level fix enables WebSocket
connections to survive. The hardcoded 2-min timeout in openshell-sandbox
still affects any protocol:rest endpoints with long-lived connections.

Related: NVIDIA#361 (WhatsApp Web, same root cause)

* fix: correct comment wording for media endpoint and YAML formatting

* fix: standardize Node.js minimum version to 22.16 (NVIDIA#840)

* fix: remove unused RECOMMENDED_NODE_MAJOR from scripts/install.sh

Shellcheck flagged it as unused after the min/recommended merge.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: enforce full semver >=22.16.0 in installer scripts

The runtime checks only compared the major Node.js version, allowing
22.0–22.15 to pass despite package.json requiring >=22.16.0. Use the
version_gte() helper for full semver comparison in both installers.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: harden version_gte and align fallback message

Guard version_gte() against prerelease suffixes (e.g. "22.16.0-rc.1")
that would crash bash arithmetic. Also update the manual-install
fallback message to reference MIN_NODE_VERSION instead of hardcoded "22".

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: update test stubs for Node.js 22.16 minimum and add Node 20 rejection test

- Bump node stub in 'succeeds with acceptable Node.js' from v20.0.0 to v22.16.0
- Bump node stub in buildCurlPipeEnv from v22.14.0 to v22.16.0
- Add new test asserting Node.js 20 is rejected by ensure_supported_runtime

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: harden installer and onboard resiliency (NVIDIA#961)

* fix: harden installer and onboard resiliency

* fix: address installer and debug review follow-ups

* fix: harden onboard resume across later setup steps

* test: simplify payload extraction in onboard tests

* test: keep onboard payload extraction target-compatible

* chore: align onboard session lint with complexity rule

* fix: harden onboard session safety and lock handling

* fix: tighten onboard session redaction and metadata handling

* fix(security): strip credentials from migration snapshots and enforce blueprint digest (NVIDIA#769)

Reconciles NVIDIA#156 and NVIDIA#743 into a single comprehensive solution:

- Filter auth-profiles.json at copy time via cpSync filter (from NVIDIA#743)
- Recursive stripCredentials() with pattern-based field detection for
  deep config sanitization (from NVIDIA#156: CREDENTIAL_FIELDS set +
  CREDENTIAL_FIELD_PATTERN regex)
- Remove gateway config section (contains auth tokens) from sandbox
  openclaw.json
- Blueprint digest verification (SHA-256): recorded at snapshot time,
  validated on restore, empty/missing digest is a hard failure
- computeFileDigest() throws when blueprint file is missing instead of
  silently returning null
- Sanitize both snapshot-level and sandbox-bundle openclaw.json copies
- Backward compatible: old snapshots without blueprintDigest skip
  validation
- Bump SNAPSHOT_VERSION 2 → 3

Supersedes NVIDIA#156 and NVIDIA#743.

* fix(sandbox): export proxy env vars with full NO_PROXY and persist across reconnects (NVIDIA#1025)

* fix(sandbox): export proxy env vars with full NO_PROXY and persist across reconnects

OpenShell injects NO_PROXY=127.0.0.1,localhost,::1 into the sandbox, missing
inference.local and the gateway IP (10.200.0.1). This causes LLM inference
requests to route through the egress proxy instead of going direct, and the
proxy gateway IP itself gets proxied.

Add proxy configuration block to nemoclaw-start.sh that:
- Exports HTTP_PROXY, HTTPS_PROXY, and NO_PROXY with inference.local and
  the gateway IP included
- Persists via /etc/profile.d/nemoclaw-proxy.sh (root) or ~/.profile
  (non-root fallback) so values survive OpenShell reconnect injection
- Supports NEMOCLAW_PROXY_HOST / NEMOCLAW_PROXY_PORT overrides

The non-root fallback ensures the fix works in environments like Brev where
containers run without root privileges.

Tested on DGX Spark (ARM64) and Brev VM (x86_64). Verified NO_PROXY contains
inference.local and 10.200.0.1 inside the live sandbox after connect.

Ref: NVIDIA#626, NVIDIA#704
Ref: NVIDIA#704 (comment)

* fix(sandbox): write proxy config to ~/.bashrc for interactive reconnect sessions

OpenShell's `sandbox connect` spawns `/bin/bash -i` (interactive, non-login),
which sources ~/.bashrc — not ~/.profile or /etc/profile.d/*.  The previous
approach wrote to ~/.profile and /etc/profile.d/, neither of which is sourced
by `bash -i`, so the narrow OpenShell-injected NO_PROXY persisted in live
interactive sessions.

Changes:
- Write proxy snippet to ~/.bashrc (primary) and ~/.profile (login fallback)
- Export both uppercase and lowercase proxy variants (NO_PROXY + no_proxy,
  HTTP_PROXY + http_proxy, etc.) — Node.js undici prefers lowercase no_proxy
  over uppercase NO_PROXY when both are set
- Add idempotency guard to prevent duplicate blocks on container restart
- Update tests: verify .bashrc writing, idempotency, bash -i override
  behavior, and lowercase variant correctness

Tested on DGX Spark (ARM64) and Brev VM (x86_64) with full destroy +
re-onboard + live `env | grep proxy` verification inside the sandbox shell
via `openshell sandbox connect`.

Ref: NVIDIA#626

* fix(sandbox): replace stale proxy values on restart with begin/end markers

Use begin/end markers in .bashrc/.profile proxy snippet so
_write_proxy_snippet replaces the block when PROXY_HOST/PORT change
instead of silently keeping stale values. Adds test coverage for the
replacement path.

Addresses CodeRabbit review feedback on idempotency gap.

* fix(sandbox): resolve sandbox user home dynamically when running as root

When the entrypoint runs as root, $HOME is /root — the proxy snippet
was written to /root/.bashrc instead of the sandbox user's home.
Use getent passwd to look up the sandbox user's home when running as
UID 0; fall back to /sandbox if the user entry is missing.

Addresses CodeRabbit review feedback on _SANDBOX_HOME resolution.

---------

Co-authored-by: Carlos Villela <cvillela@nvidia.com>

* fix(policies): preset application for versionless policies (Fixes NVIDIA#35) (NVIDIA#101)

* fix(policies): allow preset application for versionless policies (Fixes NVIDIA#35)

Fixes NVIDIA#35

Signed-off-by: Deepak Jain <deepujain@gmail.com>

* fix: remove stale complexity suppression in policies

---------

Signed-off-by: Deepak Jain <deepujain@gmail.com>
Co-authored-by: Kevin Jones <kejones@nvidia.com>

* fix: restore routed inference and connect UX (NVIDIA#1037)

* fix: restore routed inference and connect UX

* fix: simplify detected local inference hint

* fix: remove stale local inference hint

* test: relax connect forward assertion

---------

Signed-off-by: peteryuqin <peter.yuqin@gmail.com>
Signed-off-by: Deepak Jain <deepujain@gmail.com>
Co-authored-by: KJ <kejones@nvidia.com>
Co-authored-by: Emily Wilkins <80470879+epwilkins@users.noreply.github.com>
Co-authored-by: Carlos Villela <cvillela@nvidia.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-authored-by: Peter <peter.yuqin@gmail.com>
Co-authored-by: Brandon Pelfrey <bpelfrey@nvidia.com>
Co-authored-by: Benedikt Schackenberg <6381261+BenediktSchackenberg@users.noreply.github.com>
Co-authored-by: Lucas Wang <lucas_wang@lucas-futures.com>
Co-authored-by: senthilr-nv <senthilr@nvidia.com>
Co-authored-by: Deepak Jain <deepujain@users.noreply.github.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
laitingsheng pushed a commit that referenced this pull request Apr 2, 2026
Signed-off-by: peteryuqin <peter.yuqin@gmail.com>
lakamsani pushed a commit to lakamsani/NemoClaw that referenced this pull request Apr 4, 2026
…#903)

Signed-off-by: peteryuqin <peter.yuqin@gmail.com>
gemini2026 pushed a commit to gemini2026/NemoClaw that referenced this pull request Apr 14, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Add config / environmentals to README.md

2 participants