fix: restore routed inference and connect UX#1037
Conversation
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: Path: .coderabbit.yaml Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (1)
🚧 Files skipped from review as they are similar to previous changes (1)
📝 WalkthroughWalkthroughRemoved the automatic background OpenShell port-forward before sandbox connect and removed Changes
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes Poem
🚥 Pre-merge checks | ✅ 3✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Comment |
There was a problem hiding this comment.
🧹 Nitpick comments (1)
test/cli.test.js (1)
205-205: Broaden the negative assertion to avoid sandbox-name coupling.The current check only rejects
forward start --background 18789 alpha. Consider rejecting anyforward start --background 18789invocation so the test still fails if argument ordering/sandbox resolution changes.Suggested tweak
- expect(calls).not.toContain("forward start --background 18789 alpha"); + expect(calls).not.toContain("forward start --background 18789");🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@test/cli.test.js` at line 205, Replace the exact-string negative assertion so the test rejects any invocation beginning with the forward command and port. Instead of expect(calls).not.toContain("forward start --background 18789 alpha"), assert that no entry in the calls array starts with "forward start --background 18789" (for example: expect(calls.some(c => c.startsWith("forward start --background 18789"))).toBe(false)), targeting the existing calls variable to avoid coupling to the sandbox name or argument ordering.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@test/cli.test.js`:
- Line 205: Replace the exact-string negative assertion so the test rejects any
invocation beginning with the forward command and port. Instead of
expect(calls).not.toContain("forward start --background 18789 alpha"), assert
that no entry in the calls array starts with "forward start --background 18789"
(for example: expect(calls.some(c => c.startsWith("forward start --background
18789"))).toBe(false)), targeting the existing calls variable to avoid coupling
to the sandbox name or argument ordering.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: dd6d8b9a-1560-4038-aac8-3e5670a4d049
📒 Files selected for processing (4)
bin/nemoclaw.jsscripts/nemoclaw-start.shtest/cli.test.jstest/service-env.test.js
💤 Files with no reviewable changes (1)
- bin/nemoclaw.js
ericksoa
left a comment
There was a problem hiding this comment.
Reviewed for regressions. NO_PROXY fix is correct and well-tested — inference.local should route through the OpenShell proxy, not bypass it. Port forward removal and prompt cleanup look clean. One note: confirm OpenShell's sandbox connect reliably re-establishes port 18789 forwarding on reconnect (not just first connect). Otherwise, ship it.
* fix: improve gateway lifecycle recovery (NVIDIA#953) * fix: improve gateway lifecycle recovery * docs: fix readme markdown list spacing * fix: tighten gateway lifecycle review follow-ups * fix: simplify tokenized control ui output * fix: restore chat route in control ui urls * refactor: simplify ansi stripping in onboard * fix: shorten control ui url output * fix: move control ui below cli next steps * fix: swap hard/soft ulimit settings in start script (NVIDIA#951) Fixes NVIDIA#949 Co-authored-by: KJ <kejones@nvidia.com> * chore: add cyclomatic complexity lint rule (NVIDIA#875) * chore: add cyclomatic complexity rule (ratchet from 95) Add ESLint complexity rule to bin/ and scripts/ to prevent new functions from accumulating excessive branching. Starting threshold is 95 (current worst offender: setupNim in onboard.js). Ratchet plan: 95 → 40 → 25 → 15. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * chore: ratchet complexity to 20, suppress existing violations Suppress 6 functions that exceed the threshold with eslint-disable comments so we can start enforcing at 20 instead of 95: - setupNim (95), setupPolicies (41), setupInference (22) in onboard.js - deploy (22), main IIFE (27) in nemoclaw.js - applyPreset (24) in policies.js Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * chore: suppress complexity for 3 missed functions preflight (23), getReconciledSandboxGatewayState (25), sandboxStatus (27) --------- Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * docs: add host-side config and state file locations to README (NVIDIA#903) Signed-off-by: peteryuqin <peter.yuqin@gmail.com> * chore: add tsconfig.cli.json, root execa, TS coverage ratchet (NVIDIA#913) * chore: add tsconfig.cli.json, root execa, TS coverage ratchet Foundation for the CLI TypeScript migration (PR 0 of the shell consolidation plan). No runtime changes — config, tooling, and dependency only. - tsconfig.cli.json: strict TS type-checking for bin/ and scripts/ (noEmit, module: preserve — tsx handles the runtime) - scripts/check-coverage-ratchet.ts: pure TS replacement for the bash+python coverage ratchet script (same logic, same tolerance) - execa ^9.6.1 added to root devDependencies (used by PR 1+) - pr.yaml: coverage ratchet step now runs the TS version via tsx - .pre-commit-config.yaml: SPDX headers cover scripts/*.ts, new tsc-check-cli pre-push hook - CONTRIBUTING.md: document typecheck:cli task and CLI pre-push hook - Delete scripts/check-coverage-ratchet.sh Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * Apply suggestion from @brandonpelfrey * chore: address PR feedback — use types_or, add tsx devDep - Use `types_or: [ts, tsx]` instead of file glob for tsc-check-cli hook per @brandonpelfrey's suggestion. - Add `tsx` to devDependencies so CI doesn't re-fetch it on every run per CodeRabbit's suggestion. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix(ci): ignore GitHub "Apply suggestion" commits in commitlint * fix(ci): lint only PR title since repo is squash-merge only Reverts the commitlint ignores rule from the previous commit and instead removes the per-commit lint step entirely. Individual commit messages are discarded at merge time — only the squash-merged PR title lands in main and drives changelog generation. Drop the per-commit lint, keep the PR title check, and remove the now-unnecessary fetch-depth: 0. * Revert "fix(ci): lint only PR title since repo is squash-merge only" This reverts commit 1257a47. * Revert "fix(ci): ignore GitHub "Apply suggestion" commits in commitlint" This reverts commit c395657. * docs: fix markdownlint MD032 in README (blank line before list) * refactor: make coverage ratchet script idiomatic TypeScript - Wrap in main() with process.exitCode instead of scattered process.exit() - Replace mutable flags with .map()/.some() over typed MetricResult[] - Separate pure logic (checkMetrics) from formatting (formatReport) - Throw with { cause } chaining instead of exit-in-helpers - Derive CoverageThresholds from METRICS tuple (single source of truth) - Exhaustive switch on CheckStatus discriminated union * refactor: remove duplication in coverage ratchet script - Drop STATUS_LABELS map; inline labels in exhaustive switch - Extract common 'metric coverage is N%' preamble in formatResult - Simplify ratchetedThresholds: use results directly (already in METRICS order) instead of re-scanning with .find() per metric - Compute 'failed' once in main, pass into formatReport to avoid duplicate .some() scan * refactor: simplify coverage ratchet with FP patterns - Extract classify() as a named pure function (replaces nested ternary) - loadJSON takes repo-relative paths, eliminating THRESHOLD_PATH and SUMMARY_PATH constants (DRY the join-with-REPO_ROOT pattern) - Drop CoverageMetric/CoverageSummary interfaces (only pct is read); use structural type at the call site instead - Inline ratchetedThresholds (one-liner, used once) - formatReport derives fail/improved from results instead of taking a pre-computed boolean (let functions derive from data, don't thread derived state) - sections.join("\n\n") replaces manual empty-string pushing - Shorter type names (Thresholds, Status, Result) — no ambiguity in a single-purpose script * refactor: strip coverage ratchet to failure-only output prek hides output from commands that exit 0, so ok/improved reporting was dead code. Remove Status, Result, classify, formatResult, formatReport, and the ratcheted-thresholds suggestion block. The script now just filters for regressions and prints actionable errors on failure. --------- Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com> Co-authored-by: Brandon Pelfrey <bpelfrey@nvidia.com> * fix: use CONNECT tunnel for WebSocket endpoints in Discord/Slack presets (NVIDIA#438) * fix: use CONNECT tunnel for WebSocket endpoints in Discord/Slack presets The egress proxy's HTTP idle timeout (~2 min) kills long-lived WebSocket connections when endpoints are configured with protocol:rest + tls:terminate. Switch WebSocket endpoints to access:full (CONNECT tunnel) which bypasses HTTP-level timeouts entirely. Discord: - gateway.discord.gg → access:full (WebSocket gateway) - Add PUT/PATCH/DELETE methods for discord.com (message editing, reactions) - Add media.discordapp.net for attachment access Slack: - Add wss-primary.slack.com and wss-backup.slack.com → access:full (Socket Mode WebSocket endpoints) Partially addresses NVIDIA#409 — the policy-level fix enables WebSocket connections to survive. The hardcoded 2-min timeout in openshell-sandbox still affects any protocol:rest endpoints with long-lived connections. Related: NVIDIA#361 (WhatsApp Web, same root cause) * fix: correct comment wording for media endpoint and YAML formatting * fix: standardize Node.js minimum version to 22.16 (NVIDIA#840) * fix: remove unused RECOMMENDED_NODE_MAJOR from scripts/install.sh Shellcheck flagged it as unused after the min/recommended merge. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: enforce full semver >=22.16.0 in installer scripts The runtime checks only compared the major Node.js version, allowing 22.0–22.15 to pass despite package.json requiring >=22.16.0. Use the version_gte() helper for full semver comparison in both installers. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: harden version_gte and align fallback message Guard version_gte() against prerelease suffixes (e.g. "22.16.0-rc.1") that would crash bash arithmetic. Also update the manual-install fallback message to reference MIN_NODE_VERSION instead of hardcoded "22". Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: update test stubs for Node.js 22.16 minimum and add Node 20 rejection test - Bump node stub in 'succeeds with acceptable Node.js' from v20.0.0 to v22.16.0 - Bump node stub in buildCurlPipeEnv from v22.14.0 to v22.16.0 - Add new test asserting Node.js 20 is rejected by ensure_supported_runtime --------- Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: harden installer and onboard resiliency (NVIDIA#961) * fix: harden installer and onboard resiliency * fix: address installer and debug review follow-ups * fix: harden onboard resume across later setup steps * test: simplify payload extraction in onboard tests * test: keep onboard payload extraction target-compatible * chore: align onboard session lint with complexity rule * fix: harden onboard session safety and lock handling * fix: tighten onboard session redaction and metadata handling * fix(security): strip credentials from migration snapshots and enforce blueprint digest (NVIDIA#769) Reconciles NVIDIA#156 and NVIDIA#743 into a single comprehensive solution: - Filter auth-profiles.json at copy time via cpSync filter (from NVIDIA#743) - Recursive stripCredentials() with pattern-based field detection for deep config sanitization (from NVIDIA#156: CREDENTIAL_FIELDS set + CREDENTIAL_FIELD_PATTERN regex) - Remove gateway config section (contains auth tokens) from sandbox openclaw.json - Blueprint digest verification (SHA-256): recorded at snapshot time, validated on restore, empty/missing digest is a hard failure - computeFileDigest() throws when blueprint file is missing instead of silently returning null - Sanitize both snapshot-level and sandbox-bundle openclaw.json copies - Backward compatible: old snapshots without blueprintDigest skip validation - Bump SNAPSHOT_VERSION 2 → 3 Supersedes NVIDIA#156 and NVIDIA#743. * fix(sandbox): export proxy env vars with full NO_PROXY and persist across reconnects (NVIDIA#1025) * fix(sandbox): export proxy env vars with full NO_PROXY and persist across reconnects OpenShell injects NO_PROXY=127.0.0.1,localhost,::1 into the sandbox, missing inference.local and the gateway IP (10.200.0.1). This causes LLM inference requests to route through the egress proxy instead of going direct, and the proxy gateway IP itself gets proxied. Add proxy configuration block to nemoclaw-start.sh that: - Exports HTTP_PROXY, HTTPS_PROXY, and NO_PROXY with inference.local and the gateway IP included - Persists via /etc/profile.d/nemoclaw-proxy.sh (root) or ~/.profile (non-root fallback) so values survive OpenShell reconnect injection - Supports NEMOCLAW_PROXY_HOST / NEMOCLAW_PROXY_PORT overrides The non-root fallback ensures the fix works in environments like Brev where containers run without root privileges. Tested on DGX Spark (ARM64) and Brev VM (x86_64). Verified NO_PROXY contains inference.local and 10.200.0.1 inside the live sandbox after connect. Ref: NVIDIA#626, NVIDIA#704 Ref: NVIDIA#704 (comment) * fix(sandbox): write proxy config to ~/.bashrc for interactive reconnect sessions OpenShell's `sandbox connect` spawns `/bin/bash -i` (interactive, non-login), which sources ~/.bashrc — not ~/.profile or /etc/profile.d/*. The previous approach wrote to ~/.profile and /etc/profile.d/, neither of which is sourced by `bash -i`, so the narrow OpenShell-injected NO_PROXY persisted in live interactive sessions. Changes: - Write proxy snippet to ~/.bashrc (primary) and ~/.profile (login fallback) - Export both uppercase and lowercase proxy variants (NO_PROXY + no_proxy, HTTP_PROXY + http_proxy, etc.) — Node.js undici prefers lowercase no_proxy over uppercase NO_PROXY when both are set - Add idempotency guard to prevent duplicate blocks on container restart - Update tests: verify .bashrc writing, idempotency, bash -i override behavior, and lowercase variant correctness Tested on DGX Spark (ARM64) and Brev VM (x86_64) with full destroy + re-onboard + live `env | grep proxy` verification inside the sandbox shell via `openshell sandbox connect`. Ref: NVIDIA#626 * fix(sandbox): replace stale proxy values on restart with begin/end markers Use begin/end markers in .bashrc/.profile proxy snippet so _write_proxy_snippet replaces the block when PROXY_HOST/PORT change instead of silently keeping stale values. Adds test coverage for the replacement path. Addresses CodeRabbit review feedback on idempotency gap. * fix(sandbox): resolve sandbox user home dynamically when running as root When the entrypoint runs as root, $HOME is /root — the proxy snippet was written to /root/.bashrc instead of the sandbox user's home. Use getent passwd to look up the sandbox user's home when running as UID 0; fall back to /sandbox if the user entry is missing. Addresses CodeRabbit review feedback on _SANDBOX_HOME resolution. --------- Co-authored-by: Carlos Villela <cvillela@nvidia.com> * fix(policies): preset application for versionless policies (Fixes NVIDIA#35) (NVIDIA#101) * fix(policies): allow preset application for versionless policies (Fixes NVIDIA#35) Fixes NVIDIA#35 Signed-off-by: Deepak Jain <deepujain@gmail.com> * fix: remove stale complexity suppression in policies --------- Signed-off-by: Deepak Jain <deepujain@gmail.com> Co-authored-by: Kevin Jones <kejones@nvidia.com> * fix: restore routed inference and connect UX (NVIDIA#1037) * fix: restore routed inference and connect UX * fix: simplify detected local inference hint * fix: remove stale local inference hint * test: relax connect forward assertion --------- Signed-off-by: peteryuqin <peter.yuqin@gmail.com> Signed-off-by: Deepak Jain <deepujain@gmail.com> Co-authored-by: KJ <kejones@nvidia.com> Co-authored-by: Emily Wilkins <80470879+epwilkins@users.noreply.github.com> Co-authored-by: Carlos Villela <cvillela@nvidia.com> Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com> Co-authored-by: Peter <peter.yuqin@gmail.com> Co-authored-by: Brandon Pelfrey <bpelfrey@nvidia.com> Co-authored-by: Benedikt Schackenberg <6381261+BenediktSchackenberg@users.noreply.github.com> Co-authored-by: Lucas Wang <lucas_wang@lucas-futures.com> Co-authored-by: senthilr-nv <senthilr@nvidia.com> Co-authored-by: Deepak Jain <deepujain@users.noreply.github.com> Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
- inference-config.test.js: use INFERENCE_ROUTE_URL for ollama-local (PR NVIDIA#1037 fixed inference.local routing, host-gateway bypass removed) - local-inference.test.js: getOllamaModelOptions no longer takes gpu param; Jetson fallback moved to getBootstrapOllamaModelOptions Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- inference-config.test.js: use INFERENCE_ROUTE_URL for ollama-local (PR NVIDIA#1037 fixed inference.local routing, host-gateway bypass removed) - local-inference.test.js: getOllamaModelOptions no longer takes gpu param; Jetson fallback moved to getBootstrapOllamaModelOptions Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- inference-config.test.js: use INFERENCE_ROUTE_URL for ollama-local (PR NVIDIA#1037 fixed inference.local routing, host-gateway bypass removed) - local-inference.test.js: getOllamaModelOptions no longer takes gpu param; Jetson fallback moved to getBootstrapOllamaModelOptions Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- inference-config.test.js: use INFERENCE_ROUTE_URL for ollama-local (PR NVIDIA#1037 fixed inference.local routing, host-gateway bypass removed) - local-inference.test.js: getOllamaModelOptions no longer takes gpu param; Jetson fallback moved to getBootstrapOllamaModelOptions Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix: restore routed inference and connect UX * fix: simplify detected local inference hint * fix: remove stale local inference hint * test: relax connect forward assertion
- inference-config.test.js: use INFERENCE_ROUTE_URL for ollama-local (PR NVIDIA#1037 fixed inference.local routing, host-gateway bypass removed) - local-inference.test.js: getOllamaModelOptions no longer takes gpu param; Jetson fallback moved to getBootstrapOllamaModelOptions Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- inference-config.test.js: use INFERENCE_ROUTE_URL for ollama-local (PR NVIDIA#1037 fixed inference.local routing, host-gateway bypass removed) - local-inference.test.js: getOllamaModelOptions no longer takes gpu param; Jetson fallback moved to getBootstrapOllamaModelOptions Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix: restore routed inference and connect UX * fix: simplify detected local inference hint * fix: remove stale local inference hint * test: relax connect forward assertion
* fix: restore routed inference and connect UX * fix: simplify detected local inference hint * fix: remove stale local inference hint * test: relax connect forward assertion
* fix: restore routed inference and connect UX * fix: simplify detected local inference hint * fix: remove stale local inference hint * test: relax connect forward assertion
Summary
inference.localfrom the sandboxNO_PROXYoverride so routed inference still goes through the OpenShell proxy pathnemoclaw <sandbox> connectRoot Cause
A plain OpenShell sandbox on
openshell 0.0.16could resolvehttps://inference.local/v1/models, but a NemoClaw runtime sandbox could not. The difference was NemoClaw's runtime env override inscripts/nemoclaw-start.sh, which addedinference.localtoNO_PROXYand forced direct DNS resolution inside the sandbox. That bypassed the OpenShell proxy path that actually handlesinference.local.Separately,
nemoclaw <sandbox> connectwas callingopenshell forward start --background 18789 ...beforeopenshell sandbox connect, which produced noisy duplicate-forward errors on already-connected sandboxes.Validation
Real-world repro
curl -sS https://inference.local/v1/modelsworkedNO_PROXY=...,inference.local,...:curl: (6) Could not resolve host: inference.localinference.localfromNO_PROXY, routed inference stays on the proxy path as intendedSummary by CodeRabbit