feat: multi-agent architecture with Hermes support #1618
…on architecture
Introduce a multi-agent architecture so NemoClaw can orchestrate different AI agents inside OpenShell sandboxes. This PoC adds Hermes Agent (Nous Research, v0.8.0) as a second supported agent alongside OpenClaw.
New files:
- agents/hermes/: full agent definition, including Dockerfiles, startup script, network policy, Python plugin, and a manifest declaring the integration contract (port 8642, Bearer token auth, custom provider for inference routing through inference.local)
- agents/openclaw/manifest.yaml: documents OpenClaw's integration contract; artifacts remain at root for backward compatibility
- bin/lib/agent-defs.js: agent definition loader with listAgents(), loadAgent(), getAgentChoices(), and resolveAgentName(), so the onboard flow can offer agent selection via the --agent flag or the NEMOCLAW_AGENT env var
Signed-off-by: Aaron Erickson <aerickson@nvidia.com>
…handling
Three issues found during container testing:
1. Inline python3 -c config generation had SyntaxError — for/if blocks
don't work in semicolon-joined one-liners. Extracted to a proper
generate-config.py script.
2. `hermes gateway start` invokes systemd (not available in containers).
Changed to `hermes gateway run` (foreground mode).
3. Hermes writes state files (PID, state.db, .channel_directory) directly
into HERMES_HOME. Cannot point it at the immutable /sandbox/.hermes
dir. Solution: verify integrity of immutable config, then copy to
writable /sandbox/.hermes-data and set HERMES_HOME there.
Tested: container builds, gateway starts, health endpoint responds with
{"status": "ok", "platform": "hermes-agent"} inside the container.
Signed-off-by: Aaron Erickson <aerickson@nvidia.com>
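The first issue above follows from Python's grammar: a compound statement such as `for` or `if` cannot follow a semicolon on the same logical line. The sketch below reproduces that failure with the built-in `compile()`; the helper name `compiles` and both sample strings are illustrative and are not taken from the PR.

```python
def compiles(source: str) -> bool:
    """Return True if `source` parses as a Python program."""
    try:
        compile(source, "<inline>", "exec")
        return True
    except SyntaxError:
        return False


# What a semicolon-joined `python3 -c` argument looks like: the `for`
# statement appears after a `;`, which the grammar does not allow.
one_liner = "import json; for k in ['a', 'b']: print(k)"

# The same logic as a real script, with the loop on its own line.
script = "import json\nfor k in ['a', 'b']:\n    print(k)\n"

print(compiles(one_liner))  # the one-liner is rejected by the parser
print(compiles(script))     # the multi-line script parses
```

Moving the logic into a standalone script, as the commit does, sidesteps the restriction entirely.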
Hermes API server binds to 127.0.0.1 regardless of config (upstream bug in gateway/platforms/api_server.py: _host is set correctly but overridden at runtime).
Work around with socat:
- Hermes listens on internal port 18642 (127.0.0.1)
- socat forwards 0.0.0.0:8642 -> 127.0.0.1:18642
- OpenShell port forwarding sees 0.0.0.0:8642 as expected
Also fixes:
- Proxy snippet writes now tolerate EPERM after capsh drops cap_dac_override (root can no longer write sandbox-owned files)
- socat added to Dockerfile.base apt packages
- Cleanup handler kills both gateway and socat PIDs
Signed-off-by: Aaron Erickson <aerickson@nvidia.com>
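For readers unfamiliar with socat, the workaround amounts to a plain TCP relay. The sketch below shows the same idea with asyncio. It is only an illustration of the forwarding behavior, not code from this PR and not a replacement for socat; `start_relay` and `_pipe` are hypothetical names.

```python
import asyncio


async def _pipe(reader: asyncio.StreamReader, writer: asyncio.StreamWriter) -> None:
    """Copy bytes from reader to writer until EOF, then close the writer."""
    try:
        while True:
            data = await reader.read(65536)
            if not data:
                break
            writer.write(data)
            await writer.drain()
    except (ConnectionResetError, BrokenPipeError):
        pass
    finally:
        writer.close()


async def start_relay(listen_host: str, listen_port: int,
                      target_host: str, target_port: int) -> asyncio.AbstractServer:
    """Accept connections on listen_host:listen_port and relay them to the target."""

    async def handle(client_reader, client_writer):
        try:
            up_reader, up_writer = await asyncio.open_connection(target_host, target_port)
        except OSError:
            # Target is not listening (yet): drop the client connection.
            client_writer.close()
            return
        # Relay both directions until each side reaches EOF.
        await asyncio.gather(
            _pipe(client_reader, up_writer),
            _pipe(up_reader, client_writer),
        )

    return await asyncio.start_server(handle, listen_host, listen_port)
```

With the ports from the commit message, the equivalent call would be `await start_relay("0.0.0.0", 8642, "127.0.0.1", 18642)`.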
Add agent selection as the first step of the onboard wizard. Users can now choose between OpenClaw and Hermes (or future agents) via:
- Interactive numbered prompt during onboard
- --agent hermes flag
- NEMOCLAW_AGENT=hermes env var
Changes:
- onboard.js: new agent_selection step (step 1), renumber all steps (now 9 total), add setupAgent() and isAgentReady() dispatchers that route to agent-specific or OpenClaw-default behavior, use agent's Dockerfile and policy when creating sandbox
- nemoclaw.js: parse --agent flag, pass to onboard, update help text
- test/onboard.test.js: update assertions for new step numbering
The agent choice is persisted in onboard-session.json and respected on resume. OpenClaw remains the default when no agent is specified.
Note: SKIP=test-cli used because install-preflight.test.js and version.test.ts have pre-existing failures on main (not introduced by this change).
Signed-off-by: Aaron Erickson <aerickson@nvidia.com>
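The selection logic can be sketched as a small resolver that checks each source in turn and validates every candidate the same way. This is a Python illustration, not the actual `resolveAgentName()` from `agent-defs.js`. The precedence order (flag, then env var, then session) and the hard-coded `KNOWN_AGENTS` tuple are assumptions made for the example.

```python
KNOWN_AGENTS = ("openclaw", "hermes")  # assumption: stand-in for the discovered agent list
DEFAULT_AGENT = "openclaw"             # the source states OpenClaw is the default


def resolve_agent_name(flag=None, env=None, session_agent=None, known=KNOWN_AGENTS):
    """Pick the agent from the first source that provides a value.

    Every source is validated against `known`, so a typo in the env var
    or a stale session value fails loudly instead of silently falling
    back to the default.
    """
    candidates = (
        ("--agent", flag),
        ("NEMOCLAW_AGENT", env),
        ("session", session_agent),
    )
    for source, value in candidates:
        if not value:
            continue
        if value not in known:
            raise ValueError(
                f"unknown agent {value!r} from {source}; expected one of {', '.join(known)}"
            )
        return value
    return DEFAULT_AGENT
```

For example, `resolve_agent_name(flag="hermes", env="openclaw")` returns `"hermes"` under the assumed precedence, and `resolve_agent_name()` returns the default.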
When a non-default agent (e.g. Hermes) is selected, the onboard flow now automatically builds the agent's base Docker image if it doesn't exist locally. This means `nemoclaw onboard` with Hermes selected is fully self-contained — no manual docker build steps needed. Signed-off-by: Aaron Erickson <aerickson@nvidia.com>
…display names
User-facing strings like "OpenClaw will use openai-responses" and "The OpenClaw agent will be able to read that key" now use the selected agent's display name. When Hermes is selected, they read "Hermes Agent will use..." instead.
Signed-off-by: Aaron Erickson <aerickson@nvidia.com>
OpenShell rejects unknown top-level fields. The Hermes policy had filesystem_policy_additions which is not a valid policy field. Replaced with a complete standalone policy based on the OpenClaw template with Hermes-specific adjustments (.hermes paths, Nous Research endpoints, PyPI, hermes/python3 binary restrictions). Signed-off-by: Aaron Erickson <aerickson@nvidia.com>
The connect/status/recovery flows were hardcoded to OpenClaw (port 18789, openclaw gateway run, .openclaw/openclaw.json). Now they read the agent from the onboard session and use the agent's health probe URL, gateway command, display name, and config paths.
Fixes:
- isSandboxGatewayRunning: uses agent health probe (18642 for Hermes)
- recoverSandboxProcesses: uses agent gateway command and HERMES_HOME
- checkAndRecoverSandboxProcesses: agent-aware display strings
- status command: agent-aware process labels
- printDashboard: agent-aware port and token/API key instructions
- Extracted printDashboardUiSection to reduce complexity
Signed-off-by: Aaron Erickson <aerickson@nvidia.com>
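A health probe of this kind can be sketched as a single HTTP GET that checks for the `{"status": "ok", ...}` body reported earlier in this PR. The helper name `is_gateway_healthy` is hypothetical, and the probe URL is a parameter because the manifest's actual probe path is not shown here.

```python
import json
import urllib.error
import urllib.request


def is_gateway_healthy(url: str, timeout: float = 2.0) -> bool:
    """Return True if `url` answers 200 with a JSON object whose status is "ok"."""
    # Bypass any HTTP(S)_PROXY settings: the probe targets a local port.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    try:
        with opener.open(url, timeout=timeout) as resp:
            if resp.status != 200:
                return False
            body = json.load(resp)
    except (urllib.error.URLError, OSError, ValueError):
        # Connection refused, timeout, HTTP error, or a non-JSON body.
        return False
    return isinstance(body, dict) and body.get("status") == "ok"
```

A caller might poll something like `is_gateway_healthy("http://127.0.0.1:18642/health")` until it returns True; the `/health` path is an assumption.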
The dashboard URL was hardcoded to port 18789. Now buildControlUiUrls accepts an optional port parameter, and the onboard dashboard output passes the agent's forward port (8642 for Hermes, 18789 for OpenClaw). Signed-off-by: Aaron Erickson <aerickson@nvidia.com>
The sandbox is already network-isolated by OpenShell. Adding a self-generated API key to the Hermes API server creates friction for testing and port-forwarded access without adding real security. Removed API_SERVER_KEY generation; Hermes now allows all requests on its local endpoint. Signed-off-by: Aaron Erickson <aerickson@nvidia.com>
Removed things nobody asked for:
- API_SERVER_KEY generation and export (self-generated auth that blocked access with no way to authenticate)
- SOUL.md default personality prompt
- Hermes config overrides (max_turns, reasoning_effort, memory, skills, display settings); let Hermes use its own defaults
- Plugin tools (nemoclaw_status, nemoclaw_info) and startup banner
- Plugin reduced to no-op registration placeholder
Config now only sets what's required for OpenShell integration:
- model/provider/base_url (inference routing)
- api_server port (for socat forwarding)
- messaging platform tokens (if configured)
Signed-off-by: Aaron Erickson <aerickson@nvidia.com>
Pre-push TypeScript checks failed because:
- Session interface was missing the 'agent' field added by the onboard agent selection step
- resolveAgentName JSDoc param types didn't match the optional destructured defaults
Signed-off-by: Aaron Erickson <aerickson@nvidia.com>
📝 Walkthrough
Multi-agent support has been introduced to NemoClaw, enabling Hermes Agent (Nous Research) integration alongside the default OpenClaw agent. This includes Docker build infrastructure, manifest/plugin system, runtime configuration generation, sandbox policies, agent discovery modules, and updated CLI/onboarding flows to handle agent selection and setup.
Sequence Diagram(s)
sequenceDiagram
participant CLI as nemoclaw CLI
participant AgentDefs as agent-defs
participant AgentOnboard as agent-onboard
participant Docker as Docker Engine
participant ConfigGen as generate-config.ts
participant Registry as Registry
CLI->>AgentDefs: resolveAgentName(flag, env, session)
AgentDefs->>Registry: read session.agent
AgentDefs-->>CLI: resolved agent (hermes or openclaw)
CLI->>AgentOnboard: resolveAgent({agentFlag, session})
AgentOnboard->>AgentDefs: loadAgent(name)
AgentOnboard-->>CLI: agent object or null
alt agent is not null
CLI->>AgentOnboard: createAgentSandbox(agent)
AgentOnboard->>Docker: COPY Dockerfile into build context
AgentOnboard->>Docker: docker build (buildCtx/Dockerfile)
Docker->>ConfigGen: execute generate-config.ts at build time
ConfigGen-->>Docker: config.yaml, .env, .config-hash
Docker-->>AgentOnboard: built image
AgentOnboard-->>CLI: sandbox created
else agent is null (openclaw)
CLI->>Docker: use standard stageOptimizedSandboxBuildContext
Docker-->>CLI: sandbox created
end
CLI->>Registry: registerSandbox({agent: agent.name})
Registry-->>CLI: sandbox registered
sequenceDiagram
participant Container as Hermes Container
participant StartSh as start.sh
participant ConfigHash as config-hash verification
participant SymlinkValidation as symlink validation
participant DecodeProxy as decode-proxy.py
participant Hermes as hermes gateway
participant Socat as socat forwarder
Container->>StartSh: exec /usr/local/bin/nemoclaw-start
StartSh->>StartSh: set -euo pipefail, ulimit, capsh
StartSh->>ConfigHash: verify SHA256(/sandbox/.hermes/.config-hash)
ConfigHash-->>StartSh: ✓ hash matches or ✗ mismatch
StartSh->>StartSh: deploy config.yaml/.env to writable /sandbox/.hermes-data
StartSh->>SymlinkValidation: validate /sandbox/.hermes/* → /sandbox/.hermes-data/*
SymlinkValidation-->>StartSh: ✓ or optionally apply chattr +i
StartSh->>DecodeProxy: start background decode-proxy on 127.0.0.1:3129
DecodeProxy-->>StartSh: listening, ready
StartSh->>Hermes: exec hermes gateway run (bound to 127.0.0.1:18642)
Hermes-->>StartSh: gateway running
StartSh->>Socat: start socat TCP relay 0.0.0.0:8642 → 127.0.0.1:18642
Socat-->>StartSh: relay active
StartSh->>StartSh: print dashboard URLs, wait on gateway PID
StartSh->>StartSh: trap SIGTERM/SIGINT → forward to gateway/socat/decode-proxy
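The config-hash step in the startup sequence can be illustrated with a short SHA-256 comparison. This is a sketch only: the real check lives in `start.sh`, and the layout of `.config-hash` is not shown in this PR. The code assumes the file begins with a hex digest (a `sha256sum`-style `<hex>  <name>` line also parses). The function names are hypothetical.

```python
import hashlib
import hmac
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Hex SHA-256 digest of a file's contents."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def config_matches(config_path: Path, hash_path: Path) -> bool:
    """Compare the recorded digest with the digest of the config file.

    Assumption: the first whitespace-separated token of the hash file is
    the expected hex digest.
    """
    tokens = hash_path.read_text().split()
    recorded = tokens[0].lower() if tokens else ""
    actual = sha256_of(config_path)
    # Constant-time comparison; inputs are encoded so any text is accepted.
    return hmac.compare_digest(recorded.encode(), actual.encode())
```

A startup script would refuse to continue when `config_matches(...)` is False, and only then copy the verified files into the writable directory.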
Estimated Code Review Effort
🎯 4 (Complex) | ⏱️ ~60 minutes
🚥 Pre-merge checks | ✅ 2 | ❌ 1
❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
…utput
Leftover from the removed API key feature. For Hermes there is no token or key, so just show the URL.
Signed-off-by: Aaron Erickson <aerickson@nvidia.com>
The agent field was set in updateSession but normalizeSession (called on every save) didn't include it in the createSession passthrough, so the field was silently stripped. This caused getSessionAgent() to always fall back to openclaw. Signed-off-by: Aaron Erickson <aerickson@nvidia.com>
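This class of bug is easy to reproduce with any normalizer that rebuilds a record from a fixed field list. The sketch below is a Python analogy, not the TypeScript in `onboard-session.ts`; the field names and helper names are illustrative.

```python
FIELDS_BEFORE_FIX = ("session_id", "status", "sandbox_name")
FIELDS_AFTER_FIX = FIELDS_BEFORE_FIX + ("agent",)


def normalize_session(raw: dict, fields: tuple) -> dict:
    """Rebuild the session from a whitelist; anything not listed is dropped."""
    return {name: raw.get(name) for name in fields}


def get_session_agent(session: dict) -> str:
    """Fall back to the default agent when the field is missing or empty."""
    return session.get("agent") or "openclaw"


raw = {"session_id": "abc", "status": "in_progress",
       "sandbox_name": "demo", "agent": "hermes"}

stripped = normalize_session(raw, FIELDS_BEFORE_FIX)  # "agent" silently lost
kept = normalize_session(raw, FIELDS_AFTER_FIX)       # "agent" survives the save
```

Because the normalizer runs on every save, the value set by the update call never reaches disk until the whitelist includes it.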
Hermes's Telegram adapter requires the python-telegram-bot package.
The bot token flows through the URL path (/bot{TOKEN}/method) which
OpenShell's L7 proxy rewrites at egress, same as OpenClaw — so the
placeholder pattern in .env works without injecting real tokens.
Signed-off-by: Aaron Erickson <aerickson@nvidia.com>
Python HTTP clients (httpx) URL-encode colons in URL paths, turning openshell:resolve:env:TOKEN into openshell%3Aresolve%3Aenv%3ATOKEN. OpenShell's L7 proxy doesn't recognize the encoded form and returns 403 Forbidden. Solution: a tiny async Python proxy inside the sandbox that URL-decodes request paths before forwarding to the OpenShell proxy. The Hermes gateway process routes through this decode proxy so placeholder tokens in Telegram bot URLs are restored before reaching the L7 proxy. No real tokens are exposed inside the sandbox — the decode proxy only decodes URL-encoding, it doesn't resolve the placeholders. Signed-off-by: Aaron Erickson <aerickson@nvidia.com>
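The encoding problem and a narrow fix can be sketched with the standard library. The snippet below is not the PR's `decode-proxy.py`. It shows how percent-encoding turns the colons into `%3A`, and a hypothetical helper, `restore_placeholders`, that rewrites only the placeholder prefix inside the request target and leaves every other escape (such as `%20`) untouched.

```python
import re
from urllib.parse import quote

# Percent-encoding a path segment escapes ":" as "%3A".
encoded = quote("openshell:resolve:env:TOKEN")

# Encoded placeholder prefix; hex digits may arrive in either case.
_ENCODED_PREFIX = re.compile(rb"openshell%3Aresolve%3Aenv%3A", re.IGNORECASE)


def restore_placeholders(request_line: bytes) -> bytes:
    """Restore encoded placeholder prefixes in the request target only.

    The line is split into method, target, and version; the method and
    version are passed through unchanged, and so are all other
    percent-escapes in the target.
    """
    parts = request_line.rstrip(b"\r\n").split(b" ", 2)
    if len(parts) != 3:
        return request_line  # not a well-formed request line; leave it alone
    method, target, version = parts
    target = _ENCODED_PREFIX.sub(b"openshell:resolve:env:", target)
    return b" ".join((method, target, version)) + b"\r\n"
```

As in the commit description, nothing here resolves the placeholder; the token name passes through as-is for the outer proxy to substitute.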
Hermes's TelegramFallbackTransport rewrites api.telegram.org to raw IPs (149.154.167.220), which OpenShell's L7 proxy rejects because network policy is hostname-based. Without the fallback transport, python-telegram-bot uses its default httpx transport which respects HTTPS_PROXY and routes through the proxy correctly. Applied as a build-time patch to the installed Hermes package. Signed-off-by: Aaron Erickson <aerickson@nvidia.com>
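The rejection follows from how a hostname-based allowlist treats IP literals. The toy model below illustrates that behavior. It is not OpenShell's policy engine, and `ALLOWED_HOSTS` and the function names are made up for the example.

```python
import ipaddress

ALLOWED_HOSTS = {"api.telegram.org"}  # hypothetical allowlist for illustration


def is_ip_literal(host: str) -> bool:
    """True if `host` is an IPv4 or IPv6 address rather than a name."""
    try:
        ipaddress.ip_address(host)
        return True
    except ValueError:
        return False


def egress_allowed(host: str, allowed=ALLOWED_HOSTS) -> bool:
    """A hostname-based policy can only match names; raw IPs never match."""
    if is_ip_literal(host):
        return False
    return host.lower() in allowed
```

Under this model `egress_allowed("api.telegram.org")` passes while `egress_allowed("149.154.167.220")` does not, even though both may reach the same server.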
…urce
The regex-based patch didn't match the actual Hermes source code (extra logger lines between the patterns). Switched to simple string.replace() which matches the exact lines.
Signed-off-by: Aaron Erickson <aerickson@nvidia.com>
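One way to keep an exact-string patch from failing silently is to require exactly one match before replacing. The sketch below shows that pattern. The helper name is hypothetical and `SAMPLE` is a made-up stand-in, not the actual Hermes source.

```python
def apply_exact_patch(source: str, old: str, new: str) -> str:
    """Replace `old` with `new`, failing loudly unless it occurs exactly once."""
    occurrences = source.count(old)
    if occurrences != 1:
        raise ValueError(f"expected exactly one match, found {occurrences}")
    return source.replace(old, new, 1)


# Made-up stand-in for a file being patched at build time.
SAMPLE = (
    "transport = FallbackTransport()\n"
    "logger.info('using fallback transport')\n"
    "client = Client(transport=transport)\n"
)

patched = apply_exact_patch(
    SAMPLE,
    "transport = FallbackTransport()\n",
    "transport = None\n",
)
```

If an upstream release changes the target line, the build fails at the patch step rather than shipping an unpatched package.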
The patch writes to /usr/local/lib which is root-owned. Must run before USER sandbox switches to the sandbox user. Signed-off-by: Aaron Erickson <aerickson@nvidia.com>
Signed-off-by: Aaron Erickson <aerickson@nvidia.com>
Signed-off-by: Aaron Erickson <aerickson@nvidia.com>
Signed-off-by: Aaron Erickson <aerickson@nvidia.com>
OpenShell's proxy resolves the calling binary via /proc/pid/exe which returns /usr/bin/python3.11, not the /usr/bin/python3 symlink. The policy binary paths must match the resolved path exactly. Signed-off-by: Aaron Erickson <aerickson@nvidia.com>
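The mismatch can be checked from Python: `os.path.realpath()` follows symlinks in the same way that `/proc/<pid>/exe` reports the resolved binary. The sketch below uses a temporary directory so it runs anywhere symlinks are supported; the file names mirror the ones in the commit message, and the helper name is illustrative.

```python
import os
import tempfile


def resolved_binary(path: str) -> str:
    """Return the canonical path of `path` with all symlinks followed."""
    return os.path.realpath(path)


# Recreate the layout described above: python3 -> python3.11
workdir = tempfile.mkdtemp()
real = os.path.join(workdir, "python3.11")
link = os.path.join(workdir, "python3")
with open(real, "w") as fh:
    fh.write("")
os.symlink(real, link)

# The symlink and its target resolve to the same canonical file, which
# is why a policy keyed on the symlink path never matches.
same_target = resolved_binary(link) == resolved_binary(real)
```

On a real system, `resolved_binary(sys.executable)` gives the path that a policy matching on the resolved binary would need to list.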
5a30f54 to 5684681
Actionable comments posted: 10
🧹 Nitpick comments (4)
src/lib/onboard-session.ts (1)
503-538: Consider including `agent` in the debug summary.
The `summarizeForDebug()` function includes most session fields but omits the newly added `agent` field. This could make debugging agent-related issues harder.
♻️ Suggested addition
 return {
   version: session.version,
   sessionId: session.sessionId,
   status: session.status,
   resumable: session.resumable,
   mode: session.mode,
   startedAt: session.startedAt,
   updatedAt: session.updatedAt,
+  agent: session.agent,
   sandboxName: session.sandboxName,
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/lib/onboard-session.ts` around lines 503 - 538, The debug summary produced by summarizeForDebug omits the session.agent field; update summarizeForDebug to include an "agent" entry in the returned object (e.g., agent: session.agent) so agent info is present in logs and the steps mapping remains unchanged; locate the summarizeForDebug function and add the agent property to the top-level returned record alongside version/sessionId/status/etc.
agents/hermes/generate-config.py (1)
64-67: Consider simplifying with `dict.get()`.
The static analysis tool flagged that the key check before dictionary access can be simplified.
♻️ Suggested simplification
- if ch in allowed_ids and allowed_ids[ch]:
-     p_cfg["allowed_users"] = ",".join(
-         str(uid) for uid in allowed_ids[ch]
-     )
+ ch_ids = allowed_ids.get(ch)
+ if ch_ids:
+     p_cfg["allowed_users"] = ",".join(str(uid) for uid in ch_ids)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@agents/hermes/generate-config.py` around lines 64 - 67, The code currently checks "if ch in allowed_ids and allowed_ids[ch]" before setting p_cfg["allowed_users"]; simplify by using allowed_ids.get(ch) to retrieve the value and check its truthiness in one step (e.g., value = allowed_ids.get(ch); if value: set p_cfg["allowed_users"] = ",".join(str(uid) for uid in value)). Update references to allowed_ids[ch] in this block to use the single get() result to avoid double lookup and improve readability.
agents/hermes/plugin/__init__.py (1)
11-13: Prefix unused parameter with underscore.
Per coding guidelines, unused variables should be prefixed with an underscore.
♻️ Suggested fix
-def register(ctx):
+def register(_ctx):
     """No-op registration. Hermes requires this function to exist."""
     pass
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@agents/hermes/plugin/__init__.py` around lines 11 - 13, The register function declares an unused parameter ctx which violates the unused-parameter guideline; rename the parameter to _ctx (or prefix it with an underscore) in the register(ctx) signature so the parameter remains present for the plugin API but is marked as intentionally unused, e.g., change register(ctx) to register(_ctx) and keep the existing docstring and pass body intact.
agents/hermes/Dockerfile.base (1)
96-100: Pin `python-telegram-bot` to the version you validated.
Everything else in this base image is locked down for reproducibility, but `python-telegram-bot>=21.0` means rebuilds can silently pick up a different release than the one Hermes 0.8.0 and the Telegram patch layer were tested against.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@agents/hermes/Dockerfile.base` around lines 96 - 100, The Dockerfile currently allows python-telegram-bot to float ("python-telegram-bot>=21.0"); update that package spec to the exact version you validated (replace "python-telegram-bot>=21.0" with "python-telegram-bot==<VALIDATED_VERSION>") so the base image remains reproducible; make the change in the RUN pip3 install line that references HERMES_VERSION and runs hermes --version.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@agents/hermes/decode-proxy.py`:
- Around line 52-67: The finally block currently only closes the client writer
and leaks the upstream writer; change teardown in the function handling the
CONNECT relay to close both writers (up_writer and writer) and await
wait_closed() where supported (e.g., await up_writer.wait_closed() and await
writer.wait_closed()) after calling close(); ensure you guard for undefined/None
writers (check if 'up_writer'/'writer' exist and are not already closed) and
handle exceptions from wait_closed; apply the same fix to the analogous teardown
in the code around the other relay block (_relay/asyncio.gather usage at lines
~70-80).
- Around line 40-42: The code currently unquotes the entire request line
(decoded_line = unquote(first_line.decode(...)).encode(...)), which mutates
every percent-encoded octet; instead decode first_line to a str, split into
components (method, request_target, version) by spaces, then only
unquote/rewrite the request_target portion or only the specific placeholder
tokens (e.g. "openshell%3Aresolve" or other placeholder patterns) within
request_target using targeted unquote or re.sub on matches, reassemble as
"method + ' ' + request_target + ' ' + version" and encode back to bytes into
decoded_line; operate on the variables first_line and decoded_line and ensure
only the second token (request_target) or the matched placeholder tokens are
modified.
In `@agents/hermes/manifest.yaml`:
- Around line 44-58: The manifest's state_dirs list is missing the "pairing"
entry that the runtime creates; update the state_dirs block (the state_dirs YAML
array) to include "pairing" alongside memories, sessions, skills, plugins, cron,
logs, skins, plans, workspace, profiles, and cache so the manifest matches the
created symlinked directories and validation/management logic can see the
pairing directory.
In `@agents/hermes/start.sh`:
- Around line 329-330: The chmod line in the start.sh snippet ("[ -f .env ] &&
chmod 600 .env") operates on the wrong path and should be removed or corrected;
remove this dead code or replace it with checks that target the actual env files
(${HERMES_IMMUTABLE}/.env and ${HERMES_WRITABLE}/.env) and apply chmod 600 to
those paths (e.g., test each file with [ -f ] and chmod the correct variable
path) so the script no longer operates on the current working directory .env.
In `@bin/lib/agent-defs.js`:
- Around line 210-218: The env var (envAgent) and session.agent are not being
validated the same way as the CLI --agent (agentFlag), allowing typos like
"hermse" to silently fall back; update the selection logic to validate envAgent
and session.agent against the canonical list (use listAgents() or the same
validation routine used for agentFlag) before returning them, and if invalid
behave the same as the agentFlag path (e.g., log/throw an error or reject the
value) so only known agent names are accepted; reference envAgent,
session.agent, listAgents(), and agentFlag when making the change.
In `@bin/lib/onboard.js`:
- Around line 2067-2070: The code seeds CHAT_UI_URL using the legacy
CONTROL_UI_PORT constant causing a mismatch with the new getControlUiPort() and
the forward manager; update the sandbox creation to derive chatUiUrl from
getControlUiPort() (or its exported helper) instead of CONTROL_UI_PORT and make
the forward manager start/stop use the same getControlUiPort() value so
Hermes-advertised port (e.g., 8642) and the generated config/forwarding are
consistent; check the sandbox step (ONBOARD_STEP_INDEX.sandbox.number /
promptValidatedSandboxName() usage) and the related forward-start/stop block
referenced around the other region (lines ~3907-3926) and replace
CONTROL_UI_PORT with getControlUiPort() everywhere.
- Around line 3489-3496: isAgentReady currently gates resume for non-OpenClaw
agents by checking pod readiness via runCaptureOpenshell(["sandbox", "list"])
and isSandboxReady, which causes --resume to proceed before the Hermes gateway
actually binds; change isAgentReady so non-OpenClaw branches use the sandbox
manifest health probe instead of the pod-ready check: replace the
runCaptureOpenshell + isSandboxReady call with a call to a function that
evaluates the sandbox manifest health probe (e.g., isSandboxHealthyByManifest or
reuse an existing manifest-probe helper), making sure that function reads the
sandbox manifest health probe and returns true only when the probe indicates the
agent is ready; keep the OpenClaw path using isOpenclawReady and update
references to runCaptureOpenshell and isSandboxReady accordingly.
In `@bin/nemoclaw.js`:
- Around line 804-812: The CLI currently accepts any value after --agent and
forwards invalid names into runOnboard/onboarding causing a later exception;
update the parsing around agentIdx/agentFlag to validate agentFlag against the
allowed agent list before mutating args or calling runOnboard, printing a clear
error and the valid agents/usage and exiting when the value is not in the list;
apply the same validation to the duplicate block around lines where agent
parsing repeats (the 833-840 block) so both locations check agentFlag against
the canonical allowedAgents array (or function) and reject unknown names
immediately.
- Around line 90-97: The getSessionAgent helper currently reads the agent from
onboardSession.loadSession() (onboard-session.json) which is global; change it
to resolve the agent from the sandbox metadata for the given sandboxName instead
of the last onboard session. Update getSessionAgent to accept (or otherwise
obtain) sandboxName, look up the sandbox record (e.g., via the sandbox
store/registry used elsewhere), read the persisted agent field on that sandbox
record, and call loadAgent with that value (fall back to "openclaw" only if the
sandbox record or agent field is missing). Ensure any code that calls
getSessionAgent passes the appropriate sandboxName so status/connect/recovery
operate on the correct sandbox.
---
Nitpick comments:
In `@agents/hermes/Dockerfile.base`:
- Around line 96-100: The Dockerfile currently allows python-telegram-bot to
float ("python-telegram-bot>=21.0"); update that package spec to the exact
version you validated (replace "python-telegram-bot>=21.0" with
"python-telegram-bot==<VALIDATED_VERSION>") so the base image remains
reproducible; make the change in the RUN pip3 install line that references
HERMES_VERSION and runs hermes --version.
In `@agents/hermes/generate-config.py`:
- Around line 64-67: The code currently checks "if ch in allowed_ids and
allowed_ids[ch]" before setting p_cfg["allowed_users"]; simplify by using
allowed_ids.get(ch) to retrieve the value and check its truthiness in one step
(e.g., value = allowed_ids.get(ch); if value: set p_cfg["allowed_users"] =
",".join(str(uid) for uid in value)). Update references to allowed_ids[ch] in
this block to use the single get() result to avoid double lookup and improve
readability.
In `@agents/hermes/plugin/__init__.py`:
- Around line 11-13: The register function declares an unused parameter ctx
which violates the unused-parameter guideline; rename the parameter to _ctx (or
prefix it with an underscore) in the register(ctx) signature so the parameter
remains present for the plugin API but is marked as intentionally unused, e.g.,
change register(ctx) to register(_ctx) and keep the existing docstring and pass
body intact.
In `@src/lib/onboard-session.ts`:
- Around line 503-538: The debug summary produced by summarizeForDebug omits the
session.agent field; update summarizeForDebug to include an "agent" entry in the
returned object (e.g., agent: session.agent) so agent info is present in logs
and the steps mapping remains unchanged; locate the summarizeForDebug function
and add the agent property to the top-level returned record alongside
version/sessionId/status/etc.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: cfc4717b-9f49-42eb-805c-6c58f590523a
📒 Files selected for processing (18)
agents/hermes/Dockerfile
agents/hermes/Dockerfile.base
agents/hermes/decode-proxy.py
agents/hermes/generate-config.py
agents/hermes/manifest.yaml
agents/hermes/plugin/__init__.py
agents/hermes/plugin/plugin.yaml
agents/hermes/policy-additions.yaml
agents/hermes/start.sh
agents/openclaw/manifest.yaml
bin/lib/agent-defs.js
bin/lib/onboard.js
bin/nemoclaw.js
src/lib/dashboard.ts
src/lib/onboard-session.ts
src/lib/web-search.test.ts
src/lib/web-search.ts
test/onboard.test.js
Add test-hermes-e2e.sh that validates the full Hermes user journey: install → onboard --agent hermes → health probe → live inference → cleanup. Covers agent session persistence, config immutability, process verification, and OpenClaw regression check. Add hermes-e2e job to nightly-e2e.yaml workflow with 60-minute timeout and failure notification. Ref: PR #1618 Signed-off-by: Aaron Erickson <aerickson@nvidia.com>
…sandbox mode
Add a flag that applies a maximally open policy to the OpenShell sandbox, analogous to Claude Code's --dangerously-skip-permissions. When set:
- Onboard uses openclaw-sandbox-permissive.yaml as the base policy (include_workdir: true, all known endpoints with access: full)
- Policy preset selection is skipped entirely
- Connect applies the permissive policy to running sandboxes on the fly
- Status displays the permissive mode indicator
- Also accepts NEMOCLAW_DANGEROUSLY_SKIP_PERMISSIONS=1 env var
Signed-off-by: Aaron Erickson <aerickson@nvidia.com>
…/NemoClaw into feat/hermes-agent-support
Actionable comments posted: 3
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
src/lib/registry.ts (1)
150-162: ⚠️ Potential issue | 🟠 Major
Persist `dangerouslySkipPermissions` in the registry.
`createSandbox()` now passes `dangerouslySkipPermissions`, but `registerSandbox()` only writes `agent`. Sandboxes created with `--dangerously-skip-permissions` therefore lose that metadata on the first save, so later status/reconnect flows can no longer tell they were created with an open policy.
Suggested fix
 data.sandboxes[entry.name] = {
   name: entry.name,
   createdAt: entry.createdAt || new Date().toISOString(),
   model: entry.model || null,
   nimContainer: entry.nimContainer || null,
   provider: entry.provider || null,
   gpuEnabled: entry.gpuEnabled || false,
   policies: entry.policies || [],
   agent: entry.agent || null,
+  ...(entry.dangerouslySkipPermissions === true
+    ? { dangerouslySkipPermissions: true }
+    : {}),
 };
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/lib/registry.ts` around lines 150 - 162, The registry currently omits the dangerouslySkipPermissions flag when persisting new sandboxes in registerSandbox, causing sandboxes created with --dangerously-skip-permissions to lose that metadata; update the object written in registerSandbox (inside withLock) to copy entry.dangerouslySkipPermissions (defaulting to false if undefined) into the stored sandbox record so the persisted entry includes dangerouslySkipPermissions alongside name, model, provider, agent, etc.; ensure you also adjust any related type/interface (SandboxEntry or stored sandbox type) if necessary so the field is allowed.
src/nemoclaw.ts (1)
329-342: ⚠️ Potential issue | 🟠 Major
Session-based registry recovery still drops `agent`.
`buildRecoveredSandboxEntry()` can persist `metadata.agent`, but the session-seeding path below still rebuilds recovery metadata without `session.agent`. After a registry rebuild, a Hermes sandbox recovered from session is re-registered as `agent: null`, so later health probes and recovery commands fall back to OpenClaw.
Suggested follow-up change outside this hunk
 metadataByName.set(
   session.sandboxName,
   buildRecoveredSandboxEntry(session.sandboxName, {
     model: session.model || null,
     provider: session.provider || null,
     nimContainer: session.nimContainer || null,
     policyPresets: session.policyPresets || null,
+    agent: session.agent || null,
   }),
 );
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/nemoclaw.ts` around lines 329 - 342, buildRecoveredSandboxEntry currently preserves metadata.agent but the session-based registry rebuild path recreates recovery metadata without including session.agent, causing recovered Hermes sandboxes to be registered with agent: null; update the session-seeding/rebuild logic to copy session.agent into the metadata used to construct recovered entries (i.e., ensure the session object’s agent property is passed through to the metadata argument before calling buildRecoveredSandboxEntry) so that buildRecoveredSandboxEntry receives and persists the agent field for later health probes and recovery commands.
♻️ Duplicate comments (1)
src/lib/onboard.ts (1)
4301-4307: ⚠️ Potential issue | 🟠 Major
Move agent resolution before the resume conflict gate.
This still runs after `getResumeConfigConflicts()`, which does not compare agent. `nemoclaw onboard --resume --agent hermes` can therefore attach to an OpenClaw session and overwrite `session.agent` here mid-resume. Resolve the requested agent first and reject mismatches before continuing.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/lib/onboard.ts` around lines 4301 - 4307, Resolve the requested agent earlier: call agentOnboard.resolveAgent({ agentFlag: opts.agent, session }) before invoking getResumeConfigConflicts() and the resume conflict gate, then if an agent was specified ensure it matches the existing session.agent (or reject the resume) rather than updating session.agent mid-resume; if the resolved agent conflicts with the current session (or the user requested a different agent) throw or return the resume-mismatch error instead of calling onboardSession.updateSession to overwrite session.agent.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/lib/onboard.ts`:
- Around line 2416-2419: basePolicyPath currently ignores
dangerouslySkipPermissions when agent is set because it always uses
agentOnboard.getAgentPolicyPath(agent); update the logic so that if
dangerouslySkipPermissions is true you first attempt to use an agent-specific
permissive policy (e.g. call a new or existing
agentOnboard.getAgentPermissivePolicyPath(agent)) and fall back to the global
defaultPolicyPath only if that agent-specific permissive path is missing, or
alternatively throw/reject the combination if no agent permissive policy exists;
adjust the assignment of defaultPolicyPath and basePolicyPath to check
dangerouslySkipPermissions before unconditionally calling
agentOnboard.getAgentPolicyPath(agent) and ensure the permissive-path helper
name (getAgentPermissivePolicyPath) or a clear error is used to locate this
change.
In `@src/lib/policies.ts`:
- Around line 345-367: The applyPermissivePolicy function currently always uses
PERMISSIVE_POLICY_PATH and can overwrite agent-specific policies; update
applyPermissivePolicy(sandboxName) to first resolve the sandbox's agent (e.g.,
call or implement a resolver like resolveAgentForSandbox or reuse existing agent
lookup logic), then construct the agent-specific permissive policy path (e.g.,
path.join(ROOT, "nemoclaw-blueprint", "policies",
`${agent}-sandbox-permissive.yaml`) ) and check fs.existsSync on that path; if
the agent-specific permissive policy exists, use it with buildPolicySetCommand
and run, otherwise fail fast with an explicit error indicating no agent-specific
permissive policy found (do not fall back to the generic openclaw file when an
agent is known). Ensure you still validate sandboxName as before and keep the
existing logging around applying the permissive policy.
In `@src/nemoclaw.ts`:
- Around line 799-814: The CLI validation rejects the built-in "openclaw"
because listAgents() only returns discovered agents; update the validation in
the agent parsing block (variables agentIdx, agentFlag and the call to
listAgents) to allow "openclaw" as a valid agent name (e.g., treat agentFlag ===
"openclaw" as accepted) or add "openclaw" to the knownAgents list before the
includes() check; ensure args slicing behavior remains the same so downstream
agentOnboard.resolveAgent() still receives the intended value.
---
Outside diff comments:
In `@src/lib/registry.ts`:
- Around line 150-162: The registry currently omits the
dangerouslySkipPermissions flag when persisting new sandboxes in
registerSandbox, causing sandboxes created with --dangerously-skip-permissions
to lose that metadata; update the object written in registerSandbox (inside
withLock) to copy entry.dangerouslySkipPermissions (defaulting to false if
undefined) into the stored sandbox record so the persisted entry includes
dangerouslySkipPermissions alongside name, model, provider, agent, etc.; ensure
you also adjust any related type/interface (SandboxEntry or stored sandbox type)
if necessary so the field is allowed.
In `@src/nemoclaw.ts`:
- Around line 329-342: buildRecoveredSandboxEntry currently preserves
metadata.agent but the session-based registry rebuild path recreates recovery
metadata without including session.agent, causing recovered Hermes sandboxes to
be registered with agent: null; update the session-seeding/rebuild logic to copy
session.agent into the metadata used to construct recovered entries (i.e.,
ensure the session object’s agent property is passed through to the metadata
argument before calling buildRecoveredSandboxEntry) so that
buildRecoveredSandboxEntry receives and persists the agent field for later
health probes and recovery commands.
---
Duplicate comments:
In `@src/lib/onboard.ts`:
- Around line 4301-4307: Resolve the requested agent earlier: call
agentOnboard.resolveAgent({ agentFlag: opts.agent, session }) before invoking
getResumeConfigConflicts() and the resume conflict gate, then if an agent was
specified ensure it matches the existing session.agent (or reject the resume)
rather than updating session.agent mid-resume; if the resolved agent conflicts
with the current session (or the user requested a different agent) throw or
return the resume-mismatch error instead of calling onboardSession.updateSession
to overwrite session.agent.
📒 Files selected for processing (6)
- nemoclaw-blueprint/policies/openclaw-sandbox-permissive.yaml
- src/lib/onboard.ts
- src/lib/policies.ts
- src/lib/registry.ts
- src/nemoclaw.ts
- test/onboard.test.ts
🚧 Files skipped from review as they are similar to previous changes (1)
- test/onboard.test.ts
let agentFlag = null;
const agentIdx = args.indexOf("--agent");
if (agentIdx !== -1) {
  agentFlag = args[agentIdx + 1];
  if (!agentFlag || agentFlag.startsWith("--")) {
    console.error(" --agent requires a name (e.g. openclaw, hermes)");
    process.exit(1);
  }
  const { listAgents } = require("../bin/lib/agent-defs");
  const knownAgents = listAgents();
  if (!knownAgents.includes(agentFlag)) {
    console.error(` Unknown agent '${agentFlag}'. Available: ${knownAgents.join(", ")}`);
    process.exit(1);
  }
  args = [...args.slice(0, agentIdx), ...args.slice(agentIdx + 2)];
}
Allow the built-in OpenClaw agent through CLI validation.
agentOnboard.resolveAgent() treats "openclaw" as the built-in default, but this branch only accepts names returned by listAgents(), which comes from discovered agents/*/manifest.yaml directories. That means nemoclaw onboard --agent openclaw is rejected even though the CLI advertises it.
Suggested fix
  const { listAgents } = require("../bin/lib/agent-defs");
- const knownAgents = listAgents();
+ const knownAgents = ["openclaw", ...listAgents()];
  if (!knownAgents.includes(agentFlag)) {
    console.error(` Unknown agent '${agentFlag}'. Available: ${knownAgents.join(", ")}`);
    process.exit(1);
  }
📝 Committable suggestion
let agentFlag = null;
const agentIdx = args.indexOf("--agent");
if (agentIdx !== -1) {
  agentFlag = args[agentIdx + 1];
  if (!agentFlag || agentFlag.startsWith("--")) {
    console.error(" --agent requires a name (e.g. openclaw, hermes)");
    process.exit(1);
  }
  const { listAgents } = require("../bin/lib/agent-defs");
  const knownAgents = ["openclaw", ...listAgents()];
  if (!knownAgents.includes(agentFlag)) {
    console.error(` Unknown agent '${agentFlag}'. Available: ${knownAgents.join(", ")}`);
    process.exit(1);
  }
  args = [...args.slice(0, agentIdx), ...args.slice(agentIdx + 2)];
}
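The validation above can also be written as a pure function that throws instead of calling process.exit, which makes the built-in case easy to unit test. This is a minimal sketch, not code from the PR: parseAgentFlag, ParsedAgentFlag, and BUILTIN_AGENTS are hypothetical names, and the list of discovered agents is passed in rather than read via listAgents().

```typescript
// Hypothetical helper for illustration; none of these names come from the PR.
const BUILTIN_AGENTS: string[] = ["openclaw"];

interface ParsedAgentFlag {
  agentFlag: string | null; // validated agent name, or null when --agent is absent
  rest: string[];           // args with "--agent <name>" removed
}

function parseAgentFlag(args: string[], discoveredAgents: string[]): ParsedAgentFlag {
  const agentIdx = args.indexOf("--agent");
  if (agentIdx === -1) {
    return { agentFlag: null, rest: [...args] };
  }
  const agentFlag = args[agentIdx + 1];
  if (!agentFlag || agentFlag.startsWith("--")) {
    throw new Error("--agent requires a name");
  }
  // Accept built-in agents as well as discovered ones, without duplicates.
  const knownAgents = BUILTIN_AGENTS.concat(
    discoveredAgents.filter((name) => !BUILTIN_AGENTS.includes(name)),
  );
  if (!knownAgents.includes(agentFlag)) {
    throw new Error(`Unknown agent '${agentFlag}'. Available: ${knownAgents.join(", ")}`);
  }
  return {
    agentFlag,
    rest: [...args.slice(0, agentIdx), ...args.slice(agentIdx + 2)],
  };
}
```

With discovered agents ["hermes"], calling parseAgentFlag(["onboard", "--agent", "openclaw"], ["hermes"]) accepts the built-in name and returns the remaining args without the flag pair.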
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/nemoclaw.ts` around lines 799 - 814, The CLI validation rejects the
built-in "openclaw" because listAgents() only returns discovered agents; update
the validation in the agent parsing block (variables agentIdx, agentFlag and the
call to listAgents) to allow "openclaw" as a valid agent name (e.g., treat
agentFlag === "openclaw" as accepted) or add "openclaw" to the knownAgents list
before the includes() check; ensure args slicing behavior remains the same so
downstream agentOnboard.resolveAgent() still receives the intended value.
…missive sandbox mode" This reverts commit e06b7c7. Signed-off-by: Aaron Erickson <aerickson@nvidia.com>
Actionable comments posted: 2
♻️ Duplicate comments (2)
src/lib/onboard.ts (1)
4287-4293: ⚠️ Potential issue | 🟠 Major
Validate agent compatibility before allowing resume.
Agent resolution and session.agent mutation happen after resume conflict checks. A resume run can still attach with a different agent and overwrite recorded session state.
Suggested fix direction
- const resumeConflicts = getResumeConfigConflicts(session, {
+ const resolvedAgent = agentOnboard.resolveAgent({ agentFlag: opts.agent, session });
+ const resumeConflicts = getResumeConfigConflicts(session, {
    nonInteractive: isNonInteractive(),
    fromDockerfile: requestedFromDockerfile,
+   requestedAgent: resolvedAgent ? resolvedAgent.name : null,
  });

  function getResumeConfigConflicts(session, opts = {}) {
    const conflicts = [];
+   if (opts.requestedAgent && session?.agent && opts.requestedAgent !== session.agent) {
+     conflicts.push({
+       field: "agent",
+       requested: opts.requestedAgent,
+       recorded: session.agent,
+     });
+   }
    // ...
  }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/lib/onboard.ts` around lines 4287 - 4293, Ensure we validate agent compatibility before allowing a resume: when resolving an agent via agentOnboard.resolveAgent({ agentFlag: opts.agent, session }), check if session.agent already exists and if so verify the resolved agent.name matches session.agent (or reject/throw/return an error) before calling onboardSession.updateSession to mutate session.agent; alternatively perform the resolve and compatibility check prior to any resume conflict checks so a resume cannot attach a different agent and overwrite recorded session state.
src/nemoclaw.ts (1)
799-801: ⚠️ Potential issue | 🟠 Major
--agent openclaw can still be rejected by CLI validation.
The validator uses only discovered agents from listAgents(). If openclaw is built-in (not discovered), this path rejects a documented/default value.
Suggested fix
  const { listAgents } = require("../bin/lib/agent-defs");
- const knownAgents = listAgents();
+ const knownAgents = ["openclaw", ...listAgents()];
  if (!knownAgents.includes(agentFlag)) {
    console.error(` Unknown agent '${agentFlag}'. Available: ${knownAgents.join(", ")}`);
    process.exit(1);
  }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/nemoclaw.ts` around lines 799 - 801, The validation rejects documented built-in agents because knownAgents is only from listAgents(); update the check so built-in agents are accepted: either merge a built-in agents collection into knownAgents before the includes() call (e.g., const knownAgents = [...listAgents(), ...BUILTIN_AGENTS]) or change the conditional to also allow isBuiltInAgent(agentFlag) (or check BUILTIN_AGENTS.includes(agentFlag)); refer to listAgents, knownAgents, agentFlag and the built-in agent identifier (e.g., "openclaw") when making the change.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/nemoclaw.ts`:
- Around line 217-222: The probeUrl is interpolated into a shell command
unquoted which allows shell metacharacters to break or alter execution; in the
call that builds the command for executeSandboxCommand (using sandboxName and
probeUrl obtained via agentRuntime.getSessionAgent and
agentRuntime.getHealthProbeUrl), shell-escape or properly quote probeUrl before
interpolation (e.g., wrap in single quotes and escape any single quotes inside
probeUrl, or use a dedicated shell-escaping helper) so the final command is safe
and cannot be injected or broken by special characters.
- Line 341: buildRecoveredSandboxEntry is persisting metadata.agent but the
session seed object doesn't include session.agent, causing recovered entries to
end up with agent: null; update the recovery path so the agent is preserved by
either (A) adding session.agent when constructing the session seed (copy
metadata.agent into the seed) or (B) changing buildRecoveredSandboxEntry to
prefer metadata.agent || session.agent (fallback to session.agent) when setting
agent; locate uses of metadata.agent and the session seed construction and apply
one of these fixes so recovered entries always have a non-null agent.
---
Duplicate comments:
In `@src/lib/onboard.ts`:
- Around line 4287-4293: Ensure we validate agent compatibility before allowing
a resume: when resolving an agent via agentOnboard.resolveAgent({ agentFlag:
opts.agent, session }), check if session.agent already exists and if so verify
the resolved agent.name matches session.agent (or reject/throw/return an error)
before calling onboardSession.updateSession to mutate session.agent;
alternatively perform the resolve and compatibility check prior to any resume
conflict checks so a resume cannot attach a different agent and overwrite
recorded session state.
In `@src/nemoclaw.ts`:
- Around line 799-801: The validation rejects documented built-in agents because
knownAgents is only from listAgents(); update the check so built-in agents are
accepted: either merge a built-in agents collection into knownAgents before the
includes() call (e.g., const knownAgents = [...listAgents(), ...BUILTIN_AGENTS])
or change the conditional to also allow isBuiltInAgent(agentFlag) (or check
BUILTIN_AGENTS.includes(agentFlag)); refer to listAgents, knownAgents, agentFlag
and the built-in agent identifier (e.g., "openclaw") when making the change.
📒 Files selected for processing (4)
- src/lib/onboard.ts
- src/lib/registry.ts
- src/nemoclaw.ts
- test/onboard.test.ts
🚧 Files skipped from review as they are similar to previous changes (1)
- test/onboard.test.ts
      ? metadata.policyPresets
      : [],
    nimContainer: metadata.nimContainer || null,
    agent: metadata.agent || null,
Session-based recovery can drop agent metadata.
buildRecoveredSandboxEntry persists metadata.agent, but the session seed object does not include session.agent. Recovered entries can end up with agent: null, which can misroute agent-specific health/recovery behavior.
Suggested fix
  metadataByName.set(
    session.sandboxName,
    buildRecoveredSandboxEntry(session.sandboxName, {
      model: session.model || null,
      provider: session.provider || null,
      nimContainer: session.nimContainer || null,
      policyPresets: session.policyPresets || null,
+     agent: session.agent || null,
    }),
  );
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/nemoclaw.ts` at line 341, buildRecoveredSandboxEntry is persisting
metadata.agent but the session seed object doesn't include session.agent,
causing recovered entries to end up with agent: null; update the recovery path
so the agent is preserved by either (A) adding session.agent when constructing
the session seed (copy metadata.agent into the seed) or (B) changing
buildRecoveredSandboxEntry to prefer metadata.agent || session.agent (fallback
to session.agent) when setting agent; locate uses of metadata.agent and the
session seed construction and apply one of these fixes so recovered entries
always have a non-null agent.
…ions Both features (agent selection and dangerouslySkipPermissions) are kept. The createSandbox function now accepts both parameters, and policy path resolution considers dangerouslySkipPermissions before falling back to agent-specific policies. Signed-off-by: Aaron Erickson <aerickson@nvidia.com>
Address cv review feedback on PR #1618: - Move agent-defs, agent-onboard, agent-runtime from bin/lib/ (CJS) to src/lib/ (TypeScript) following the established migration pattern - Leave thin re-export shims in bin/lib/ pointing to dist/lib/ - Bump Hermes plugin version to 0.0.11 Signed-off-by: Aaron Erickson <aerickson@nvidia.com>
- Remove || true that masked plugin installation failures during Docker build; add -r flag for recursive copy - Drop back to USER sandbox before ENTRYPOINT so the container doesn't run as root by default - Tighten E2E session validation to match exact "agent": "hermes" key-value pair instead of two separate grep checks Signed-off-by: Aaron Erickson <aerickson@nvidia.com>
- Quote probeUrl with shellQuote() in nemoclaw.ts health check to prevent shell metacharacter injection - Import shellQuote in agent-runtime.ts and quote probeUrl in the recovery script - Check absolute binary_path first in recovery script before falling back to command -v basename lookup Signed-off-by: Aaron Erickson <aerickson@nvidia.com>
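The commit above names shellQuote() but the page does not show its body. The sketch below illustrates the usual POSIX single-quote technique under that assumption; it is not the project's implementation. shellQuoteSketch and buildProbeCommand are hypothetical names, and the curl flags and /health path are illustrative only.

```typescript
// Hypothetical stand-in for the project's shellQuote(); illustration only.
function shellQuoteSketch(value: string): string {
  // Wrap in single quotes. An embedded single quote is written as '\''
  // (close the quote, emit an escaped quote, reopen the quote).
  return "'" + value.replace(/'/g, "'\\''") + "'";
}

// Builds a health-probe command line with the URL passed as one shell word.
// The curl invocation shown here is an assumption, not taken from the PR.
function buildProbeCommand(probeUrl: string): string {
  return `curl -sf --max-time 3 ${shellQuoteSketch(probeUrl)}`;
}

console.log(buildProbeCommand("http://127.0.0.1:8642/health"));
```

Because the whole URL sits inside single quotes, characters such as `;`, `&`, or `$` in probeUrl are passed to curl literally rather than being interpreted by the shell.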
Instead of logging a warning and marking the step complete, throw an error so the user knows the agent gateway didn't start. Also adds --max-time 3 to the loop probe to match the resume probe behavior. Signed-off-by: Aaron Erickson <aerickson@nvidia.com>
…issions - Ensure --dangerously-skip-permissions always uses the permissive policy even when an agent is selected, instead of being overridden by the agent-specific policy - Add Hermes filesystem paths and Nous Research endpoints to the permissive policy so it covers all agents - Detect agent conflicts during --resume to prevent attaching a different agent to an existing session Signed-off-by: Aaron Erickson <aerickson@nvidia.com>
Instead of hardcoding CONTROL_UI_PORT (18789) for all sandboxes, derive the effective port from agent.forwardPort so Hermes (8642) and other agents get the correct port forwarded. Also cleans up both known forward ports during gateway teardown. Signed-off-by: Aaron Erickson <aerickson@nvidia.com>
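The port selection described above reduces to a one-line fallback. A minimal sketch, assuming an agent definition object with an optional forwardPort field: AgentDefSketch and effectiveForwardPort are hypothetical names, while the two port numbers are the ones quoted in the commit message.

```typescript
// Default port named in the commit message.
const CONTROL_UI_PORT = 18789;

// Hypothetical shape; the real agent definition may carry more fields.
interface AgentDefSketch {
  name: string;
  forwardPort?: number;
}

// Prefer the agent's declared port; fall back to the default when no agent
// is selected or the agent does not declare one.
function effectiveForwardPort(agent: AgentDefSketch | null): number {
  return agent?.forwardPort ?? CONTROL_UI_PORT;
}

console.log(effectiveForwardPort({ name: "hermes", forwardPort: 8642 }));
console.log(effectiveForwardPort(null));
```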
…issions Each agent now ships its own policy-permissive.yaml scoped to its specific endpoints, filesystem paths, and package registries: - OpenClaw permissive: clawhub.ai, openclaw.ai, npm, .openclaw-data - Hermes permissive: nousresearch.com, PyPI, .hermes-data The resolution chain is: agent-specific permissive policy (if exists) then global fallback. applyPermissivePolicy() in policies.ts now looks up the sandbox's agent to resolve the correct variant. Signed-off-by: Aaron Erickson <aerickson@nvidia.com>
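The resolution chain in this commit (agent-specific permissive policy if it exists, then the global fallback) can be sketched as below. This is an illustration rather than the code in policies.ts: resolvePermissivePolicyPath is a hypothetical name, the existence check is injected so the example runs without touching the filesystem, and the agents/<name>/policy-permissive.yaml layout is inferred from the commit messages.

```typescript
// Hypothetical helper; in real code `exists` would typically be fs.existsSync.
function resolvePermissivePolicyPath(
  agent: string | null,
  globalFallback: string,
  exists: (path: string) => boolean,
): string {
  if (agent) {
    // Path layout assumed from the commit messages.
    const agentPolicy = `agents/${agent}/policy-permissive.yaml`;
    if (exists(agentPolicy)) {
      return agentPolicy;
    }
  }
  return globalFallback;
}

// Usage with an in-memory list of files standing in for the repository.
const presentFiles = ["agents/hermes/policy-permissive.yaml"];
const fileExists = (path: string): boolean => presentFiles.indexOf(path) !== -1;
const fallbackPath = "nemoclaw-blueprint/policies/openclaw-sandbox-permissive.yaml";

console.log(resolvePermissivePolicyPath("hermes", fallbackPath, fileExists));
console.log(resolvePermissivePolicyPath(null, fallbackPath, fileExists));
```

Injecting the existence check keeps the lookup order testable; note that an earlier review comment on this PR argued for failing fast instead of falling back when an agent is known, which would replace the final return for that case.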
…ermes to 90s The hardcoded 15-attempt × 2s loop (30s) was too short for CI runners. Now reads timeout_seconds from the agent manifest and polls every 3s. Hermes timeout bumped from 30s to 90s to accommodate slower environments. Signed-off-by: Aaron Erickson <aerickson@nvidia.com>
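A timeout-driven poll of the kind this commit describes might look like the sketch below. It is an assumption-laden illustration: waitForHealthy is a hypothetical name, the probe and sleep functions are injected so the example stays synchronous and testable, and only the 3-second interval and the 90-second Hermes timeout come from the commit message.

```typescript
// Polls `probe` until it succeeds or the timeout budget is used up.
// Returns true on the first successful probe, false otherwise.
function waitForHealthy(
  probe: () => boolean,
  timeoutSeconds: number,
  sleep: (ms: number) => void,
  intervalSeconds: number = 3,
): boolean {
  // Derive the attempt count from the timeout instead of hardcoding it.
  const maxAttempts = Math.max(1, Math.ceil(timeoutSeconds / intervalSeconds));
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    if (probe()) {
      return true;
    }
    // No need to wait after the final failed attempt.
    if (attempt < maxAttempts) {
      sleep(intervalSeconds * 1000);
    }
  }
  return false;
}
```

With timeoutSeconds set to 90 and the default interval, the loop makes at most 30 probe attempts, compared with the 15 attempts of the earlier hardcoded loop.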
The Hermes gateway takes longer to start than the onboard probe allows. The passing E2E runs show the gateway eventually starts after onboarding completes. Revert to warning behavior so onboarding can finish and the E2E test handles its own gateway verification. Also revert USER sandbox in Dockerfile — start.sh requires root for symlink validation, hardening, and gosu-based privilege separation. Signed-off-by: Aaron Erickson <aerickson@nvidia.com>
…licy The global fallback permissive policy should not expose agent-specific endpoints. Hermes endpoints (nousresearch.com, .hermes-data) already live in agents/hermes/policy-permissive.yaml where they belong. Also remove unconditional forward stop for port 8642 in gateway cleanup — agent-specific ports are already handled by ensureSandboxPortForward which resolves the agent dynamically. Signed-off-by: Aaron Erickson <aerickson@nvidia.com>
…feat/hermes-agent-support
Remove --dangerously-skip-permissions from the help text and strip both --agent and --dangerously-skip-permissions from usage strings shown on bad flag errors. Also make the --agent error message generic instead of naming specific agents. Signed-off-by: Aaron Erickson <aerickson@nvidia.com>
## Summary
- Installer `print_done()` hardcoded "Your OpenClaw Sandbox is live" and
`openclaw tui` regardless of which agent was onboarded. Now reads the
agent from the onboard session and adjusts the completion message
accordingly — non-OpenClaw agents get their own name and no `openclaw
tui` hint.
- Adds missing `agent_setup` step to `defaultSteps()` in
`onboard-session.ts`. Without it, `markStepStarted("agent_setup")`
silently no-ops because the step key doesn't exist in the session,
making the Hermes setup step invisible to resume logic and session
tracking.
## Test plan
- [ ] `NEMOCLAW_INSTALL_REF=main NEMOCLAW_AGENT=hermes curl -fsSL
.../install.sh | bash` — verify completion message says "Your Hermes
Sandbox is live" and does not show `openclaw tui`
- [ ] Same flow with default OpenClaw agent — verify message still says
"Your OpenClaw Sandbox is live" with `openclaw tui`
- [ ] `npm test` passes (1277 tests, 0 failures)
Closes #1618 (partial — installer integration for multi-agent onboard)
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit
* **New Features**
* Onboarding completion messages now show the configured agent name
(defaults to OpenClaw).
* Onboarding session now tracks an "agent setup" step.
* Next-step instructions are tailored per agent and omitted when not
applicable.
* **Bug Fixes / Improvements**
* Installer skips pre-extraction for non-OpenClaw agents and updates
progress text to reflect preparing agent dependencies.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
---------
Signed-off-by: Aaron Erickson <aerickson@nvidia.com>
## Summary
- Introduces an `agents/` directory structure for agent-specific artifacts (Dockerfiles, startup scripts, network policies, plugins)
- Adds agent selection step to the onboard wizard (`--agent hermes` or interactive prompt)
- Includes a working Hermes Agent integration as the first non-OpenClaw agent
## What this enables
NemoClaw can now orchestrate different AI agents inside OpenShell sandboxes. The onboard flow lets users choose which agent to run, and routes to the correct container image, network policy, and agent setup based on that choice.
## Test plan
- [ ] `nemoclaw onboard --agent hermes` completes end-to-end
- [ ] `nemoclaw onboard` (no flag) shows agent selection prompt, defaults to OpenClaw
- [ ] Hermes health endpoint responds inside sandbox
- [ ] OpenClaw path is unaffected (regression)
<!-- This is an auto-generated comment: release notes by coderabbit.ai -->
## Summary by CodeRabbit
## Release Notes
* **New Features**
* Added support for Hermes Agent (Nous Research) as an alternative sandbox option alongside OpenClaw.
* Introduced `--agent <name>` flag to `nemoclaw onboard` for selecting the desired agent during setup.
* Agent selection is now persisted in user sessions and sandbox configurations.
* **Updates**
* Onboarding flow expanded to 8 steps to accommodate agent selection.
* Gateway health monitoring and recovery now adapt based on the selected agent.
* Updated messaging in warnings and logs to be agent-agnostic.
* **Tests**
* Added comprehensive end-to-end tests for Hermes Agent integration.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
---------
Signed-off-by: Aaron Erickson <aerickson@nvidia.com>
Co-authored-by: Carlos Villela <cvillela@nvidia.com>