Skip to content

feat(cli): add nemoclaw <sandbox> skill install command (#1844)#1845

Merged
ericksoa merged 7 commits intoNVIDIA:mainfrom
senthilr-nv:feat/skill-install
Apr 14, 2026
Merged

feat(cli): add nemoclaw <sandbox> skill install command (#1844)#1845
ericksoa merged 7 commits intoNVIDIA:mainfrom
senthilr-nv:feat/skill-install

Conversation

@senthilr-nv
Copy link
Copy Markdown
Contributor

@senthilr-nv senthilr-nv commented Apr 13, 2026

Summary

  • Add nemoclaw <sandbox> skill install <path> to deploy a skill directory to a running sandbox over SSH.
  • Validate YAML frontmatter via the yaml library (name: required, safe-character enforcement). Malformed YAML is rejected before upload.
  • Upload all non-dot regular files in the skill directory (for example SKILL.md and scripts/), preserving subdirectory structure. Reject files with unsafe path characters, and fail the install if any primary upload fails. OpenClaw mirror failures are warning-only.
  • Manifest-driven upload path: writes to {immutableDir}/skills/{name}/, which symlinks through to writable backing store.
  • OpenClaw-specific post-install: mirror to $HOME/.openclaw/skills/ and clear sessions.json for skill re-discovery. Non-OpenClaw agents get a "restart gateway" hint.
  • Detect existing skill to differentiate "installed" vs "updated"; skip session clearing on updates to preserve chat history.

Security

  • Filename validation: All relative paths validated against [A-Za-z0-9._-/] before upload. Files with shell metacharacters are rejected with a hard error, preventing shell injection via crafted filenames.
  • Shell quoting: All remote path interpolation uses shellQuote() (from runner.ts) as defense in depth.
  • Dotfile filtering: Dotfiles (.env, .npmrc, etc.) are excluded from upload by default. Skipped dotfiles are listed so users are aware.

Test Plan

  • 24 unit tests across 5 describe blocks: frontmatter parsing, path resolution, validateRelativePath, shellQuote, collectFiles
  • Validated with a sample NASA APOD skill (markdown-only, uses curl + public API) on DGX Spark
  • Verified: file presence + symlink + ownership + agent discovery + chat usage + update-preserves-sessions

Key Design Decisions

  • Full directory upload — not just SKILL.md. Supports richer skills with sibling files (e.g. scripts/ alongside SKILL.md), matching the pattern used by existing multi-file skill directories in the E2E tests.
  • Mirror is OpenClaw-only — not manifest-driven. Other agents (Hermes) resolve skills solely from immutableDir.
  • Session refresh is OpenClaw-onlysessions.json path is an OpenClaw internal layout, not a manifest contract.
  • $HOME expansion — mirror path uses $HOME in double-quoted SSH commands for dynamic resolution. Safe because filenames are pre-validated to the strict charset.
  • skipRefresh on updates — re-installing a skill should not destroy chat history.

Closes #1844

Summary by CodeRabbit

  • New Features

    • Added sandbox "skill install" command: installs from SKILL.md, enforces frontmatter and safe relative paths, uploads files over SSH preserving structure, reports skipped dotfiles and unsafe paths, performs agent-specific post-install follow-up and verification, and updated CLI help.
    • Agents can now trigger a "reload skills" action and report discovered skills; startup/tool listings reflect this.
  • Tests

    • Added tests covering frontmatter parsing, path validation, file collection, install/upload flow, and agent-specific path resolution.

Add `nemoclaw <sandbox> skill install <path>` to deploy a local
SKILL.md to a running sandbox over SSH.

- Validate YAML frontmatter (name: required, safe-character enforcement)
- Manifest-driven upload path: {immutableDir}/skills/{name}/
- OpenClaw-specific post-install: mirror to $HOME/.openclaw/skills/
  and clear sessions.json for skill re-discovery
- Non-OpenClaw agents get a "restart gateway" hint
- Detect existing skill for install-vs-update; skip session clearing
  on updates to preserve chat history
- 14 unit tests for frontmatter parsing and path resolution
- Proof-of-life on DGX Spark with NASA APOD skill

Signed-off-by: Senthil Ravichandran <senthilr@nvidia.com>
@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented Apr 13, 2026

Important

Review skipped

This PR was authored by the user configured for CodeRabbit reviews. CodeRabbit does not review PRs authored by this user. It's recommended to use a dedicated user account to post CodeRabbit review feedback.

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro Plus

Run ID: b632e214-f5e4-42c4-b5d3-9f7533560c92

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Use the checkbox below for a quick retry:

  • 🔍 Trigger review
📝 Walkthrough

Walkthrough

Adds a sandbox-scoped "skill install" command, a new skill-install library (frontmatter parsing, path validation, SSH upload primitives, directory collection/upload, post-install mirror/session handling, verification), and a Vitest suite; integrates the command into the CLI dispatch and help text.

Changes

Cohort / File(s) Summary
Skill Install Library
src/lib/skill-install.ts
New module implementing SKILL.md frontmatter parsing/validation, safe relative-path checks, sshExec/uploadFile primitives, recursive collectFiles/uploadDirectory, agent-aware resolveSkillPaths, postInstall mirror/session logic, and install verification helpers.
Skill Install Tests
src/lib/skill-install.test.ts
New Vitest suite exercising parseFrontmatter (valid/invalid cases), validateRelativePath, shellQuote, collectFiles behavior (dotfile skipping, unsafe path detection), and resolveSkillPaths across agent scenarios.
CLI Integration
src/nemoclaw.ts
Adds sandboxSkillInstall handler and dispatch for nemoclaw <sandbox> skill install <path>: argument parsing, SKILL.md resolution, file collection and safety checks, temporary SSH config management, upload flow, postInstall invocation, verification, and help text updates.

Sequence Diagram(s)

sequenceDiagram
    actor User as User/CLI
    participant CLI as nemoclaw
    participant Lib as skill-install
    participant SSH as openshell-ssh
    participant Sandbox as Sandbox

    User->>CLI: nemoclaw <sandbox> skill install <path>
    CLI->>Lib: parseFrontmatter(SKILL.md)
    Lib-->>CLI: skillName
    CLI->>Lib: collectFiles(localDir)
    Lib-->>CLI: files[], skippedDotfiles[], unsafePaths[]
    CLI->>Lib: resolveSkillPaths(agent, skillName)
    CLI->>SSH: write temp ssh-config & sshExec(checkExisting)
    SSH->>Sandbox: stat remote SKILL.md
    Sandbox-->>SSH: exists?/not found
    SSH-->>CLI: result
    CLI->>Lib: uploadDirectory(sshCtx, localDir, remoteDir)
    loop per safe file
      CLI->>SSH: uploadFile(pipe content)
      SSH->>Sandbox: mkdirs + write file
      Sandbox-->>SSH: ok
    end
    CLI->>Lib: postInstall(sshCtx, paths, localDir, opts)
    Lib->>SSH: mirror files / clear sessions (if OpenClaw)
    SSH->>Sandbox: run mirror/clear commands
    SSH-->>Lib: results
    CLI->>SSH: verifyInstall()
    SSH->>Sandbox: confirm SKILL.md exists
    Sandbox-->>SSH: confirmed
    SSH-->>CLI: success
    CLI-->>User: Skill installed
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Poem

🐇 I hopped through frontmatter, parsed each name,
Skipped dotty burrows and shunned paths of shame.
I piped up files through an SSH trail,
Mirrored and nudged the agent's tale.
A tiny rabbit cheers — installs without fail!

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 64.71% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title clearly and specifically describes the main change: adding a new CLI command for skill installation in sandboxes.
Linked Issues check ✅ Passed The PR implements all coding requirements from issue #1844: validates SKILL.md frontmatter, uploads skill directories securely, triggers agent re-discovery, reads paths from agent manifests, and provides clear feedback messages.
Out of Scope Changes check ✅ Passed All changes directly support the skill install command and its requirements; no unrelated modifications to other features or unnecessary refactoring detected.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.


Comment @coderabbitai help to get the list of available commands and usage tips.

@senthilr-nv senthilr-nv marked this pull request as ready for review April 13, 2026 21:06
@senthilr-nv senthilr-nv requested review from ericksoa and kjw3 April 13, 2026 21:08
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/lib/skill-install.ts`:
- Line 316: Replace the manual single-quoting of paths.sessionFile when building
the ssh command—change the sshExec call that creates refreshResult (currently:
sshExec(ctx, `echo '{}' > '${paths.sessionFile}'`)) to use
shellQuote(paths.sessionFile) inside the template (e.g. `echo '{}' >
${shellQuote(paths.sessionFile)}`) and add/import shellQuote at the top of the
module if missing so the path is properly escaped for the shell.
- Around line 196-199: The current upload reads files with
fs.readFileSync(localPath, "utf-8") which decodes bytes and corrupts binaries;
change it to fs.readFileSync(localPath) so content is a Buffer, update sshExec
to accept Buffer input (preserve binary when streaming to the remote cat > ...),
and likewise remove any "utf-8" encoding when reading files in the postInstall
mirror loop so those reads return Buffers; reference the variables/function
names content, remotePath, script and the sshExec call in
src/lib/skill-install.ts and ensure sshExec's input handling treats Buffer
correctly.

In `@src/nemoclaw.ts`:
- Line 1244: Update the help text string currently logged ("  or a direct path
to a SKILL.md file. All files in the directory are uploaded.") so it accurately
states that dotfiles are skipped—for example change it to indicate "All non-dot
files in the directory are uploaded" or "dotfiles are skipped"; modify the
console.log call in the same location in nemoclaw.ts (the line emitting that
exact string) to use the revised wording.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro Plus

Run ID: 8fec25e3-ba30-44e1-b278-4123df86a805

📥 Commits

Reviewing files that changed from the base of the PR and between a064e97 and 424e923.

📒 Files selected for processing (3)
  • src/lib/skill-install.test.ts
  • src/lib/skill-install.ts
  • src/nemoclaw.ts

Comment thread src/lib/skill-install.ts Outdated
Comment thread src/lib/skill-install.ts Outdated
Comment thread src/nemoclaw.ts Outdated
- Accept Buffer in sshExec input to preserve binary file integrity
- Remove utf-8 encoding from readFileSync in uploadFile and postInstall
  mirror to prevent corruption of non-UTF-8 files
- Use shellQuote for sessionFile path instead of manual single quotes
- Update help text to clarify dotfiles are excluded from upload

Signed-off-by: Senthil Ravichandran <senthilr@nvidia.com>
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (1)
src/lib/skill-install.ts (1)

19-22: Preserve the full parsed frontmatter here.

parseFrontmatter() is exported and documented as returning the parsed mapping, but Line 71 drops every key except name. That makes the helper surprisingly lossy for any future caller that needs other validated metadata.

♻️ Proposed refactor
-  return { name: nameValue };
+  return { ...fm, name: nameValue };

Also applies to: 24-30, 71-71

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/lib/skill-install.ts` around lines 19 - 22, The parseFrontmatter
implementation and SkillFrontmatter should preserve and return the entire parsed
frontmatter object instead of only projecting the name property; update the
SkillFrontmatter typing (interface SkillFrontmatter) to allow arbitrary keys
(already has [key: string]: unknown) and change parseFrontmatter to return the
full parsed mapping (e.g., the original parsed object or a shallow-validated
copy) rather than { name }, ensuring callers receive all validated metadata;
reference the parseFrontmatter function and SkillFrontmatter interface when
making this change so tests and call sites continue to receive the complete
frontmatter.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/nemoclaw.ts`:
- Around line 1255-1260: The skill install handler currently only reads a single
positional argument (skillPath) and ignores any additional trailing args; update
the handler in src/nemoclaw.ts so that after reading const skillPath = args[1]
you validate that args.length === 2 (or that args.slice(2) is empty) and if not,
print a clear usage/error message and exit with process.exit(1). Reference the
existing symbols skillPath and the usage messages to ensure the new check runs
before proceeding with installation.

---

Nitpick comments:
In `@src/lib/skill-install.ts`:
- Around line 19-22: The parseFrontmatter implementation and SkillFrontmatter
should preserve and return the entire parsed frontmatter object instead of only
projecting the name property; update the SkillFrontmatter typing (interface
SkillFrontmatter) to allow arbitrary keys (already has [key: string]: unknown)
and change parseFrontmatter to return the full parsed mapping (e.g., the
original parsed object or a shallow-validated copy) rather than { name },
ensuring callers receive all validated metadata; reference the parseFrontmatter
function and SkillFrontmatter interface when making this change so tests and
call sites continue to receive the complete frontmatter.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro Plus

Run ID: 81595080-fe32-48b2-b67d-1b64f6b1d6c6

📥 Commits

Reviewing files that changed from the base of the PR and between 424e923 and a77e421.

📒 Files selected for processing (2)
  • src/lib/skill-install.ts
  • src/nemoclaw.ts

Comment thread src/nemoclaw.ts
Prevents typos in scripted usage from silently succeeding.

Signed-off-by: Senthil Ravichandran <senthilr@nvidia.com>
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (2)
src/nemoclaw.ts (2)

1271-1272: Remove TypeScript type annotations for consistency.

The file uses // @ts-nocheck`` and CommonJS require() throughout without any type annotations. These inline type hints are inconsistent with the rest of the codebase style.

♻️ Suggested fix
-  let skillDir: string;
-  let skillMdPath: string;
+  let skillDir;
+  let skillMdPath;
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/nemoclaw.ts` around lines 1271 - 1272, Remove the inline TypeScript type
annotations from the variable declarations to match the file's // `@ts-nocheck`
and CommonJS style: locate the declarations of skillDir and skillMdPath (the let
skillDir: string; and let skillMdPath: string; lines) and remove the ": string"
annotations so they are plain JavaScript declarations (e.g., let skillDir; let
skillMdPath;), and scan nearby variable declarations in the same module for any
other stray TypeScript annotations to remove for consistency.

1236-1375: Consider extracting path validation logic to reduce complexity.

The function has approximately 20 decision points (at the guideline limit). Extracting the path resolution and validation logic (lines 1268-1289) into a helper function would improve readability and bring complexity closer to the target of 15.

As per coding guidelines: "Limit cyclomatic complexity to 20 in JavaScript/TypeScript files, with target of 15".

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Nitpick comments:
In `@src/nemoclaw.ts`:
- Around line 1271-1272: Remove the inline TypeScript type annotations from the
variable declarations to match the file's // `@ts-nocheck` and CommonJS style:
locate the declarations of skillDir and skillMdPath (the let skillDir: string;
and let skillMdPath: string; lines) and remove the ": string" annotations so
they are plain JavaScript declarations (e.g., let skillDir; let skillMdPath;),
and scan nearby variable declarations in the same module for any other stray
TypeScript annotations to remove for consistency.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro Plus

Run ID: 1d0f0d47-cde3-4ffa-a16d-dda1381df66c

📥 Commits

Reviewing files that changed from the base of the PR and between a77e421 and 91a84d4.

📒 Files selected for processing (1)
  • src/nemoclaw.ts

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/nemoclaw.ts`:
- Around line 1603-1605: The skill install flow currently only proceeds when
registry.getSandbox(cmd) succeeds and doesn't attempt the same stale-registry
recovery used by the connect path; update the dispatch so that if
registry.getSandbox(cmd) fails you call ensureLiveSandboxOrExit() (the same
recovery used by connect), then re-attempt registry.getSandbox(cmd) before
calling sandboxSkillInstall(cmd, actionArgs); ensure you reference
registry.getSandbox, ensureLiveSandboxOrExit, and sandboxSkillInstall so the
retry logic mirrors connect's behavior and prevents falling through to "Unknown
command".
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro Plus

Run ID: 007b9348-f55c-4297-afac-413365e09f6a

📥 Commits

Reviewing files that changed from the base of the PR and between 91a84d4 and 8ece2ac.

📒 Files selected for processing (1)
  • src/nemoclaw.ts

Comment thread src/nemoclaw.ts
Extend the existing connect recovery path to also cover skill install.
If the local registry is stale but the sandbox is still live,
recoverRegistryEntries() is now attempted for both connect and skill
actions before falling through to "Unknown command".

Signed-off-by: Senthil Ravichandran <senthilr@nvidia.com>
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (1)
src/nemoclaw.ts (1)

1272-1273: Consider removing explicit type annotations in a @ts-nocheck file.

Lines 1272-1273 use TypeScript type annotations (let skillDir: string;), but the file starts with // @ts-nocheck``. These annotations provide no type safety in this context and introduce inconsistency with the rest of the file which uses plain JavaScript-style declarations.

♻️ Suggested simplification
-  let skillDir: string;
-  let skillMdPath: string;
+  let skillDir;
+  let skillMdPath;
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/nemoclaw.ts` around lines 1272 - 1273, Remove the redundant TypeScript
type annotations in this `@ts-nocheck` file by changing the declarations for
skillDir and skillMdPath to plain JavaScript declarations (e.g., remove ":
string") so they match the rest of the file's style; update the lines containing
the symbols skillDir and skillMdPath to use simple let declarations without type
annotations.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Nitpick comments:
In `@src/nemoclaw.ts`:
- Around line 1272-1273: Remove the redundant TypeScript type annotations in
this `@ts-nocheck` file by changing the declarations for skillDir and skillMdPath
to plain JavaScript declarations (e.g., remove ": string") so they match the
rest of the file's style; update the lines containing the symbols skillDir and
skillMdPath to use simple let declarations without type annotations.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro Plus

Run ID: 427b0fc0-15dc-4e72-b105-53d734a92dbc

📥 Commits

Reviewing files that changed from the base of the PR and between 8ece2ac and 5737a1e.

📒 Files selected for processing (1)
  • src/nemoclaw.ts

Hermes caches skill slash-commands in a module-global dict on first
scan, making skills installed after gateway startup invisible. This
adds a nemoclaw_reload_skills tool that clears the cache and re-scans,
plus auto-refresh on session start, so new skills are available without
a gateway restart.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

🧹 Nitpick comments (1)
agents/hermes/plugin/__init__.py (1)

217-220: Consider surfacing auto-reload failure at session start.

Line 219 ignores _reload_skills() failure, so session-start auto-refresh can silently fail and still show stale tools. Consider emitting a one-line system warning when reload returns None.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@agents/hermes/plugin/__init__.py` around lines 217 - 220, The call to
_reload_skills() at session start can fail silently; change the code to capture
its return value (e.g., result = _reload_skills()) and if it returns None emit a
one-line system warning so users know the auto-reload failed and tools may be
stale; use the existing logging/system messaging facility in this module (e.g.,
logger.warning or the module's system message function) and include brief
context mentioning auto-reload/skill cache and that commands may be stale.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@agents/hermes/plugin/__init__.py`:
- Around line 125-133: The code assumes agent.skill_commands provides a private
dict _skill_commands and currently swallows all errors; change to defensively
access and clear the cache using something like cache = getattr(sc,
"_skill_commands", None) and only call cache.clear() if cache is not None and
has a clear() method (or is a dict), then call sc.scan_skill_commands(); replace
the broad except Exception: return None with an except Exception as e: that logs
the exception (using a module logger or logging.exception) then returns None so
hot-reload failures are visible; keep ImportError handling as before and
reference the module agent.skill_commands, attribute _skill_commands, and
function scan_skill_commands in your changes.

In `@src/nemoclaw.ts`:
- Around line 1576-1581: The recovery check currently only runs when args[0] ===
"connect" or "skill" so an implicit default connect (when args[0] is missing)
like in the usage "nemoclaw <sandbox>" is skipped; update the condition around
registry.getSandbox(cmd) to also treat a missing action (args[0] is
falsy/undefined) as "connect" so that validateName(cmd, "sandbox name") and
await recoverRegistryEntries({ requestedSandboxName: cmd }) run for implicit
connect attempts; make the same change for the second identical recovery block
that uses registry.getSandbox, validateName, and recoverRegistryEntries.
- Around line 1277-1279: The branch that treats a direct path as SKILL.md is too
permissive; change the condition to require the exact filename and ensure it's a
regular file by checking path.basename(resolvedPath) === "SKILL.md" and using
fs.statSync(resolvedPath).isFile() (or fs.promises.stat in async contexts)
before setting skillDir and skillMdPath; update the code that sets skillDir and
skillMdPath (the variables referenced in this block) to only run when both the
basename match and the stat confirms a file, and handle any stat errors
appropriately.

---

Nitpick comments:
In `@agents/hermes/plugin/__init__.py`:
- Around line 217-220: The call to _reload_skills() at session start can fail
silently; change the code to capture its return value (e.g., result =
_reload_skills()) and if it returns None emit a one-line system warning so users
know the auto-reload failed and tools may be stale; use the existing
logging/system messaging facility in this module (e.g., logger.warning or the
module's system message function) and include brief context mentioning
auto-reload/skill cache and that commands may be stale.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro Plus

Run ID: 5e1d5f32-4d26-4d5e-9014-de5903a3e3ba

📥 Commits

Reviewing files that changed from the base of the PR and between 8ece2ac and f015059.

📒 Files selected for processing (3)
  • agents/hermes/plugin/__init__.py
  • agents/hermes/plugin/plugin.yaml
  • src/nemoclaw.ts
✅ Files skipped from review due to trivial changes (1)
  • agents/hermes/plugin/plugin.yaml

Comment on lines +125 to +133
try:
import agent.skill_commands as sc

sc._skill_commands.clear()
return sc.scan_skill_commands()
except ImportError:
return None
except Exception:
return None
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
set -euo pipefail

fd -i 'skill_commands.py$' | while read -r f; do
  echo "== $f =="
  rg -n -C2 --type=py '_skill_commands|def scan_skill_commands|def get_skill_commands|def reset' "$f"
done

Repository: NVIDIA/NemoClaw

Length of output: 41


🏁 Script executed:

# Check if the file exists and read the relevant lines
cat -n agents/hermes/plugin/__init__.py 2>/dev/null | sed -n '120,140p'

Repository: NVIDIA/NemoClaw

Length of output: 765


🏁 Script executed:

# Search for skill_commands module - try different patterns
fd -i skill_commands || echo "--- No exact match, searching for 'skill_commands' in code ---"
rg -l 'skill_commands' --type=py 2>/dev/null | head -20

Repository: NVIDIA/NemoClaw

Length of output: 91


🏁 Script executed:

# Search for _skill_commands references
rg '_skill_commands' --type=py -n 2>/dev/null

Repository: NVIDIA/NemoClaw

Length of output: 436


🏁 Script executed:

# Find the agent/skill_commands module
fd -i 'skill_commands' . || echo "No match"
find . -name '*skill_commands*' -type f 2>/dev/null

Repository: NVIDIA/NemoClaw

Length of output: 41


🏁 Script executed:

# Search for scan_skill_commands definition
rg 'def scan_skill_commands|def get_skill_commands|_skill_commands\s*=' --type=py -B2 -A5 2>/dev/null

Repository: NVIDIA/NemoClaw

Length of output: 41


🏁 Script executed:

# Check if agent package exists and its structure
ls -la agent/ 2>/dev/null || echo "agent directory not found in root"
find . -path '*/agent/*' -name '*.py' | grep -i skill | head -20

Repository: NVIDIA/NemoClaw

Length of output: 92


🏁 Script executed:

# Read more context around the _reload_skills function
cat -n agents/hermes/plugin/__init__.py | sed -n '115,135p'

Repository: NVIDIA/NemoClaw

Length of output: 893


🏁 Script executed:

# Check for setup.py, pyproject.toml, requirements.txt mentioning 'agent' package
find . -name 'setup.py' -o -name 'pyproject.toml' -o -name 'requirements*.txt' | xargs grep -l 'agent' 2>/dev/null | head -10

Repository: NVIDIA/NemoClaw

Length of output: 41


🏁 Script executed:

# Check for any external dependencies or imports of 'agent'
rg 'from agent|import agent' --type=py 2>/dev/null | head -20

Repository: NVIDIA/NemoClaw

Length of output: 133


🏁 Script executed:

# Look at the function signature and docstring
cat -n agents/hermes/plugin/__init__.py | sed -n '110,135p'

Repository: NVIDIA/NemoClaw

Length of output: 1082


Harden reload error handling and private-cache access.

Line 128 directly accesses the private sc._skill_commands attribute without checking if it exists or is a dict. Line 132 swallows all exceptions, making hot-reload failures silent and hard to diagnose. Even though agent.skill_commands is an optional external dependency, once imported, the code should not assume a specific API shape without defensive checks.

Proposed fix
 def _reload_skills():
     """Clear the Hermes skill slash-command cache and re-scan skill directories.
 
     Hermes's ``agent.skill_commands`` module caches discovered skills in a
     module-global dict (``_skill_commands``).  ``get_skill_commands()`` only
     scans on first call, so skills installed after gateway startup are
     invisible.  We clear the dict and call ``scan_skill_commands()`` to force
     a fresh scan.
 
     Returns the dict of discovered skills, or None on failure.
     """
     try:
         import agent.skill_commands as sc
 
-        sc._skill_commands.clear()
+        cache = getattr(sc, "_skill_commands", None)
+        if isinstance(cache, dict):
+            cache.clear()
         return sc.scan_skill_commands()
     except ImportError:
         return None
-    except Exception:
+    except (AttributeError, TypeError):
+        return None
+    except Exception as exc:
+        # Preserve debuggability while keeping current API behavior.
+        print(f"_reload_skills: unexpected reload failure: {exc}")
         return None
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
try:
import agent.skill_commands as sc
sc._skill_commands.clear()
return sc.scan_skill_commands()
except ImportError:
return None
except Exception:
return None
try:
import agent.skill_commands as sc
cache = getattr(sc, "_skill_commands", None)
if isinstance(cache, dict):
cache.clear()
return sc.scan_skill_commands()
except ImportError:
return None
except (AttributeError, TypeError):
return None
except Exception as exc:
# Preserve debuggability while keeping current API behavior.
print(f"_reload_skills: unexpected reload failure: {exc}")
return None
🧰 Tools
🪛 Ruff (0.15.9)

[warning] 132-132: Do not catch blind exception: Exception

(BLE001)

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@agents/hermes/plugin/__init__.py` around lines 125 - 133, The code assumes
agent.skill_commands provides a private dict _skill_commands and currently
swallows all errors; change to defensively access and clear the cache using
something like cache = getattr(sc, "_skill_commands", None) and only call
cache.clear() if cache is not None and has a clear() method (or is a dict), then
call sc.scan_skill_commands(); replace the broad except Exception: return None
with an except Exception as e: that logs the exception (using a module logger or
logging.exception) then returns None so hot-reload failures are visible; keep
ImportError handling as before and reference the module agent.skill_commands,
attribute _skill_commands, and function scan_skill_commands in your changes.

Comment thread src/nemoclaw.ts
Comment on lines +1277 to +1279
} else if (fs.existsSync(resolvedPath) && resolvedPath.endsWith("SKILL.md")) {
skillDir = path.dirname(resolvedPath);
skillMdPath = resolvedPath;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Require exact SKILL.md filename and a regular file in direct-path mode.

endsWith("SKILL.md") also accepts names like mySKILL.md, and this branch doesn’t enforce file type. That weakens the contract that the direct path is specifically SKILL.md.

🔧 Suggested fix
-  } else if (fs.existsSync(resolvedPath) && resolvedPath.endsWith("SKILL.md")) {
+  } else if (
+    fs.existsSync(resolvedPath) &&
+    fs.statSync(resolvedPath).isFile() &&
+    path.basename(resolvedPath) === "SKILL.md"
+  ) {
     skillDir = path.dirname(resolvedPath);
     skillMdPath = resolvedPath;
   } else {
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
} else if (fs.existsSync(resolvedPath) && resolvedPath.endsWith("SKILL.md")) {
skillDir = path.dirname(resolvedPath);
skillMdPath = resolvedPath;
} else if (
fs.existsSync(resolvedPath) &&
fs.statSync(resolvedPath).isFile() &&
path.basename(resolvedPath) === "SKILL.md"
) {
skillDir = path.dirname(resolvedPath);
skillMdPath = resolvedPath;
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/nemoclaw.ts` around lines 1277 - 1279, The branch that treats a direct
path as SKILL.md is too permissive; change the condition to require the exact
filename and ensure it's a regular file by checking path.basename(resolvedPath)
=== "SKILL.md" and using fs.statSync(resolvedPath).isFile() (or fs.promises.stat
in async contexts) before setting skillDir and skillMdPath; update the code that
sets skillDir and skillMdPath (the variables referenced in this block) to only
run when both the basename match and the stat confirms a file, and handle any
stat errors appropriately.

Comment thread src/nemoclaw.ts
Comment on lines +1576 to +1581
// If the registry doesn't know this name but the action is connect or skill,
// attempt recovery — the sandbox may still be live with a stale registry.
if (!registry.getSandbox(cmd) && (args[0] === "connect" || args[0] === "skill")) {
validateName(cmd, "sandbox name");
await recoverRegistryEntries({ requestedSandboxName: cmd });
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Registry recovery misses implicit default connect (nemoclaw <sandbox>).

Recovery currently runs only when args[0] is explicitly "connect" or "skill", but dispatch later treats missing action as "connect". With stale registry entries, nemoclaw <sandbox> can still fall through to unknown command instead of attempting recovery.

🔧 Suggested fix
-  if (!registry.getSandbox(cmd) && (args[0] === "connect" || args[0] === "skill")) {
+  const requestedAction = args[0] || "connect";
+  if (!registry.getSandbox(cmd) && (requestedAction === "connect" || requestedAction === "skill")) {
     validateName(cmd, "sandbox name");
     await recoverRegistryEntries({ requestedSandboxName: cmd });
   }
   const sandbox = registry.getSandbox(cmd);
   if (sandbox) {
     validateName(cmd, "sandbox name");
-    const action = args[0] || "connect";
+    const action = requestedAction;
     const actionArgs = args.slice(1);

Also applies to: 1585-1586

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/nemoclaw.ts` around lines 1576 - 1581, The recovery check currently only
runs when args[0] === "connect" or "skill" so an implicit default connect (when
args[0] is missing) like in the usage "nemoclaw <sandbox>" is skipped; update
the condition around registry.getSandbox(cmd) to also treat a missing action
(args[0] is falsy/undefined) as "connect" so that validateName(cmd, "sandbox
name") and await recoverRegistryEntries({ requestedSandboxName: cmd }) run for
implicit connect attempts; make the same change for the second identical
recovery block that uses registry.getSandbox, validateName, and
recoverRegistryEntries.

Copy link
Copy Markdown
Contributor

@ericksoa ericksoa left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed the full diff for regressions — looks good.

CI: All green (unit tests, sandbox image builds amd64+arm64, E2E sandbox/gateway-isolation/port-overrides).

Regression check:

  • Existing switch cases in nemoclaw.ts untouched; new skill case added cleanly
  • Registry recovery refactor is actually an improvement (fixes a latent bug where --dangerously-skip-permissions was dropped on recovery-path connects)
  • Hermes plugin additions are additive with proper error handling — existing tools and hooks preserved
  • shellQuote re-export from runner.ts verified
  • No import conflicts

One nit (non-blocking): The comment in postInstall() says "shellQuote the relative part" but the mirror path uses bare double-quoted interpolation, not shellQuote. The code is safe (validateRelativePath restricts to [A-Za-z0-9._-/]), but the comment is misleading.

I also pushed a commit adding nemoclaw_reload_skills to the Hermes plugin — clears the skill slash-command cache and re-scans from disk so new skills are available without a gateway restart. The on_session_start hook also auto-refreshes at session boundaries.

@ericksoa ericksoa merged commit 23720fb into NVIDIA:main Apr 14, 2026
13 checks passed
ericksoa added a commit to cheese-head/NemoClaw that referenced this pull request Apr 14, 2026
…NVIDIA#1845)

## Summary

- Add `nemoclaw <sandbox> skill install <path>` to deploy a skill
directory to a running sandbox over SSH.
- Validate YAML frontmatter via the `yaml` library (`name:` required,
safe-character enforcement). Malformed YAML is rejected before upload.
- Upload all non-dot regular files in the skill directory (for example
`SKILL.md` and `scripts/`), preserving subdirectory structure. Reject
files with unsafe path characters, and fail the install if any primary
upload fails. OpenClaw mirror failures are warning-only.
- Manifest-driven upload path: writes to
`{immutableDir}/skills/{name}/`, which symlinks through to writable
backing store.
- OpenClaw-specific post-install: mirror to `$HOME/.openclaw/skills/`
and clear `sessions.json` for skill re-discovery. Non-OpenClaw agents
get a "restart gateway" hint.
- Detect existing skill to differentiate "installed" vs "updated"; skip
session clearing on updates to preserve chat history.

## Security

- **Filename validation:** All relative paths validated against
`[A-Za-z0-9._-/]` before upload. Files with shell metacharacters are
rejected with a hard error, preventing shell injection via crafted
filenames.
- **Shell quoting:** All remote path interpolation uses `shellQuote()`
(from `runner.ts`) as defense in depth.
- **Dotfile filtering:** Dotfiles (`.env`, `.npmrc`, etc.) are excluded
from upload by default. Skipped dotfiles are listed so users are aware.

## Test Plan

- [x] 24 unit tests across 5 describe blocks: frontmatter parsing, path
resolution, `validateRelativePath`, `shellQuote`, `collectFiles`
- [x] Validated with a sample NASA APOD skill (markdown-only, uses
`curl` + public API) on DGX Spark
- [x] Verified: file presence + symlink + ownership + agent discovery +
chat usage + update-preserves-sessions

## Key Design Decisions

- **Full directory upload** — not just SKILL.md. Supports richer skills
with sibling files (e.g. `scripts/` alongside `SKILL.md`), matching the
pattern used by existing multi-file skill directories in the E2E tests.
- **Mirror is OpenClaw-only** — not manifest-driven. Other agents
(Hermes) resolve skills solely from `immutableDir`.
- **Session refresh is OpenClaw-only** — `sessions.json` path is an
OpenClaw internal layout, not a manifest contract.
- **$HOME expansion** — mirror path uses `$HOME` in double-quoted SSH
commands for dynamic resolution. Safe because filenames are
pre-validated to the strict charset.
- **skipRefresh on updates** — re-installing a skill should not destroy
chat history.

Closes NVIDIA#1844

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

* **New Features**
* Added sandbox "skill install" command: installs from SKILL.md,
enforces frontmatter and safe relative paths, uploads files over SSH
preserving structure, reports skipped dotfiles and unsafe paths,
performs agent-specific post-install follow-up and verification, and
updated CLI help.
* Agents can now trigger a "reload skills" action and report discovered
skills; startup/tool listings reflect this.

* **Tests**
* Added tests covering frontmatter parsing, path validation, file
collection, install/upload flow, and agent-specific path resolution.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->

---------

Signed-off-by: Senthil Ravichandran <senthilr@nvidia.com>
Co-authored-by: Aaron Erickson 🦞 <aerickson@nvidia.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@wscurran wscurran added NemoClaw CLI Use this label to identify issues with the NemoClaw command-line interface (CLI). enhancement: feature Use this label to identify requests for new capabilities in NemoClaw. labels Apr 14, 2026
cv pushed a commit that referenced this pull request Apr 14, 2026
## Summary
- Document tier-based policy selector (Restricted/Balanced/Open) in
commands, network policies, and customize-network-policy pages (from
#1753)
- Document configurable port overrides via environment variables
(`NEMOCLAW_GATEWAY_PORT`, `NEMOCLAW_DASHBOARD_PORT`,
`NEMOCLAW_VLLM_PORT`, `NEMOCLAW_OLLAMA_PORT`) (from #1645)
- Document `nemoclaw <sandbox> skill install <path>` command (from
#1845, #1856)
- Document reserved sandbox name validation — CLI command collision
check (from #1773)
- Bump doc version switcher through 0.0.15
- Remove `--dangerously-skip-permissions` from onboard usage synopsis
(docs-skip violation)
- Regenerate agent skills from updated docs

## Test plan
- [x] `make docs` builds without warnings
- [x] All pre-commit hooks pass
- [ ] Verify rendered pages in docs build output
- [ ] Cross-references resolve correctly (`policy-tiers` anchor,
`environment-variables` section)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
ColinM-sys pushed a commit to ColinM-sys/NemoClaw that referenced this pull request Apr 14, 2026
…NVIDIA#1845)

## Summary

- Add `nemoclaw <sandbox> skill install <path>` to deploy a skill
directory to a running sandbox over SSH.
- Validate YAML frontmatter via the `yaml` library (`name:` required,
safe-character enforcement). Malformed YAML is rejected before upload.
- Upload all non-dot regular files in the skill directory (for example
`SKILL.md` and `scripts/`), preserving subdirectory structure. Reject
files with unsafe path characters, and fail the install if any primary
upload fails. OpenClaw mirror failures are warning-only.
- Manifest-driven upload path: writes to
`{immutableDir}/skills/{name}/`, which symlinks through to writable
backing store.
- OpenClaw-specific post-install: mirror to `$HOME/.openclaw/skills/`
and clear `sessions.json` for skill re-discovery. Non-OpenClaw agents
get a "restart gateway" hint.
- Detect existing skill to differentiate "installed" vs "updated"; skip
session clearing on updates to preserve chat history.

## Security

- **Filename validation:** All relative paths validated against
`[A-Za-z0-9._-/]` before upload. Files with shell metacharacters are
rejected with a hard error, preventing shell injection via crafted
filenames.
- **Shell quoting:** All remote path interpolation uses `shellQuote()`
(from `runner.ts`) as defense in depth.
- **Dotfile filtering:** Dotfiles (`.env`, `.npmrc`, etc.) are excluded
from upload by default. Skipped dotfiles are listed so users are aware.

## Test Plan

- [x] 24 unit tests across 5 describe blocks: frontmatter parsing, path
resolution, `validateRelativePath`, `shellQuote`, `collectFiles`
- [x] Validated with a sample NASA APOD skill (markdown-only, uses
`curl` + public API) on DGX Spark
- [x] Verified: file presence + symlink + ownership + agent discovery +
chat usage + update-preserves-sessions

## Key Design Decisions

- **Full directory upload** — not just SKILL.md. Supports richer skills
with sibling files (e.g. `scripts/` alongside `SKILL.md`), matching the
pattern used by existing multi-file skill directories in the E2E tests.
- **Mirror is OpenClaw-only** — not manifest-driven. Other agents
(Hermes) resolve skills solely from `immutableDir`.
- **Session refresh is OpenClaw-only** — `sessions.json` path is an
OpenClaw internal layout, not a manifest contract.
- **$HOME expansion** — mirror path uses `$HOME` in double-quoted SSH
commands for dynamic resolution. Safe because filenames are
pre-validated to the strict charset.
- **skipRefresh on updates** — re-installing a skill should not destroy
chat history.

Closes NVIDIA#1844

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

* **New Features**
* Added sandbox "skill install" command: installs from SKILL.md,
enforces frontmatter and safe relative paths, uploads files over SSH
preserving structure, reports skipped dotfiles and unsafe paths,
performs agent-specific post-install follow-up and verification, and
updated CLI help.
* Agents can now trigger a "reload skills" action and report discovered
skills; startup/tool listings reflect this.

* **Tests**
* Added tests covering frontmatter parsing, path validation, file
collection, install/upload flow, and agent-specific path resolution.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->

---------

Signed-off-by: Senthil Ravichandran <senthilr@nvidia.com>
Co-authored-by: Aaron Erickson 🦞 <aerickson@nvidia.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: ColinM-sys <cmcdonough@50words.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

enhancement: feature Use this label to identify requests for new capabilities in NemoClaw. NemoClaw CLI Use this label to identify issues with the NemoClaw command-line interface (CLI).

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Feature]: Add nemoclaw <sandbox> skill install command for custom skill deployment

3 participants