
fix: use SSH to stream sandbox logs instead of non-existent openshell subcommand #441

Closed
ericksoa wants to merge 1 commit into main from fix/issue-424-logs-command

Conversation

@ericksoa
Contributor

@ericksoa ericksoa commented Mar 19, 2026

Summary

  • Fixes #424 (nemoclaw logs --follow failing): nemoclaw <name> logs --follow was calling openshell sandbox logs, which doesn't exist in openshell v0.1.0
  • Rewrote sandboxLogs to use the SSH pattern (same as telegram-bridge.js and test-full-e2e.sh): verify sandbox via openshell sandbox get, get SSH config via openshell sandbox ssh-config, then ssh -F <config> openshell-<name> 'tail [-f] -n 50 /tmp/nemoclaw.log /tmp/openclaw.log'
  • Added -f as short alias for --follow
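
The ssh invocation described above could be assembled roughly as follows. This is a minimal sketch, not the code in this PR: `buildLogsSshArgs`, its option names, and the sandbox-name check are hypothetical, while the log paths, the `-T`/`-F` flags, and the `openshell-<name>` host alias come from the PR text.

```javascript
// Log files named in the PR summary.
const LOG_FILES = ["/tmp/nemoclaw.log", "/tmp/openclaw.log"];

// Hypothetical helper: returns the argv to pass to `ssh`.
function buildLogsSshArgs(sandboxName, sshConfigPath, { follow = false, lines = 50 } = {}) {
  // ASSUMPTION: the allowed character set for sandbox names is not stated in
  // the PR; this conservative check is illustrative only.
  if (!/^[A-Za-z0-9][A-Za-z0-9._-]*$/.test(sandboxName)) {
    throw new Error(`Invalid sandbox name: ${sandboxName}`);
  }
  // Build `tail [-f] -n 50 <files>`; the null entry is dropped when not following.
  const tailCmd = ["tail", follow ? "-f" : null, "-n", String(lines), ...LOG_FILES]
    .filter(Boolean)
    .join(" ");
  return ["-T", "-F", sshConfigPath, `openshell-${sandboxName}`, tailCmd];
}

console.log(buildLogsSshArgs("demo", "/tmp/demo.conf", { follow: true }));
```

Passing the arguments as an array (for example to `spawnSync("ssh", args)`) avoids a local shell, so only the remote `tail` command string is interpreted by a shell.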

Test plan

  • nemoclaw <name> logs — shows last 50 lines, exits 0
  • nemoclaw <name> logs --follow — streams in real time, Ctrl+C exits cleanly
  • nemoclaw <name> logs -f — same as --follow
  • Stopped/deleted sandbox — prints "not running" message, exits 0
  • node --test test/cli.test.js — 7/7 pass
  • No orphan ssh/tail processes after Ctrl+C
  • nemoclaw <name> status and nemoclaw <name> connect — no regressions
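
The `-f` / `--follow` equivalence in the test plan could be covered by a small parsing check along these lines. `parseLogsArgs` is a hypothetical name, and rejecting unknown options is an assumption; the actual parsing in bin/nemoclaw.js may differ.

```javascript
// Hypothetical parser for the arguments that follow `nemoclaw <name> logs`.
function parseLogsArgs(argv) {
  const opts = { follow: false };
  for (const arg of argv) {
    if (arg === "-f" || arg === "--follow") {
      opts.follow = true;
    } else {
      // ASSUMPTION: unknown options are an error; the real CLI may ignore them.
      throw new Error(`Unknown option for logs: ${arg}`);
    }
  }
  return opts;
}

// Both spellings should produce the same options object.
console.log(parseLogsArgs(["-f"]), parseLogsArgs(["--follow"]), parseLogsArgs([]));
```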

Summary by CodeRabbit

  • New Features

    • Added support for both -f and --follow flags when fetching sandbox logs.
  • Bug Fixes

    • Improved sandbox log retrieval with pre-flight verification to ensure the sandbox is running and accessible before attempting to fetch logs.
    • Enhanced error handling for log operations with clearer feedback when sandbox access fails.

… subcommand

Fixes #424. The sandboxLogs function was calling `openshell sandbox logs`
which does not exist. Now uses the same SSH pattern as telegram-bridge.js
and test-full-e2e.sh: `openshell sandbox get` to verify the sandbox is
running, `openshell sandbox ssh-config` to get SSH config, then
`ssh -F <config> openshell-<name> 'tail [-f] -n 50 ...'` to stream logs.

Also adds `-f` as a short alias for `--follow`.
@coderabbitai
Contributor

coderabbitai bot commented Mar 19, 2026

📝 Walkthrough


The sandboxLogs function was refactored to work around a broken openshell sandbox logs command. It now verifies sandbox existence via openshell sandbox get, retrieves SSH configuration, writes it to a temporary file, streams logs through SSH via tail, and cleans up afterward. CLI parsing was updated to accept both -f and --follow flags.
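
The write-then-clean-up step in the walkthrough can be made robust with `try`/`finally`. The sketch below is illustrative only: `withTempSshConfig` is a hypothetical helper, and the PR itself uses a plain `fs.unlinkSync(confPath)` after the ssh call.

```javascript
const fs = require("node:fs");
const os = require("node:os");
const path = require("node:path");

// Hypothetical helper: writes the SSH config into a private temp directory,
// runs fn(confPath), and removes the directory even if fn throws.
function withTempSshConfig(sshConfig, fn) {
  // mkdtempSync creates a uniquely named directory under the OS temp dir.
  const dir = fs.mkdtempSync(path.join(os.tmpdir(), "nemoclaw-ssh-"));
  const confPath = path.join(dir, "config");
  try {
    // Owner-only permissions, since the config may reference key material.
    fs.writeFileSync(confPath, sshConfig, { mode: 0o600 });
    return fn(confPath);
  } finally {
    fs.rmSync(dir, { recursive: true, force: true });
  }
}

// Usage: the callback sees the file; afterwards it no longer exists.
const echoed = withTempSshConfig("Host openshell-demo\n", (p) => fs.readFileSync(p, "utf-8"));
console.log(echoed.trim());
```

Because the cleanup sits in `finally`, a thrown error or an early return inside the callback does not leave the config file behind.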

Changes

| Cohort / File(s) | Summary |
|---|---|
| Log Streaming via SSH Workaround: bin/nemoclaw.js | Replaced direct openshell sandbox logs call with multi-step SSH-based approach: pre-flight sandbox existence check, SSH config retrieval, temporary config file creation, log streaming via tail (follow or last 50 lines), and guaranteed cleanup. Updated CLI parsing to accept both -f and --follow flags. |

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Poem

🐰 A workaround swift with SSH's grace,
No more logs lost in cyberspace!
Temp files dance, then fade away,
Now --follow shows the light of day.

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The PR title accurately describes the main change: replacing a non-existent openshell subcommand with an SSH-based approach to stream sandbox logs. |
| Linked Issues check | ✅ Passed | The PR implements all requirements from issue #424: verifying sandbox exists, obtaining SSH config, and streaming logs via SSH with follow support. |
| Out of Scope Changes check | ✅ Passed | All changes are directly aligned with fixing issue #424; no out-of-scope modifications detected in the CLI logging functionality. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |


Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@bin/nemoclaw.js`:
- Around line 292-314: The current bare catch blocks around the execSync calls
that check sandbox existence (the execSync(`openshell sandbox get
"${sandboxName}"`) and execSync(`openshell sandbox ssh-config "${sandboxName}"`)
calls) swallow all errors; change each catch to capture the error (e.g., catch
(err)) and inspect err.stdout/err.stderr/err.message for a specific
"stopped"/"deleted"/"not running"/"not found" indication from openshell, only
then print the "Sandbox '... is not running. No live logs available." message
and return; for any other error rethrow the error so timeouts, missing binary,
and other infra failures surface normally.
- Around line 327-340: The function currently prints an error but still returns
success when ssh exits non-zero; replace that behavior so genuine failures are
propagated by calling exitWithSpawnResult(result). Specifically, in the
sandboxLogs code after spawning ssh (referencing the result and follow
variables), remove the console.error branch and instead: after cleanup, if
follow && result.signal === 'SIGINT' simply return (suppress expected Ctrl+C in
follow mode, where result.status is null); otherwise, if result.error or
result.signal is set or result.status !== 0, call exitWithSpawnResult(result) to
propagate the failure; keep calls to fs.unlinkSync(confPath) as-is.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: dd1b8843-9611-46ac-bfa8-cdeb1390bf27

📥 Commits

Reviewing files that changed from the base of the PR and between 6ca7d37 and 46215ee.

📒 Files selected for processing (1)
  • bin/nemoclaw.js

Comment thread bin/nemoclaw.js
Comment on lines +292 to +314
  try {
    execSync(`openshell sandbox get "${sandboxName}"`, {
      encoding: "utf-8",
      timeout: 10000,
      stdio: ["pipe", "pipe", "pipe"],
    });
  } catch {
    console.log(` Sandbox '${sandboxName}' is not running. No live logs available.`);
    return;
  }

  // Get SSH config for the sandbox
  let sshConfig;
  try {
    sshConfig = execSync(`openshell sandbox ssh-config "${sandboxName}"`, {
      encoding: "utf-8",
      timeout: 10000,
      stdio: ["pipe", "pipe", "pipe"],
    });
  } catch {
    console.log(` Sandbox '${sandboxName}' is not running. No live logs available.`);
    return;
  }

⚠️ Potential issue | 🟠 Major

Don't collapse all preflight failures into the "not running" path.

These bare catch blocks also swallow timeouts, missing binaries, and other openshell failures, so unrelated infrastructure errors get reported as "sandbox not running" and still exit 0. Please key this fallback off the specific stopped/deleted response and let other failures surface normally.
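
One way to key the fallback off a specific response is a small classifier over the error thrown by `execSync`. This is a sketch under loud assumptions: the message patterns are placeholders, since the actual wording of openshell's stopped or missing sandbox errors does not appear in this PR, and `isSandboxNotRunning` is a hypothetical name.

```javascript
// ASSUMPTION: placeholder patterns. Replace with the text openshell really
// prints for a stopped or deleted sandbox.
const NOT_RUNNING_PATTERNS = [/not found/i, /not running/i, /stopped/i, /deleted/i];

// Returns true only when a failed execSync call looks like "sandbox is gone".
function isSandboxNotRunning(err) {
  if (!err) return false;
  // Timeouts and spawn failures carry a string code such as "ETIMEDOUT".
  if (typeof err.code === "string") return false;
  // execSync runs through a shell; POSIX shells exit 127 when the command
  // itself is missing, and their message also contains "not found".
  if (err.status === 127) return false;
  // A child killed by a signal did not report a sandbox state.
  if (err.signal) return false;
  const text = [err.stderr, err.stdout]
    .filter((part) => part != null)
    .map(String)
    .join("\n");
  return NOT_RUNNING_PATTERNS.some((re) => re.test(text));
}

console.log(isSandboxNotRunning({ status: 1, stderr: "sandbox 'demo' not found" }));
```

A caller would print the "not running" message only when this returns true and rethrow the error otherwise, so infrastructure failures keep their non-zero exit.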


Comment thread bin/nemoclaw.js
Comment on lines +327 to +340
  const result = spawnSync("ssh", [
    "-T", "-F", confPath,
    "-o", "StrictHostKeyChecking=no",
    "-o", "UserKnownHostsFile=/dev/null",
    "-o", "LogLevel=ERROR",
    `openshell-${sandboxName}`,
    tailCmd,
  ], { stdio: "inherit" });

  try { fs.unlinkSync(confPath); } catch {}

  if (result.status !== 0 && result.status !== null && !follow) {
    console.error(`  Failed to stream logs (exit ${result.status})`);
  }

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🌐 Web query:

In Node.js child_process.spawnSync(), how are error, status, and signal populated when the child cannot be started or terminates from SIGINT?

💡 Result:

1) Child cannot be started (e.g. ENOENT, EACCES, bad cwd, etc.)

spawnSync() returns an object where:

  • result.error is set to an Error describing the spawn failure (commonly with error.code like ENOENT, EACCES, etc.). (nodejs.org)
  • result.status is not an exit code (there was no child exit) → typically null. (nodejs.org)
  • result.signal is not a terminating signal → typically null. (nodejs.org)

(Example of spawn failure → result.error set: Error: spawnSync … ENOENT.) (2ality.com)


2) Child terminates due to SIGINT

If the child actually starts and then is terminated by SIGINT:

  • result.status is null (because termination was by signal, not a numeric exit code). (nodejs.org)
  • result.signal is 'SIGINT'. (nodejs.org)
  • result.error is not set just because the child died from a signal (Node documents error for “failed or timed out”). (nodejs.org)
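
The field combinations above can be observed directly. The sketch below runs two real children (a missing binary and a normal exit) and, for the signal case, uses a hand-built object rather than a real Ctrl+C; `describeSpawnResult` and the binary name are made up for illustration.

```javascript
const { spawnSync } = require("node:child_process");

// Summarizes a spawnSync result using the three fields discussed above.
function describeSpawnResult(result) {
  if (result.error) return `spawn-error:${result.error.code}`;
  if (result.signal) return `signal:${result.signal}`;
  return `exit:${result.status}`;
}

// Case 1: the binary does not exist, so `error` is set and there is no exit code.
const missing = spawnSync("nemoclaw-no-such-binary-for-demo", []);
console.log(describeSpawnResult(missing), missing.status);

// A child that starts and exits on its own reports a numeric status.
const exited = spawnSync(process.execPath, ["-e", "process.exit(3)"]);
console.log(describeSpawnResult(exited));

// Case 2, simulated: the shape reported for a child terminated by SIGINT.
console.log(describeSpawnResult({ status: null, signal: "SIGINT" }));
```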


🏁 Script executed:

# Find the function containing lines 327-340 in bin/nemoclaw.js
sed -n '300,350p' bin/nemoclaw.js | cat -n

Repository: NVIDIA/NemoClaw

Length of output: 1890


🏁 Script executed:

# Search for the exitWithSpawnResult function
rg "exitWithSpawnResult" --context 3

Repository: NVIDIA/NemoClaw

Length of output: 808


🏁 Script executed:

# Check if this is the 'logs' command mentioned in the review
rg "logs|spawnSync.*ssh" bin/nemoclaw.js --context 2

Repository: NVIDIA/NemoClaw

Length of output: 1639


🏁 Script executed:

# Get the complete exitWithSpawnResult function
rg "function exitWithSpawnResult" -A 10 bin/nemoclaw.js

Repository: NVIDIA/NemoClaw

Length of output: 333


🏁 Script executed:

# Also check if there are any conditionals or alternative handlers for error/signal
sed -n '280,330p' bin/nemoclaw.js | head -60

Repository: NVIDIA/NemoClaw

Length of output: 1621


Propagate SSH failures instead of returning success.

The sandboxLogs function currently returns success even when ssh exits non-zero, making the logs command unreliable in scripts and hiding real transport problems. After cleanup, route genuine failures through exitWithSpawnResult(result) and only suppress the expected Ctrl+C path in follow mode.

Suggested fix
   const result = spawnSync("ssh", [
     "-T", "-F", confPath,
     "-o", "StrictHostKeyChecking=no",
     "-o", "UserKnownHostsFile=/dev/null",
     "-o", "LogLevel=ERROR",
     `openshell-${sandboxName}`,
     tailCmd,
   ], { stdio: "inherit" });

   try { fs.unlinkSync(confPath); } catch {}

-  if (result.status !== 0 && result.status !== null && !follow) {
-    console.error(`  Failed to stream logs (exit ${result.status})`);
-  }
+  if (follow && result.signal === "SIGINT") {
+    return;
+  }
+  if (result.error || result.signal || result.status !== 0) {
+    exitWithSpawnResult(result);
+  }
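
The decision in the suggested fix can also be isolated as a pure function, which makes it testable without spawning ssh. This is a hedged sketch: `logsExitCode` is a hypothetical stand-in, the body of `exitWithSpawnResult` is not shown in this thread, and the 128 + signal-number convention is an assumption borrowed from shells.

```javascript
const os = require("node:os");

// Hypothetical helper: returns the exit code the logs command should end
// with, or null when there is nothing to report.
function logsExitCode(result, follow) {
  // Ctrl+C while following is the normal way to stop streaming.
  if (follow && result.signal === "SIGINT") return null;
  // ssh could not be started at all (for example ENOENT).
  if (result.error) return 1;
  if (result.signal) {
    // ASSUMPTION: shell-style 128 + signal number for signal deaths.
    const signo = os.constants.signals[result.signal];
    return signo ? 128 + signo : 1;
  }
  return result.status === 0 ? null : result.status;
}

console.log(logsExitCode({ status: 255, signal: null }, false));
console.log(logsExitCode({ status: null, signal: "SIGINT" }, true));
```

Whether ssh reports an interrupted session as a signal or as a non-zero status may depend on the setup; this sketch only suppresses the signal case, mirroring the diff above.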

@ericksoa
Contributor Author

Closing in favor of #436 which uses the correct openshell logs command — simpler and cleaner.

@ericksoa ericksoa closed this Mar 19, 2026

Development

Successfully merging this pull request may close these issues.

nemoclaw logs --follow failing

1 participant