fix: use SSH to stream sandbox logs instead of non-existent openshell subcommand#441
fix: use SSH to stream sandbox logs instead of non-existent openshell subcommand#441
Conversation
… subcommand Fixes #424. The sandboxLogs function was calling `openshell sandbox logs` which does not exist. Now uses the same SSH pattern as telegram-bridge.js and test-full-e2e.sh: `openshell sandbox get` to verify the sandbox is running, `openshell sandbox ssh-config` to get SSH config, then `ssh -F <config> openshell-<name> 'tail [-f] -n 50 ...'` to stream logs. Also adds `-f` as a short alias for `--follow`.
📝 WalkthroughWalkthroughThe Changes
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes Poem
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
📝 Coding Plan
Comment |
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@bin/nemoclaw.js`:
- Around line 292-314: The current bare catch blocks around the execSync calls
that check sandbox existence (the execSync(`openshell sandbox get
"${sandboxName}"`) and execSync(`openshell sandbox ssh-config "${sandboxName}"`)
calls) swallow all errors; change each catch to capture the error (e.g., catch
(err)) and inspect err.stdout/err.stderr/err.message for a specific
"stopped"/"deleted"/"not running"/"not found" indication from openshell, only
then print the "Sandbox '... is not running. No live logs available." message
and return; for any other error rethrow the error so timeouts, missing binary,
and other infra failures surface normally.
- Around line 327-340: The function currently prints an error but still returns
success when ssh exits non-zero; replace that behavior so genuine failures are
propagated by calling exitWithSpawnResult(result). Specifically, in the
sandboxLogs code after spawning ssh (referencing the result and follow
variables), remove the console.error branch and instead: after cleanup, if
result.status !== 0 && result.status !== null then if follow && result.signal
=== 'SIGINT' simply return (suppress expected Ctrl+C in follow mode), otherwise
call exitWithSpawnResult(result) to propagate the failure; keep calls to
fs.unlinkSync(confPath) as-is.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: dd1b8843-9611-46ac-bfa8-cdeb1390bf27
📒 Files selected for processing (1)
bin/nemoclaw.js
| try { | ||
| execSync(`openshell sandbox get "${sandboxName}"`, { | ||
| encoding: "utf-8", | ||
| timeout: 10000, | ||
| stdio: ["pipe", "pipe", "pipe"], | ||
| }); | ||
| } catch { | ||
| console.log(` Sandbox '${sandboxName}' is not running. No live logs available.`); | ||
| return; | ||
| } | ||
|
|
||
| // Get SSH config for the sandbox | ||
| let sshConfig; | ||
| try { | ||
| sshConfig = execSync(`openshell sandbox ssh-config "${sandboxName}"`, { | ||
| encoding: "utf-8", | ||
| timeout: 10000, | ||
| stdio: ["pipe", "pipe", "pipe"], | ||
| }); | ||
| } catch { | ||
| console.log(` Sandbox '${sandboxName}' is not running. No live logs available.`); | ||
| return; | ||
| } |
There was a problem hiding this comment.
Don't collapse all preflight failures into the "not running" path.
These bare catch blocks also swallow timeouts, missing binaries, and other openshell failures, so unrelated infrastructure errors get reported as "sandbox not running" and still exit 0. Please key this fallback off the specific stopped/deleted response and let other failures surface normally.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@bin/nemoclaw.js` around lines 292 - 314, The current bare catch blocks around
the execSync calls that check sandbox existence (the execSync(`openshell sandbox
get "${sandboxName}"`) and execSync(`openshell sandbox ssh-config
"${sandboxName}"`) calls) swallow all errors; change each catch to capture the
error (e.g., catch (err)) and inspect err.stdout/err.stderr/err.message for a
specific "stopped"/"deleted"/"not running"/"not found" indication from
openshell, only then print the "Sandbox '... is not running. No live logs
available." message and return; for any other error rethrow the error so
timeouts, missing binary, and other infra failures surface normally.
| const result = spawnSync("ssh", [ | ||
| "-T", "-F", confPath, | ||
| "-o", "StrictHostKeyChecking=no", | ||
| "-o", "UserKnownHostsFile=/dev/null", | ||
| "-o", "LogLevel=ERROR", | ||
| `openshell-${sandboxName}`, | ||
| tailCmd, | ||
| ], { stdio: "inherit" }); | ||
|
|
||
| try { fs.unlinkSync(confPath); } catch {} | ||
|
|
||
| if (result.status !== 0 && result.status !== null && !follow) { | ||
| console.error(` Failed to stream logs (exit ${result.status})`); | ||
| } |
There was a problem hiding this comment.
🧩 Analysis chain
🌐 Web query:
In Node.js child_process.spawnSync(), how are error, status, and signal populated when the child cannot be started or terminates from SIGINT?
💡 Result:
1) Child cannot be started (e.g. ENOENT, EACCES, bad cwd, etc.)
spawnSync() returns an object where:
result.erroris set to anErrordescribing the spawn failure (commonly witherror.codelikeENOENT,EACCES, etc.). (nodejs.org)result.statusis not an exit code (there was no child exit) → typicallynull. (nodejs.org)result.signalis not a terminating signal → typicallynull. (nodejs.org)
(Example of spawn failure → result.error set: Error: spawnSync … ENOENT.) (2ality.com)
2) Child terminates due to SIGINT
If the child actually starts and then is terminated by SIGINT:
result.statusisnull(because termination was by signal, not a numeric exit code). (nodejs.org)result.signalis'SIGINT'. (nodejs.org)result.erroris not set just because the child died from a signal (Node documentserrorfor “failed or timed out”). (nodejs.org)
Citations:
- 1: https://nodejs.org/api/child_process.html
- 2: https://nodejs.org/api/child_process.html
- 3: https://nodejs.org/api/child_process.html
- 4: https://2ality.com/2022/07/nodejs-child-process.html
- 5: https://nodejs.org/api/child_process.html
- 6: https://nodejs.org/api/child_process.html
- 7: https://nodejs.org/api/child_process.html
🏁 Script executed:
# Find the function containing lines 327-340 in bin/nemoclaw.js
sed -n '300,350p' bin/nemoclaw.js | cat -nRepository: NVIDIA/NemoClaw
Length of output: 1890
🏁 Script executed:
# Search for the exitWithSpawnResult function
rg "exitWithSpawnResult" --context 3Repository: NVIDIA/NemoClaw
Length of output: 808
🏁 Script executed:
# Check if this is the 'logs' command mentioned in the review
rg "logs|spawnSync.*ssh" bin/nemoclaw.js --context 2Repository: NVIDIA/NemoClaw
Length of output: 1639
🏁 Script executed:
# Get the complete exitWithSpawnResult function
rg "function exitWithSpawnResult" -A 10 bin/nemoclaw.jsRepository: NVIDIA/NemoClaw
Length of output: 333
🏁 Script executed:
# Also check if there are any conditionals or alternative handlers for error/signal
sed -n '280,330p' bin/nemoclaw.js | head -60Repository: NVIDIA/NemoClaw
Length of output: 1621
Propagate SSH failures instead of returning success.
The sandboxLogs function currently returns success even when ssh exits non-zero, making the logs command unreliable in scripts and hiding real transport problems. After cleanup, route genuine failures through exitWithSpawnResult(result) and only suppress the expected Ctrl+C path in follow mode.
Suggested fix
const result = spawnSync("ssh", [
"-T", "-F", confPath,
"-o", "StrictHostKeyChecking=no",
"-o", "UserKnownHostsFile=/dev/null",
"-o", "LogLevel=ERROR",
`openshell-${sandboxName}`,
tailCmd,
], { stdio: "inherit" });
try { fs.unlinkSync(confPath); } catch {}
- if (result.status !== 0 && result.status !== null && !follow) {
- console.error(` Failed to stream logs (exit ${result.status})`);
- }
+ if (follow && result.signal === "SIGINT") {
+ return;
+ }
+ if (result.error || result.signal || result.status !== 0) {
+ exitWithSpawnResult(result);
+ }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@bin/nemoclaw.js` around lines 327 - 340, The function currently prints an
error but still returns success when ssh exits non-zero; replace that behavior
so genuine failures are propagated by calling exitWithSpawnResult(result).
Specifically, in the sandboxLogs code after spawning ssh (referencing the result
and follow variables), remove the console.error branch and instead: after
cleanup, if result.status !== 0 && result.status !== null then if follow &&
result.signal === 'SIGINT' simply return (suppress expected Ctrl+C in follow
mode), otherwise call exitWithSpawnResult(result) to propagate the failure; keep
calls to fs.unlinkSync(confPath) as-is.
|
Closing in favor of #436 which uses the correct |
Summary
nemoclaw logs --followfailing #424 —nemoclaw <name> logs --followwas callingopenshell sandbox logs, which doesn't exist in openshell v0.1.0sandboxLogsto use the SSH pattern (same astelegram-bridge.jsandtest-full-e2e.sh): verify sandbox viaopenshell sandbox get, get SSH config viaopenshell sandbox ssh-config, thenssh -F <config> openshell-<name> 'tail [-f] -n 50 /tmp/nemoclaw.log /tmp/openclaw.log'-fas short alias for--followTest plan
nemoclaw <name> logs— shows last 50 lines, exits 0nemoclaw <name> logs --follow— streams in real time, Ctrl+C exits cleanlynemoclaw <name> logs -f— same as --follownode --test test/cli.test.js— 7/7 passnemoclaw <name> statusandnemoclaw <name> connect— no regressionsSummary by CodeRabbit
New Features
-fand--followflags when fetching sandbox logs.Bug Fixes