fix(gateway): replace dead nvidia-device-plugin Helm repo URL #278
kagura-agent wants to merge 7 commits into NVIDIA:main from
Conversation
…iant Ollama reasoning models (e.g., nemotron-3-nano, deepseek-r1, qwq) put their output into a 'reasoning' field instead of 'content' in the OpenAI-compatible API response. OpenClaw reads 'content' only, so the agent's response appears blank.

This commit:
1. Adds reasoning model detection during Ollama onboarding (both the legacy JS wizard and the new TS onboard command). Known reasoning model patterns are checked against the selected model name.
2. When a reasoning model is detected, the onboarding flow automatically creates a non-reasoning chat variant using 'ollama create' with a simple Modelfile wrapper. This variant writes to 'content' normally.
3. Lists locally available Ollama models during onboarding instead of hardcoding 'nemotron-3-nano', letting users pick from their pulled models and see which ones are reasoning models.
4. Documents the limitation and workaround in the inference provider switching guide.

Fixes NVIDIA#246
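The detection in point 1 can be sketched roughly as follows. This is a hedged illustration: the pattern list, helper name, and the "-chat" suffix convention are inferred from the discussion in this PR, and the actual code may differ.

```typescript
// Hypothetical pattern list; the PR's real constant may contain more entries.
const KNOWN_REASONING_MODEL_PATTERNS: RegExp[] = [
  /^nemotron.*nano/i,
  /^deepseek-r1/i,
  /^qwq/i,
];

function isReasoningModel(name: string): boolean {
  const base = name.replace(/:.*$/, ""); // drop the ":tag" suffix, e.g. "deepseek-r1:7b"
  // Chat variants produced by onboarding are non-reasoning by construction,
  // since their Modelfile wrapper writes to 'content' normally.
  if (base.endsWith("-chat")) return false;
  return KNOWN_REASONING_MODEL_PATTERNS.some((p) => p.test(base));
}
```

Tag suffixes like `:7b` are stripped before matching, and names ending in `-chat` are treated as non-reasoning, which is also why the default-model conflict discussed later in this review matters.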
…board The NVIDIA k8s-device-plugin Helm chart repository at https://nvidia.github.io/k8s-device-plugin returns 404. This URL is hardcoded in the gateway container image's k3s manifest, causing GPU device plugin installation to fail on every GPU-enabled gateway startup.

This commit:
1. Adds scripts/fix-gpu-device-plugin.sh that downloads the chart tgz directly from the GitHub release and patches the HelmChart resource to use embedded chartContent (base64-encoded tgz), bypassing the dead Helm repo entirely.
2. Integrates the fix into the legacy onboard wizard (bin/lib/onboard.js): when a GPU is detected, the patch is applied automatically after gateway startup, before sandbox creation.
3. Documents the issue and manual workaround in the GPU deployment guide.

The fix works for both online and air-gapped deployments since the chart is embedded directly into the HelmChart resource.

Fixes NVIDIA#241
Note: Reviews paused. It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. This behavior can be configured in the CodeRabbit settings.
📝 Walkthrough

Detects Ollama "reasoning" models during onboarding, lists local Ollama models, can create non-reasoning chat variants, patches the GPU device-plugin HelmChart via a helper script, updates docs with troubleshooting, and adds tests for CLI connect/logs and onboard config/validation.
Sequence Diagram(s)

```mermaid
sequenceDiagram
  participant User
  participant Onboard as Onboard CLI
  participant Ollama as Ollama CLI/Daemon
  participant FS as Filesystem
  participant Logger
  User->>Onboard: start onboarding (interactive/non-interactive)
  Onboard->>Ollama: request list of local models
  Ollama-->>Onboard: returns model list (may include reasoning models)
  Onboard->>User: prompt or auto-select model
  alt selected model is reasoning
    Onboard->>FS: write temporary Modelfile for chat variant
    Onboard->>Ollama: `ollama create` chat-variant
    Ollama-->>Onboard: create success/failure
    Onboard->>Logger: log creation result and guidance
    Onboard->>Ollama: select chat-variant if created
  else non-reasoning model selected
    Onboard->>Ollama: select chosen model
  end
  Onboard->>Logger: surface final model choice and next steps
```

Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~45 minutes
🚥 Pre-merge checks: ✅ Passed checks (5 passed)
Actionable comments posted: 8
🧹 Nitpick comments (1)
bin/lib/onboard.js (1)
184-191: Consider scoping the GPU plugin patch to NVIDIA GPUs only.

The patch runs for any detected GPU (including Apple Silicon), but the NVIDIA device plugin is only relevant for NVIDIA GPUs. While `ignoreError: true` prevents failures, this adds unnecessary work on non-NVIDIA systems.

♻️ Proposed refinement

```diff
-  if (gpu) {
+  if (gpu && gpu.type === "nvidia") {
     console.log("  Patching GPU device plugin (workaround for dead Helm repo)...");
     run(`bash "${path.join(SCRIPTS, "fix-gpu-device-plugin.sh")}" 2>&1 || true`, { ignoreError: true });
   }
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@bin/lib/onboard.js` around lines 184 - 191, The patch currently runs whenever `gpu` is truthy, which can trigger on non-NVIDIA hardware; update the logic in onboard.js so the GPU device-plugin patch only runs for NVIDIA GPUs by checking for an NVIDIA vendor before invoking `run(...)`. Concretely, detect NVIDIA presence (e.g., test for `nvidia-smi` on PATH or inspect vendor via existing GPU detection code), and only call the `run(\`bash "${path.join(SCRIPTS, "fix-gpu-device-plugin.sh")}" ...\`)` when that NVIDIA check succeeds; keep the `gpu` guard but add the NVIDIA-specific condition so Apple Silicon or other GPUs skip the patch. Use the existing symbols `gpu`, `run`, `SCRIPTS`, and `"fix-gpu-device-plugin.sh"` to locate and modify the code.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@bin/lib/onboard.js`:
- Around line 318-320: The fallback assignment of model = "nemotron-3-nano"
conflicts with the KNOWN_REASONING_MODEL_PATTERNS (e.g., /^nemotron.*nano/i) so
the chosen default is inadvertently classified as a reasoning model; change the
fallback to a non-reasoning model name (or make it configurable) — update the
else branch that sets model to use a different default string that does not
match KNOWN_REASONING_MODEL_PATTERNS or add an explicit override/flag to skip
reasoning-model detection for that fallback (refer to the variable model and the
constant KNOWN_REASONING_MODEL_PATTERNS to locate the logic).
- Around line 55-68: The createChatVariant function builds a shell command by
interpolating modelfile and variantName, creating a command-injection risk;
change it to use child_process.execFileSync (or execFileSync) with explicit
arguments instead of a shell string, pass the Modelfile content via stdin (e.g.,
supply modelfile as the first argument to the spawned process's stdin), and call
ollama with ['create', variantName] so modelfile and variantName are never
shell-interpolated; ensure you preserve the same stdio and timeout options and
return variantName on success or null on caught errors.
In `@docs/inference/switch-inference-providers.md`:
- Around line 84-91: Update the variant name to match onboarding's generated
convention: replace instances of "nemotron-nano-chat" used in the ollama create
command and the openshell instruction with the onboarding-style name
"nemotron-3-nano-chat" so the example uses the same "<base-without-tag>-chat"
pattern; ensure the ollama create line (echo "FROM ...") and the openshell
inference set --model value both reference the new "nemotron-3-nano-chat"
variant.
In `@nemoclaw/src/commands/onboard.ts`:
- Around line 404-410: The code falls back to DEFAULT_MODELS when no models are
pulled, which provides hosted NVIDIA IDs that break local Ollama; change the
branch that builds modelOptions (where DEFAULT_MODELS is mapped and promptSelect
is called to set model) to detect a local Ollama endpoint (e.g., check your
Ollama config/endpoint for localhost or a local socket) and, if local, do NOT
use DEFAULT_MODELS; instead present either an empty/interactive choice that
instructs the user to pull local models or prompt for a manual model ID (use a
text prompt alternative to promptSelect) so model remains valid for local Ollama
rather than defaulting to hosted IDs. Ensure this logic lives alongside the
existing model variable assignment and promptSelect usage so callers of model
behave the same.
- Around line 57-66: The execSync call in createOllamaChatVariant is vulnerable
to command injection because it interpolates baseModel and variantName into a
shell string; change it to call the Ollama CLI without shell interpolation (use
spawn/execFile or execFileSync with argument arrays and pass the model content
via stdin) and/or validate/strictly sanitize baseModel and variantName before
use. Specifically, modify the execSync(`echo '${modelfile}' | ollama create
${variantName}`) usage in createOllamaChatVariant to use a non-shell API (e.g.,
spawn/execFile with args ["create", variantName] and write modelfile to the
child's stdin) and ensure baseModel/variantName are restricted to an expected
pattern before invoking the process.
In `@nemoclaw/src/onboard/config.test.ts`:
- Around line 55-57: The test's beforeEach currently calls vi.resetAllMocks(),
which also resets the custom node:fs mock implementation; change this to
vi.clearAllMocks() so you only clear call history and preserve mock
implementations—update the beforeEach that references vi.resetAllMocks() to call
vi.clearAllMocks() and keep the node:fs mock factory intact.
In `@nemoclaw/src/onboard/validate.test.ts`:
- Around line 39-41: The test defines a realistic-looking API key in the
constant apiKey which triggers secret scanners; replace that value with a
clearly synthetic placeholder (e.g., "test-api-key-0000") so tests still run but
do not resemble real secrets—update the apiKey constant in validate.test.ts (and
any direct references to endpoint/apiKey in the same test) to the synthetic
token.
In `@scripts/fix-gpu-device-plugin.sh`:
- Around line 28-39: The script currently ignores curl failures and swallows
kubectl patch errors; update the curl invocation for CHART_URL to use -f (curl
-fsSL "$CHART_URL" -o "$CHART_TGZ") and immediately check its exit status so the
script exits on download HTTP errors, ensure base64 encoding for CHART_TGZ sets
CHART_B64 but also fails the script if base64 returns non-zero, and remove the
"2>/dev/null || true" suppression from the openshell doctor exec -- kubectl
patch helmchart nvidia-device-plugin call so the patch failure surfaces (or
explicitly test its exit code and exit non-zero with an explanatory message)
rather than continuing with invalid chart content.
---
Nitpick comments:
In `@bin/lib/onboard.js`:
- Around line 184-191: The patch currently runs whenever `gpu` is truthy, which
can trigger on non-NVIDIA hardware; update the logic in onboard.js so the GPU
device-plugin patch only runs for NVIDIA GPUs by checking for an NVIDIA vendor
before invoking `run(...)`. Concretely, detect NVIDIA presence (e.g., test for
`nvidia-smi` on PATH or inspect vendor via existing GPU detection code), and
only call the `run(\`bash "${path.join(SCRIPTS, "fix-gpu-device-plugin.sh")}"
...\`)` when that NVIDIA check succeeds; keep the `gpu` guard but add the
NVIDIA-specific condition so Apple Silicon or other GPUs skip the patch. Use the
existing symbols `gpu`, `run`, `SCRIPTS`, and `"fix-gpu-device-plugin.sh"` to
locate and modify the code.
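The injection-safe pattern the prompts above call for (argument arrays plus stdin, no shell string) can be illustrated as below. `cat` stands in for `ollama` here so the sketch runs without an Ollama install; the commented-out real call shape is an assumption based on the review text, not verified against the PR.

```typescript
import { execFileSync } from "node:child_process";

// Sketch: arguments travel as an array and the Modelfile content goes over
// stdin via the `input` option, so neither baseModel nor variantName is ever
// interpreted by a shell.
function createVariantSafely(baseModel: string, variantName: string): string {
  const modelfile = `FROM ${baseModel}\n`;
  // Real call would be (hypothetical, per the review's suggestion):
  //   execFileSync("ollama", ["create", variantName], { input: modelfile });
  // `cat` simply echoes stdin so the round-trip is observable in a test.
  return execFileSync("cat", [], { input: modelfile, encoding: "utf-8" });
}
```

Because the variant name is a plain argv element rather than part of a shell command line, a hostile value like `x; rm -rf ~` stays inert.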
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: bd4deba8-9dea-44ab-b54a-edacc9e9580f
📒 Files selected for processing (9)
- bin/lib/onboard.js
- docs/deployment/deploy-to-remote-gpu.md
- docs/inference/switch-inference-providers.md
- nemoclaw/src/commands/connect.test.ts
- nemoclaw/src/commands/logs.test.ts
- nemoclaw/src/commands/onboard.ts
- nemoclaw/src/onboard/config.test.ts
- nemoclaw/src/onboard/validate.test.ts
- scripts/fix-gpu-device-plugin.sh
…cript
- Eliminate shell injection in createChatVariant (JS) and
createOllamaChatVariant (TS) by writing Modelfile to a temp file and
using execFileSync('ollama', ['create', name, '-f', tmpFile])
- Scope GPU device plugin patch to NVIDIA GPUs only (gpu.type === 'nvidia')
- Fix default model conflict: nemotron-3-nano matches reasoning patterns,
use nemotron-3-nano-chat variant as default with setup instructions
- Fix Ollama-with-no-models fallback in TS: suggest pulling local models
instead of offering hosted NVIDIA cloud model IDs
- Add fail-fast error handling in fix-gpu-device-plugin.sh: check download
success, verify non-empty file, report patch failures instead of swallowing
- Replace secret-shaped test token with synthetic placeholder to avoid
Gitleaks scanner blocks
- Switch vi.resetAllMocks() to vi.clearAllMocks() in test files to preserve
vi.mock() factory implementations
- Align docs variant name to nemotron-3-nano-chat (matches onboarding)
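The `vi.clearAllMocks()` vs `vi.resetAllMocks()` distinction behind the test-file change above can be shown with a small hand-rolled mock; the functions below mimic vitest's semantics but are local stand-ins, not the vitest API.

```typescript
type MockFn = { (): string; impl: (() => string) | null; calls: number };

function makeMock(impl: () => string): MockFn {
  const fn: MockFn = Object.assign(
    () => {
      fn.calls += 1;
      return fn.impl ? fn.impl() : "";
    },
    { impl: impl as (() => string) | null, calls: 0 },
  );
  return fn;
}

// clearAllMocks-style: wipe call history, keep implementations.
function clearAll(mocks: MockFn[]): void {
  for (const m of mocks) m.calls = 0;
}

// resetAllMocks-style: wipe history AND implementations. This is what
// silently emptied the node:fs mock factory in the tests.
function resetAll(mocks: MockFn[]): void {
  for (const m of mocks) {
    m.calls = 0;
    m.impl = null;
  }
}
```

After `clearAll` the mock still returns its stubbed value; after `resetAll` it returns the empty default, which is the failure mode the commit avoids.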
Actionable comments posted: 1
🧹 Nitpick comments (3)
nemoclaw/src/commands/onboard.ts (2)
416-421: Fallback to `llama3.2` assumes the model will be pulled later.

When no Ollama models are found, the code defaults to `llama3.2` and logs suggestions to pull a model. This is reasonable, but the user won't be prompted to actually pull the model before continuing. Consider whether onboarding should pause or provide clearer next steps.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@nemoclaw/src/commands/onboard.ts` around lines 416 - 421, The code in the onboarding flow sets model = "llama3.2" when no Ollama models are found (the else branch that logs suggestions), but never forces or prompts the user to actually pull a model; update the onboarding logic in the same function (where model is assigned) to instead halt progress and prompt the user to pull a model (or auto-run the pull if interactive), e.g., present a clear prompt/confirmation to run "ollama pull llama3.2" (or accept a different model name), validate the pull succeeded before assigning model and continuing, and surface any pull errors via logger so downstream code never proceeds with an unavailable model.
39-51: Consider using `execFileSync` instead of `execSync` with a shell command.

The `listOllamaModels` function shells out to `curl`. While this is safe (no user input interpolated), using `execFileSync` with explicit arguments would be more consistent with the security-conscious approach in `createOllamaChatVariant`.

♻️ Suggested refactor

```diff
 function listOllamaModels(): string[] {
   try {
-    const raw = execSync("curl -sf http://localhost:11434/api/tags", {
+    const raw = execFileSync("curl", ["-sf", "http://localhost:11434/api/tags"], {
       encoding: "utf-8",
       stdio: ["pipe", "pipe", "pipe"],
       timeout: 5000,
-    });
+    }).toString();
     const data = JSON.parse(raw) as { models?: Array<{ name: string }> };
     return (data.models ?? []).map((m) => m.name);
   } catch {
     return [];
   }
 }
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@nemoclaw/src/commands/onboard.ts` around lines 39 - 51, listOllamaModels currently calls execSync with a shell string that invokes curl; change it to use execFileSync to avoid shell interpretation (e.g., call execFileSync("curl", ["-sf", "http://localhost:11434/api/tags"], { encoding: "utf-8", stdio: ["pipe","pipe","pipe"], timeout: 5000 })) so the same options (encoding, stdio, timeout) are preserved but arguments are passed explicitly; update the call inside listOllamaModels and ensure any imports/types remain correct and error handling (the catch returning []) is kept; follow the same pattern used by createOllamaChatVariant for consistency.

scripts/fix-gpu-device-plugin.sh (1)
26-27: Use a unique temp file and cleanup trap for chart artifact.

The fixed `/tmp` filename is predictable and leaves artifacts behind. Prefer `mktemp` + `trap` cleanup.

♻️ Proposed refactor

```diff
-CHART_TGZ="/tmp/nvidia-device-plugin-${CHART_VERSION}.tgz"
+CHART_TGZ="$(mktemp -t nvidia-device-plugin-${CHART_VERSION}.XXXXXX.tgz)"
+trap 'rm -f "$CHART_TGZ"' EXIT
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@scripts/fix-gpu-device-plugin.sh` around lines 26 - 27, Replace the fixed /tmp filename used by CHART_TGZ with a secure temporary file created via mktemp and register a trap to remove that temp file on EXIT (and on signals) so the chart artifact is unique and always cleaned up; update the CHART_TGZ assignment and add a trap cleanup block referencing CHART_TGZ to ensure removal even on errors.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@scripts/fix-gpu-device-plugin.sh`:
- Around line 52-59: The current success check only inspects .items[0] and exits
when that single pod is Running; change the loop to inspect all pods in the
nvidia-device-plugin namespace (use kubectl get pods -n nvidia-device-plugin -o
jsonpath='{.items[*].status.phase}' via openshell doctor exec) and require at
least one pod in Running AND verify node GPU allocatable by querying nodes
(kubectl get nodes -o jsonpath='{range
.items[*]}{.metadata.name}{"\t"}{.status.allocatable.nvidia\.com/gpu}{"\n"}{end}')
and only return success when a non-zero nvidia.com/gpu allocatable is present on
at least one node; keep retry loop/backoff logic and update the success message
to reflect both checks.
---
Nitpick comments:
In `@nemoclaw/src/commands/onboard.ts`:
- Around line 416-421: The code in the onboarding flow sets model = "llama3.2"
when no Ollama models are found (the else branch that logs suggestions), but
never forces or prompts the user to actually pull a model; update the onboarding
logic in the same function (where model is assigned) to instead halt progress
and prompt the user to pull a model (or auto-run the pull if interactive), e.g.,
present a clear prompt/confirmation to run "ollama pull llama3.2" (or accept a
different model name), validate the pull succeeded before assigning model and
continuing, and surface any pull errors via logger so downstream code never
proceeds with an unavailable model.
- Around line 39-51: listOllamaModels currently calls execSync with a shell
string that invokes curl; change it to use execFileSync to avoid shell
interpretation (e.g., call execFileSync("curl", ["-sf",
"http://localhost:11434/api/tags"], { encoding: "utf-8", stdio:
["pipe","pipe","pipe"], timeout: 5000 })) so the same options (encoding, stdio,
timeout) are preserved but arguments are passed explicitly; update the call
inside listOllamaModels and ensure any imports/types remain correct and error
handling (the catch returning []) is kept; follow the same pattern used by
createOllamaChatVariant for consistency.
In `@scripts/fix-gpu-device-plugin.sh`:
- Around line 26-27: Replace the fixed /tmp filename used by CHART_TGZ with a
secure temporary file created via mktemp and register a trap to remove that temp
file on EXIT (and on signals) so the chart artifact is unique and always cleaned
up; update the CHART_TGZ assignment and add a trap cleanup block referencing
CHART_TGZ to ensure removal even on errors.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 16b70437-a4c4-4e69-b526-ff46b3b197bc
📒 Files selected for processing (6)
- bin/lib/onboard.js
- docs/inference/switch-inference-providers.md
- nemoclaw/src/commands/onboard.ts
- nemoclaw/src/onboard/config.test.ts
- nemoclaw/src/onboard/validate.test.ts
- scripts/fix-gpu-device-plugin.sh
🚧 Files skipped from review as they are similar to previous changes (1)
- nemoclaw/src/onboard/config.test.ts
```shell
# Wait for device plugin to come up
echo "Waiting for device plugin pods..."
for i in $(seq 1 30); do
  status=$(openshell doctor exec -- kubectl get pods -n nvidia-device-plugin -o jsonpath='{.items[0].status.phase}' 2>/dev/null || echo "Pending")
  if [ "$status" = "Running" ]; then
    echo "✓ nvidia-device-plugin is running"
    exit 0
  fi
```
Success check is too weak: one Running pod does not prove GPU allocatable recovery.
Current logic checks only the first pod (.items[0]) and exits success as soon as that one is Running. This can miss real states in multi-node setups and can still report success before nvidia.com/gpu is allocatable on nodes.
🔧 Proposed fix

```diff
 echo "Waiting for device plugin pods..."
 for i in $(seq 1 30); do
-  status=$(openshell doctor exec -- kubectl get pods -n nvidia-device-plugin -o jsonpath='{.items[0].status.phase}' 2>/dev/null || echo "Pending")
-  if [ "$status" = "Running" ]; then
-    echo "✓ nvidia-device-plugin is running"
-    exit 0
-  fi
+  running_count=$(
+    openshell doctor exec -- kubectl get pods -n nvidia-device-plugin \
+      -o jsonpath='{range .items[*]}{.status.phase}{"\n"}{end}' 2>/dev/null \
+      | grep -c '^Running$' || true
+  )
+  if [ "$running_count" -gt 0 ]; then
+    gpu_nodes=$(
+      openshell doctor exec -- kubectl get nodes \
+        -o jsonpath='{range .items[*]}{.status.allocatable.nvidia\.com/gpu}{"\n"}{end}' 2>/dev/null \
+        | awk '$1 != "" && $1 != "0" {count++} END {print count+0}'
+    )
+    if [ "$gpu_nodes" -gt 0 ]; then
+      echo "✓ nvidia-device-plugin is running and GPUs are allocatable on ${gpu_nodes} node(s)"
+      exit 0
+    fi
+  fi
   sleep 2
 done
```

```bash
#!/usr/bin/env bash
set -euo pipefail
echo "Pods in nvidia-device-plugin namespace:"
openshell doctor exec -- kubectl get pods -n nvidia-device-plugin \
  -o custom-columns=NAME:.metadata.name,PHASE:.status.phase --no-headers
echo
echo "Allocatable GPUs per node:"
openshell doctor exec -- kubectl get nodes \
  -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.status.allocatable.nvidia\.com/gpu}{"\n"}{end}'
```

Expected verification result: at least one pod in Running phase and at least one node with non-zero nvidia.com/gpu allocatable.
🧰 Tools
🪛 Shellcheck (0.11.0)
[warning] 54-54: i appears unused. Verify use (or export if used externally).
(SC2034)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@scripts/fix-gpu-device-plugin.sh` around lines 52 - 59, The current success
check only inspects .items[0] and exits when that single pod is Running; change
the loop to inspect all pods in the nvidia-device-plugin namespace (use kubectl
get pods -n nvidia-device-plugin -o jsonpath='{.items[*].status.phase}' via
openshell doctor exec) and require at least one pod in Running AND verify node
GPU allocatable by querying nodes (kubectl get nodes -o jsonpath='{range
.items[*]}{.metadata.name}{"\t"}{.status.allocatable.nvidia\.com/gpu}{"\n"}{end}')
and only return success when a non-zero nvidia.com/gpu allocatable is present on
at least one node; keep retry loop/backoff logic and update the success message
to reflect both checks.
…lback warnings
- Add early return in isReasoningModel() to skip names ending with '-chat'
- Add user guidance in experimental no-models fallback
- Log warning on non-interactive variant creation failure
🧹 Nitpick comments (1)
nemoclaw/src/commands/onboard.ts (1)
18-89: Consider extracting shared logic to avoid duplication.

The reasoning model patterns and helper functions (`isReasoningModel`, `listOllamaModels`, `createOllamaChatVariant`) are duplicated between this file and `bin/lib/onboard.js`. If these files share a runtime context, consider extracting to a shared module to keep the patterns in sync.

This is a minor observation and can be deferred if the duplication is intentional due to different build/runtime contexts.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@nemoclaw/src/commands/onboard.ts` around lines 18 - 89, Duplicate Ollama helper logic (KNOWN_REASONING_MODEL_PATTERNS, isReasoningModel, listOllamaModels, createOllamaChatVariant) exists in this file and bin/lib/onboard.js; extract these into a single shared module (e.g., ollamaHelpers) exporting the patterns and functions, update this file to import those exports instead of declaring them locally, and update bin/lib/onboard.js to import the same module; ensure exported types/signatures match (string[] return for listOllamaModels, PluginLogger param for createOllamaChatVariant) and that any runtime/build differences (CommonJS vs ESM) are handled by adjusting the import style or build config so both runtimes can consume the shared module.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@nemoclaw/src/commands/onboard.ts`:
- Around line 18-89: Duplicate Ollama helper logic
(KNOWN_REASONING_MODEL_PATTERNS, isReasoningModel, listOllamaModels,
createOllamaChatVariant) exists in this file and bin/lib/onboard.js; extract
these into a single shared module (e.g., ollamaHelpers) exporting the patterns
and functions, update this file to import those exports instead of declaring
them locally, and update bin/lib/onboard.js to import the same module; ensure
exported types/signatures match (string[] return for listOllamaModels,
PluginLogger param for createOllamaChatVariant) and that any runtime/build
differences (CommonJS vs ESM) are handled by adjusting the import style or build
config so both runtimes can consume the shared module.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: a5b35224-827c-4d7b-9a9e-ab9a9caa3970
📒 Files selected for processing (2)
- bin/lib/onboard.js
- nemoclaw/src/commands/onboard.ts
…ktemp
- Align TS fallback model from llama3.2 to nemotron-3-nano-chat for
consistency with JS implementation (prevents reasoning-model blank responses)
- Replace execSync('curl ...') with execFileSync('curl', [...]) in
listOllamaModels to avoid shell interpolation
- Use mktemp + trap in fix-gpu-device-plugin.sh for secure temp file handling
Addresses CodeRabbit review comments on PR NVIDIA#278.
Actionable comments posted: 1
♻️ Duplicate comments (1)
nemoclaw/src/commands/onboard.ts (1)
418-427: ⚠️ Potential issue | 🟠 Major

Avoid silently selecting a model that may not exist locally.

When no local models are found, setting `model = "nemotron-3-nano-chat"` can still produce a broken saved config and failed inference routing. Prefer prompting for an explicit local model ID (or aborting with instructions) instead of auto-selecting a likely-missing variant.

🛠️ Suggested adjustment

```diff
 } else {
-  // No models pulled yet — default to nemotron-3-nano-chat (non-reasoning variant)
-  // to avoid blank responses. The user still needs to pull the base model and
-  // create the variant after onboarding.
-  model = "nemotron-3-nano-chat";
-  logger.warn("No models found in local Ollama. Defaulting to nemotron-3-nano-chat.");
-  logger.warn("After onboarding, run:");
-  logger.warn("  ollama pull nemotron-3-nano");
-  logger.warn("  echo 'FROM nemotron-3-nano' | ollama create nemotron-3-nano-chat");
+  logger.warn("No models found in local Ollama.");
+  logger.warn("Run `ollama pull <model>` first, then enter a local model ID.");
+  model = await promptInput("Enter a local Ollama model ID");
 }
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@nemoclaw/src/commands/onboard.ts` around lines 418 - 427, The code currently assigns model = "nemotron-3-nano-chat" in the no-local-models branch which can create an invalid saved config; instead, change this branch to abort onboarding (throw or return an error) and log clear, actionable instructions for the user to pull/create a local model (use the existing logger.warn messages as guidance), do not set model to a default value, and ensure the calling code does not write or save a config when model is unset; locate the assignment to model and the surrounding else branch and replace it with an error/exit path that prevents saving a broken model ID.
🧹 Nitpick comments (2)
scripts/fix-gpu-device-plugin.sh (1)
55-55: Unused loop variable `i`.

The loop counter `i` is only used to limit iterations, not within the loop body. You could suppress the shellcheck warning by prefixing with `_` or adding a directive.

Optional fix

```diff
-for i in $(seq 1 30); do
+for _ in $(seq 1 30); do
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@scripts/fix-gpu-device-plugin.sh` at line 55, The for-loop uses an unused loop variable `i` (for i in $(seq 1 30); do); change the loop to use a deliberately ignored variable (e.g., `_` instead of `i`) or add a shellcheck directive to suppress the warning so the counter only limits iterations without triggering an unused-variable warning; update the loop declaration that currently uses `i` accordingly.

nemoclaw/src/commands/onboard.ts (1)
41-47: Prefer Node HTTP APIs over spawning `curl` for model listing.

Spawning `curl` adds an external binary dependency and can fail even when Node networking works. Using `fetch`/`http` here would reduce platform coupling and simplify error handling.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@nemoclaw/src/commands/onboard.ts` around lines 41 - 47, The function listOllamaModels currently shells out to curl via execFileSync; replace that with a native Node HTTP/fetch call (use global fetch or node: http/https) to GET http://localhost:11434/api/tags inside listOllamaModels, handle JSON parsing, network errors and timeouts via AbortController, and return the same string[] result; ensure you propagate/log errors similarly to the current try/catch and remove the execFileSync dependency and associated stdio/timeout options.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@nemoclaw/src/commands/onboard.ts`:
- Around line 67-68: Replace the predictable tmp path and non-exclusive writes
for Modelfile with a secure temp directory: use mkdtempSync (e.g.,
mkdtempSync(join(tmpdir(), 'nemoclaw-'))) to create a cryptographically-random
temp directory, join that directory with a stable filename instead of Date.now()
(the variable tmpFile), and write the file using writeFileSync(tmpFile,
modelfile, "utf-8" | with the "wx" flag to ensure exclusive create). After
processing, remove the temp directory atomically with rmSync(tempDir, {
recursive: true, force: true }). Apply the same change to the other occurrence
around the write in the same file (lines referenced in the review).
---
Duplicate comments:
In `@nemoclaw/src/commands/onboard.ts`:
- Around line 418-427: The code currently assigns model = "nemotron-3-nano-chat"
in the no-local-models branch which can create an invalid saved config; instead,
change this branch to abort onboarding (throw or return an error) and log clear,
actionable instructions for the user to pull/create a local model (use the
existing logger.warn messages as guidance), do not set model to a default value,
and ensure the calling code does not write or save a config when model is unset;
locate the assignment to model and the surrounding else branch and replace it
with an error/exit path that prevents saving a broken model ID.
---
Nitpick comments:
In `@nemoclaw/src/commands/onboard.ts`:
- Around line 41-47: The function listOllamaModels currently shells out to curl
via execFileSync; replace that with a native Node HTTP/fetch call (use global
fetch or node: http/https) to GET http://localhost:11434/api/tags inside
listOllamaModels, handle JSON parsing, network errors and timeouts via
AbortController, and return the same string[] result; ensure you propagate/log
errors similarly to the current try/catch and remove the execFileSync dependency
and associated stdio/timeout options.
In `@scripts/fix-gpu-device-plugin.sh`:
- Line 55: The for-loop uses an unused loop variable `i` (for i in $(seq 1 30);
do); change the loop to use a deliberately ignored variable (e.g., `_` instead
of `i`) or add a shellcheck directive to suppress the warning so the counter
only limits iterations without triggering an unused-variable warning; update the
loop declaration that currently uses `i` accordingly.
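The curl-free approach suggested for `listOllamaModels` could look roughly like this. The endpoint and response shape follow Ollama's `/api/tags` route as described in the review; `parseTags` is a hypothetical helper split out so the JSON handling is testable without a running daemon.

```typescript
// Parse the /api/tags response body into a list of model names.
function parseTags(raw: string): string[] {
  const data = JSON.parse(raw) as { models?: Array<{ name: string }> };
  return (data.models ?? []).map((m) => m.name);
}

// Sketch: native fetch (Node 18+) with an AbortController timeout replacing
// the execFileSync("curl", ...) call. Returns [] on any failure, matching the
// original error handling.
async function listOllamaModels(): Promise<string[]> {
  const ctl = new AbortController();
  const timer = setTimeout(() => ctl.abort(), 5000);
  try {
    const res = await fetch("http://localhost:11434/api/tags", { signal: ctl.signal });
    if (!res.ok) return [];
    return parseTags(await res.text());
  } catch {
    return []; // daemon not running, timeout, or bad JSON
  } finally {
    clearTimeout(timer);
  }
}
```

This removes the external `curl` dependency while preserving the same 5-second timeout and empty-list fallback.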
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 293bf11f-88ad-49f6-a0ce-d0a19e6fee7b
📒 Files selected for processing (2)
- nemoclaw/src/commands/onboard.ts
- scripts/fix-gpu-device-plugin.sh
const tmpFile = join(tmpdir(), `nemoclaw-modelfile-${Date.now()}`);
writeFileSync(tmpFile, modelfile, "utf-8");
🧩 Analysis chain
🏁 Script executed:
cat -n nemoclaw/src/commands/onboard.ts | head -100
Repository: NVIDIA/NemoClaw
Length of output: 4025
🏁 Script executed:
cat nemoclaw/package.json | grep -A 5 '"engines"'
Repository: NVIDIA/NemoClaw
Length of output: 154
Use a securely created temporary Modelfile path.
Lines 67–68 use a predictable filename (Date.now()) and non-exclusive write, creating a symlink/overwrite window on shared systems. Use mkdtempSync to create a temporary directory with cryptographic randomness and write exclusively using the "wx" flag. Clean up the directory atomically with rmSync.
Proposed fix
-import { writeFileSync, unlinkSync } from "node:fs";
+import { mkdtempSync, rmSync, writeFileSync } from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";
@@
- const tmpFile = join(tmpdir(), `nemoclaw-modelfile-${Date.now()}`);
- writeFileSync(tmpFile, modelfile, "utf-8");
+ const tmpDir = mkdtempSync(join(tmpdir(), "nemoclaw-modelfile-"));
+ const tmpFile = join(tmpDir, "Modelfile");
+ writeFileSync(tmpFile, modelfile, { encoding: "utf-8", flag: "wx", mode: 0o600 });
@@
- unlinkSync(tmpFile);
+ rmSync(tmpDir, { recursive: true, force: true });
} catch {
// ignore cleanup errors
}

Also applies to lines 75–80.
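Put together, the recommended pattern looks roughly like this (the `withSecureTempFile` helper name and the Modelfile contents are illustrative, not code from onboard.ts):

```typescript
import { mkdtempSync, rmSync, writeFileSync } from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";

// Sketch of the secure temp-file pattern proposed above.
export function withSecureTempFile(contents: string, use: (path: string) => void): void {
  // mkdtempSync appends cryptographically random characters to the prefix
  const tmpDir = mkdtempSync(join(tmpdir(), "nemoclaw-modelfile-"));
  const tmpFile = join(tmpDir, "Modelfile");
  try {
    // "wx" fails if the path already exists, closing the symlink/overwrite window
    writeFileSync(tmpFile, contents, { encoding: "utf-8", flag: "wx", mode: 0o600 });
    use(tmpFile);
  } finally {
    // Clean up the whole directory; force ignores an already-removed path.
    rmSync(tmpDir, { recursive: true, force: true });
  }
}
```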
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@nemoclaw/src/commands/onboard.ts` around lines 67 - 68, Replace the
predictable tmp path and non-exclusive writes for Modelfile with a secure temp
directory: use mkdtempSync (e.g., mkdtempSync(join(tmpdir(), 'nemoclaw-'))) to
create a cryptographically-random temp directory, join that directory with a
stable filename instead of Date.now() (the variable tmpFile), and write the file
using writeFileSync(tmpFile, modelfile, { encoding: "utf-8", flag: "wx" }) so the
"wx" flag ensures exclusive creation. After processing, remove the temp directory atomically with
rmSync(tempDir, { recursive: true, force: true }). Apply the same change to the
other occurrence around the write in the same file (lines referenced in the
review).
…l, fix lint

- Use mkdtempSync for cryptographically random temp directories instead of predictable Date.now()-based paths (TOCTOU fix)
- Prompt user for model ID when no local Ollama models found instead of silently defaulting to potentially non-existent model
- Fix unused loop variable in fix-gpu-device-plugin.sh
🧹 Nitpick comments (1)
nemoclaw/src/commands/onboard.ts (1)
18-36: Consider extracting Ollama helpers to a shared module.
KNOWN_REASONING_MODEL_PATTERNS, isReasoningModel, listOllamaModels, and createOllamaChatVariant are duplicated nearly identically in bin/lib/onboard.js. If these need to stay in sync, consider extracting them to a shared module to avoid maintenance drift.
♻️ Suggested structure
Create a shared module (e.g., shared/ollama-helpers.js or a TypeScript equivalent) that exports these functions, and import them in both files. This ensures any future pattern additions or bug fixes apply consistently.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@nemoclaw/src/commands/onboard.ts` around lines 18 - 36, The duplicated Ollama helper logic (KNOWN_REASONING_MODEL_PATTERNS, isReasoningModel, listOllamaModels, createOllamaChatVariant) should be moved into a single shared module and imported where needed; create a new module that exports those symbols, replace the local copies with imports, and update callers to use the shared exports so future changes (pattern updates or bug fixes) are centralized and no longer duplicated across onboard.ts and the bin/lib variant.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@nemoclaw/src/commands/onboard.ts`:
- Around line 18-36: The duplicated Ollama helper logic
(KNOWN_REASONING_MODEL_PATTERNS, isReasoningModel, listOllamaModels,
createOllamaChatVariant) should be moved into a single shared module and
imported where needed; create a new module that exports those symbols, replace
the local copies with imports, and update callers to use the shared exports so
future changes (pattern updates or bug fixes) are centralized and no longer
duplicated across onboard.ts and the bin/lib variant.
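A possible shape for the shared module (the file name and the specific patterns are assumptions, inferred from the reasoning models this PR names):

```typescript
// Hypothetical shared module, e.g. shared/ollama-helpers.ts, consolidating the
// helpers duplicated between onboard.ts and bin/lib/onboard.js.
export const KNOWN_REASONING_MODEL_PATTERNS: RegExp[] = [
  /deepseek-r1/i,
  /qwq/i,
  /nemotron.*nano/i, // illustrative; actual pattern list lives in the PR
];

// True when the model name matches any known reasoning-model pattern.
export function isReasoningModel(model: string): boolean {
  return KNOWN_REASONING_MODEL_PATTERNS.some((p) => p.test(model));
}
```

Both the TS command and the legacy JS wizard would then import from this one place, so new patterns land in both flows automatically.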
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 0d123abb-eb8a-445d-87ca-6d10796fb9c9
📒 Files selected for processing (3)
- bin/lib/onboard.js
- nemoclaw/src/commands/onboard.ts
- scripts/fix-gpu-device-plugin.sh
🚧 Files skipped from review as they are similar to previous changes (1)
- scripts/fix-gpu-device-plugin.sh
Closing — this PR overlaps with #279 (test files) and bundles unrelated changes (GPU device plugin, docs, onboard). I'll submit focused PRs for individual fixes.
…dundant cluster rebuild (NVIDIA#278)
Fixes #241
Automated PR via GoGetAJob
Summary by CodeRabbit
New Features
Bug Fixes
Documentation
Tests