feat: recommend optimal NIM model based on detected GPU VRAM #537
brianwtaylor wants to merge 2 commits into NVIDIA:main
Conversation
📝 Walkthrough
Changes
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~45 minutes
🚥 Pre-merge checks | ✅ 5 passed checks (5 passed)
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (3)
bin/lib/nim.js (3)
135-137: ⚠️ Potential issue | 🟠 Major — Avoid `process.exit(1)` in library functions.
Hard-exiting here can terminate parent CLI/test processes unexpectedly. Throw an error and let callers decide handling.
🛠️ Proposed fix
```diff
 if (!image) {
-  console.error(` Unknown model: ${model}`);
-  process.exit(1);
+  throw new Error(`Unknown model: ${model}`);
 }
```
Also applies to: 148-150
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@bin/lib/nim.js` around lines 135 - 137, In bin/lib/nim.js replace the hard process.exit(1) calls in the "Unknown model" handling (the console.error(` Unknown model: ${model}`) branches at the locations around lines showing that message, including the second occurrence near lines 148-150) with throwing an Error (e.g. throw new Error(`Unknown model: ${model}`)) so the library surface returns an exception instead of terminating the process; ensure callers of the functions that perform this model validation are prepared to catch or propagate the error.
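The throw-at-the-library, exit-at-the-CLI pattern the fix describes can be sketched as follows. Note this is illustrative only: `IMAGES`, `getImageForModel`, and `cliMain` are hypothetical stand-ins, not the actual symbols in `bin/lib/nim.js`.

```javascript
// Hypothetical catalog; the real module resolves images differently.
const IMAGES = { "llama-3.1-8b": "nvcr.io/nim/meta/llama-3.1-8b-instruct" };

// Library function: throws instead of calling process.exit(1), so test
// runners and parent processes are never killed mid-run.
function getImageForModel(model) {
  const image = IMAGES[model];
  if (!image) {
    throw new Error(`Unknown model: ${model}`);
  }
  return image;
}

// CLI boundary: the only place that decides the process exit code.
function cliMain(model) {
  try {
    return getImageForModel(model);
  } catch (err) {
    console.error(err.message);
    process.exitCode = 1; // defer exit so pending I/O is not truncated
    return null;
  }
}
```

Setting `process.exitCode` instead of calling `process.exit()` also avoids cutting off buffered stdout/stderr.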
170-172: ⚠️ Potential issue | 🟠 Major — Add per-request timeout to the health probe curl.
The loop timeout does not help if one `curl` invocation hangs; this can block the polling loop indefinitely.
⏱️ Proposed fix
```diff
-  const result = runCapture(`curl -sf http://localhost:${port}/v1/models`, {
+  const result = runCapture(
+    `curl -sf --connect-timeout 2 --max-time 3 http://localhost:${port}/v1/models`,
+    {
       ignoreError: true,
-  });
+    }
+  );
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@bin/lib/nim.js` around lines 170 - 172, The health-check curl call can hang indefinitely; update the invocation that builds the command passed to runCapture (the line using runCapture(`curl -sf http://localhost:${port}/v1/models`, { ignoreError: true })) to include per-request curl timeouts (e.g. add --max-time <seconds> and optionally --connect-timeout <seconds>) so each curl returns within a bounded time; keep the call signature to runCapture and preserve ignoreError: true while tuning timeout values appropriate for the probe.
193-206: ⚠️ Potential issue | 🟠 Major — `nimStatus` health check is pinned to port 8000.
`startNimContainer` supports custom ports, but `nimStatus` always probes `localhost:8000`, which can report false unhealthy states.
🔧 Proposed fix
```diff
-/** @param {string} sandboxName @returns {{running: boolean, healthy?: boolean, container: string, state?: string}} */
-function nimStatus(sandboxName) {
+/** @param {string} sandboxName @param {number} [port=8000] @returns {{running: boolean, healthy?: boolean, container: string, state?: string}} */
+function nimStatus(sandboxName, port = 8000) {
 @@
-  const health = runCapture(`curl -sf http://localhost:8000/v1/models 2>/dev/null`, {
+  const health = runCapture(`curl -sf http://localhost:${port}/v1/models 2>/dev/null`, {
     ignoreError: true,
   });
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@bin/lib/nim.js` around lines 193 - 206, nimStatus currently probes localhost:8000 regardless of actual mapping, causing false unhealthy reports; modify nimStatus (which uses containerName and runCapture) to determine the host port mapped to the container's internal port instead of hardcoding 8000 — e.g., after confirming the container is running, call docker inspect (via runCapture) to read NetworkSettings.Ports for the container returned by containerName(sandboxName) to extract the HostPort for the container's service and then use that host port in the curl health check; ensure this logic handles unmapped ports and falls back to reporting running:false or a safe default consistent with startNimContainer's port configuration.
🧹 Nitpick comments (1)
test/nim.test.js (1)
160-175: Add one assertion for `cores` omission when unavailable.
Current tests verify presence of `cores`, but not the updated contract that the property is omitted (not `null`) when absent.
✅ Suggested test addition
```diff
+  it("omits cores when system_profiler does not report it", () => {
+    const gpu = nim.detectGpu({
+      platform: "darwin",
+      runCapture: mockRunCapture([
+        ["memory.total", new Error("no nvidia-smi")],
+        ["query-gpu=name", new Error("no nvidia-smi")],
+        ["system_profiler", "Chipset Model: Apple M4\n VRAM (Total): 8 GB"],
+      ]),
+    });
+    assert.equal(gpu.type, "apple");
+    assert.equal("cores" in gpu, false);
+  });
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@test/nim.test.js` around lines 160 - 175, Add a test asserting that when Apple GPU detection runs without a "Total Number of Cores" value, the returned object omits the cores property (i.e., gpu.cores === undefined, not null). Update or add a test using nim.detectGpu with platform "darwin" and a runCapture response for "system_profiler" that lacks the "Total Number of Cores" line, then include an assertion that gpu.cores is undefined (reference nim.detectGpu and the test in test/nim.test.js).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@bin/lib/nim.js`:
- Around line 10-11: containerName currently interpolates arbitrary sandboxName
into shell strings, creating a command-injection risk; update the
containerName(sandboxName) function to validate and sanitize sandboxName (e.g.,
allow only a strict whitelist such as [a-z0-9-_], or percent-encode/hex-encode
input) and return a safe deterministic string, and/or throw an error on invalid
characters; also ensure callers that build shell commands use the sanitized
containerName (or proper argument-passing APIs) rather than raw sandboxName to
eliminate injection vectors.
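A whitelist-based validation along the lines the comment describes might look like this. The regex and the `nim-` prefix are assumptions for illustration; the real `containerName` in `bin/lib/nim.js` may use a different character set or naming scheme.

```javascript
// Sketch: reject any sandbox name that could smuggle shell metacharacters
// into an interpolated docker command.
const SAFE_NAME = /^[a-z0-9_-]+$/i;

function containerName(sandboxName) {
  if (typeof sandboxName !== "string" || !SAFE_NAME.test(sandboxName)) {
    throw new Error(`Invalid sandbox name: ${JSON.stringify(sandboxName)}`);
  }
  // Safe to interpolate: the whitelist excludes quotes, spaces, $, ;, |, etc.
  return `nim-${sandboxName.toLowerCase()}`;
}
```

Validating once at the naming boundary means every downstream shell string built from the result is injection-safe, without auditing each call site.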
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 061be972-e91f-44a2-a548-fa5c9eb03b62
📒 Files selected for processing (2)
bin/lib/nim.js, test/nim.test.js
Force-pushed from 656eba1 to 2741afd (Compare)
🧹 Nitpick comments (1)
test/nim.test.js (1)
145-176: Minor inconsistency in mock patterns between similar test cases.
Lines 150 and 166 use different patterns (`"name"` vs `"query-gpu=name"`) for simulating nvidia-smi failure on macOS. Both work since they throw errors, but using consistent patterns would improve test readability.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@test/nim.test.js` around lines 145 - 176, The two tests calling nim.detectGpu use inconsistent runCapture mock keys ("name" in the first test and "query-gpu=name" in the second) which reduces readability; update one so both use the same mock pattern (either both "name" or both "query-gpu=name") when simulating nvidia-smi failures, e.g., change the first test's mockRunCapture entry to match the second, and keep references to nim.detectGpu and mockRunCapture unchanged.
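A `mockRunCapture` of the shape these tests imply — matching commands by substring and returning canned output or throwing — might look like this. This is a sketch; the actual helper in `test/nim.test.js` may be implemented differently.

```javascript
// Sketch: build a fake runCapture from [pattern, result] pairs. The first
// pattern found as a substring of the command wins; an Error result is thrown
// to simulate a failing command (e.g., nvidia-smi missing on macOS).
function mockRunCapture(entries) {
  return (cmd) => {
    for (const [pattern, result] of entries) {
      if (cmd.includes(pattern)) {
        if (result instanceof Error) throw result;
        return result;
      }
    }
    throw new Error(`Unexpected command: ${cmd}`);
  };
}
```

With substring matching, `"name"` and `"query-gpu=name"` both match `nvidia-smi --query-gpu=name`, which is why the inconsistency is cosmetic rather than behavioral.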
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: f9be487f-c563-4dc7-8655-401ae66daba7
📒 Files selected for processing (3)
bin/lib/nim.js, bin/lib/onboard.js, test/nim.test.js
🚧 Files skipped from review as they are similar to previous changes (1)
- bin/lib/nim.js
Addressed all CodeRabbit findings in the latest push:
- Inline (critical):
- Outside-diff fixes (all 3):
- Nitpicks (both):
🧹 Nitpick comments (4)
bin/lib/nim.js (4)
201-218: Unused `interval` variable.
`const interval = 5000;` on line 201 is declared but never used. The sleep duration is hardcoded to `"5"` at line 218. Consider using the variable or removing it.
♻️ Use interval variable
```diff
   const interval = 5000;
   console.log(` Waiting for NIM health on port ${port} (timeout: ${timeout}s)...`);
   while ((Date.now() - start) / 1000 < timeout) {
     try {
       const result = runCapture(
         `curl -sf --connect-timeout 2 --max-time 3 http://localhost:${port}/v1/models`,
         {
           ignoreError: true,
         }
       );
       if (result) {
         console.log(" NIM is healthy.");
         return true;
       }
     } catch {}
     // Synchronous sleep via spawnSync
-    require("child_process").spawnSync("sleep", ["5"]);
+    require("child_process").spawnSync("sleep", [String(interval / 1000)]);
   }
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@bin/lib/nim.js` around lines 201 - 218, The local const interval declared near the NIM health loop is unused; either remove it or use it for the sleep call and any messaging. Fix by replacing the hardcoded sleep string in the loop (the spawnSync("sleep", ["5"]) call) with a value derived from interval (e.g., spawnSync("sleep", [String(interval / 1000)]) or otherwise convert interval to seconds), and optionally use interval in the initial status message; the relevant symbols to update are the interval constant, the spawnSync sleep invocation, and the surrounding NIM health loop that uses start and runCapture.
160-166: No model recommended when all fitting models exceed 90% VRAM.
If all models that fit within `totalMemoryMB` require more than 90% of VRAM, no model is marked `recommended: true`. This may be intentional (conservative), but could leave users without guidance when a model would technically work. Consider whether to fall back to recommending the smallest fitting model, or document this behavior so callers can handle the no-recommendation case.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@bin/lib/nim.js` around lines 160 - 166, The code currently sets recommended based on threshold = vram * 0.9 and may leave all models un-recommended if every fitting model exceeds 90% VRAM; update the logic in the block that iterates fits (the fits.map using variables threshold, recommended and m.minGpuMemoryMB) so that if after the map no model has recommended === true you mark the single fitting model with the smallest minGpuMemoryMB as recommended (i.e., find the min by m.minGpuMemoryMB among fits and set its recommended flag to true), or alternatively document this behavior for callers if you prefer to keep conservative behavior.
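The threshold-plus-fallback behavior the prompt asks for can be sketched as below. The `minGpuMemoryMB` field name follows the review; the function name and catalog values are illustrative, not the actual `suggestModelsForGpu` implementation.

```javascript
// Sketch: keep models that fit in VRAM (sorted largest-first), recommend the
// largest one at or under 90% of VRAM, and fall back to the smallest fitting
// model when every fit exceeds the 90% safety margin.
function suggestModels(models, totalMemoryMB) {
  const threshold = totalMemoryMB * 0.9;
  const fits = models
    .filter((m) => m.minGpuMemoryMB <= totalMemoryMB)
    .sort((a, b) => b.minGpuMemoryMB - a.minGpuMemoryMB); // largest first
  const bestIdx = fits.findIndex((m) => m.minGpuMemoryMB <= threshold);
  // Fallback: -1 means nothing clears the margin, so take the last (smallest) fit.
  const recIdx = bestIdx !== -1 ? bestIdx : fits.length - 1;
  return fits.map((m, i) => ({ ...m, recommended: i === recIdx }));
}
```

With this shape, downstream code like the `defaultModelIndex` lookup in `onboard.js` only has to handle the truly-empty case (no model fits at all).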
64-71: `perGpuMB` assumes homogeneous GPUs in multi-GPU setups.
Line 69 returns `perGpuMB: perGpuMB[0]`, which is only the first GPU's memory. For heterogeneous multi-GPU systems (e.g., mixed A100/A10), this could mislead downstream logic that assumes uniform GPU memory. Consider documenting this assumption or returning the minimum value for conservative VRAM estimation.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@bin/lib/nim.js` around lines 64 - 71, The code currently exposes perGpuMB: perGpuMB[0], assuming homogeneous GPUs; change this to provide a conservative estimate and avoid misleading consumers by replacing perGpuMB: perGpuMB[0] with a single conservative value (e.g., the minimum across the perGpuMB array, such as Math.min(...perGpuMB)), or alternatively return the full perGpuMB array (and update any consumers) if heterogeneous reporting is desired; update the returned object from the function that constructs this device info (the object containing type, name, count, totalMemoryMB, perGpuMB, nimCapable) so perGpuMB reflects the minimum VRAM or the full array and add a brief inline comment clarifying the choice.
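Under the conservative option, the NVIDIA branch would report the smallest per-GPU VRAM. A sketch of the parsing and summary step (the function name is hypothetical; field names follow the review):

```javascript
// Sketch: parse `nvidia-smi --query-gpu=memory.total --format=csv,noheader,nounits`
// style output (one MB value per line) and summarize for model sizing.
function summarizeGpus(memoryTotalOutput) {
  const perGpu = memoryTotalOutput
    .trim()
    .split("\n")
    .map((line) => parseInt(line, 10))
    .filter(Number.isInteger);
  return {
    count: perGpu.length,
    totalMemoryMB: perGpu.reduce((a, b) => a + b, 0),
    // Minimum, not perGpu[0]: on mixed A100/A10 boxes the first GPU's VRAM
    // would overstate what every GPU can hold.
    perGpuMB: perGpu.length ? Math.min(...perGpu) : 0,
  };
}
```

Returning the full array instead is the other option the comment mentions; the minimum keeps the existing scalar contract while staying safe on heterogeneous setups.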
181-193: Consider validating `port` parameter.
The `port` parameter is interpolated into the shell command at line 193 without validation. While the default is 8000 and typical callers pass integers, invalid values (NaN, negative, >65535) would produce malformed Docker commands or unexpected behavior.
🛡️ Optional validation
```diff
 function startNimContainer(sandboxName, model, port = 8000) {
+  if (!Number.isInteger(port) || port < 1 || port > 65535) {
+    throw new Error(`Invalid port: ${port}`);
+  }
   const name = containerName(sandboxName);
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@bin/lib/nim.js` around lines 181 - 193, The startNimContainer function interpolates the port directly into a shell command, which can produce malformed Docker commands if port is non-numeric or out of range; validate the port parameter (the port variable) before interpolation: coerce/parse it to an integer (e.g. parseInt), ensure Number.isInteger and that it falls within 1–65535, and either throw a clear Error (or fallback to the default 8000) when invalid; update the validation near the top of startNimContainer (before using port in run/docker run) and keep existing symbols containerName, getImageForModel and run unchanged.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: b6b83c8d-613c-4ab7-8c02-78aefd852c69
📒 Files selected for processing (2)
bin/lib/nim.js, test/nim.test.js
🚧 Files skipped from review as they are similar to previous changes (1)
- test/nim.test.js
- Validate sandboxName in containerName() to prevent shell injection
- Replace process.exit(1) with throw in pullNimImage/startNimContainer
- Add per-request curl timeouts (--connect-timeout 2 --max-time 3) to health probes in waitForNimHealth and nimStatus
- Accept port parameter in nimStatus instead of hardcoding 8000
- Add containerName validation tests and cores-omission test
- Normalize mock patterns in macOS GPU detection tests
Force-pushed from e9f56c9 to d926f45 (Compare)
🧹 Nitpick comments (2)
bin/lib/nim.js (2)
153-167: Consider edge case: no model marked recommended if all exceed 90% threshold.
If a GPU's only fitting models require >90% of VRAM (e.g., a 9500MB model on a 10000MB GPU), the returned array has no `recommended: true` entry. Downstream code (e.g., `onboard.js` finding `defaultModelIndex`) should handle this gracefully. This may be intentional as a safety margin, but worth confirming the caller handles an empty recommendation.
```bash
#!/bin/bash
# Check how onboard.js handles the case where no model is recommended
rg -n -A5 'suggestModelsForGpu|recommended' bin/lib/onboard.js
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@bin/lib/nim.js` around lines 153 - 167, suggestModelsForGpu can return no recommended model if every fit exceeds the 0.9 threshold; update suggestModelsForGpu so after building fits and applying the threshold logic (variables: fits, threshold, recommended) you check if none were marked recommended and, if fits is non-empty, mark the highest-capacity fit (first element after the current sort) as recommended; this guarantees downstream logic (e.g., onboard.js defaultModelIndex lookup) always finds a recommended model when any fit exists.
64-71: Multi-GPU: `perGpuMB` reports only the first GPU's memory.
When multiple GPUs are present with different VRAM sizes (e.g., mixed GPU configurations), `perGpuMB: perGpuMB[0]` reflects only the first GPU. If downstream code uses `perGpuMB` for per-GPU model sizing decisions, it may over- or under-estimate capacity on heterogeneous setups. If homogeneous GPUs are assumed, consider adding a brief comment. Otherwise, returning the minimum or the full array may be more robust.
🤖 Prompt for AI Agents
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@bin/lib/nim.js` around lines 64 - 71, The returned object currently sets perGpuMB to only the first GPU (perGpuMB[0]), which misreports capacity in mixed-GPU systems; update the return in the object (the block that builds the NVIDIA info) to return the full perGpuMB array (or alternatively return the minimum value if a single scalar is required by downstream code) and ensure any downstream consumers expect an array (or adjust them if you choose the minimum). Also add a short inline comment near perGpuMB explaining the shape (array of per-GPU VRAMs) so future readers know this handles heterogeneous GPUs; keep totalMemoryMB and nimCapable unchanged.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: b550f299-260e-4950-ac5f-da5abb926b26
📒 Files selected for processing (3)
bin/lib/nim.js, bin/lib/onboard.js, test/nim.test.js
🚧 Files skipped from review as they are similar to previous changes (2)
- bin/lib/onboard.js
- test/nim.test.js
Thanks for submitting this proposed feature to recommend optimal NIM models based on detected GPU VRAM, which could help improve the performance and usability of NemoClaw for users with different GPU configurations.
Continuation of closed #142 and #270.
Summary
- `detectGpu()` accepts injected dependencies (DI) for `runCapture` and `platform`, making it fully testable without hardware
- `nvidia-smi --query-gpu=name` for display during onboard
- `suggestModelsForGpu(gpu)` — filters NIM models by VRAM, sorts descending, marks the largest model at ≤90% VRAM as recommended
- `setupNim()` in onboard uses `suggestModelsForGpu()` with a `(recommended)` tag in model selection
- Tests in `nim.js` for `detectGpu` (NVIDIA, multi-GPU, DGX Spark, macOS discrete, Apple Silicon, fallback) and 6 for `suggestModelsForGpu`

Relates to #404 — improves `detectGpu()` testability.
Test plan
Automated Tests
Summary by CodeRabbit
New Features
Bug Fixes / Improvements