
feat: recommend optimal NIM model based on detected GPU VRAM #270

Closed

brianwtaylor wants to merge 7 commits into NVIDIA:main from brianwtaylor:feat/gpu-model-preselector

Conversation

@brianwtaylor
Contributor

@brianwtaylor commented Mar 18, 2026

Summary

  • Adds suggestModelsForGpu() that ranks NIM models by VRAM fit and marks the largest model using no more than 90% of available VRAM as recommended
  • Surfaces GPU name in NVIDIA detection for better onboarding display
  • Integrates model recommendation into onboarding: interactive mode shows "(recommended)" tags, non-interactive mode auto-selects the best fit

Addresses #66 — users select models that don't fit their GPU, leading to pull failures or OOM crashes. This PR guides them to the right model based on detected VRAM.
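The ranking rule described above can be pictured with a short sketch. This is an illustrative reconstruction, not the PR's actual code: the MODELS catalog, its entries, and the sort order are assumptions based on the review excerpts (minGpuMemoryMB and the recommended flag do appear in the diffs below).

```javascript
// Illustrative sketch of the VRAM-fit ranking; the MODELS catalog and its
// entries are assumptions, not the PR's actual model list.
const MODELS = [
  { name: "llama-3.1-8b-instruct", minGpuMemoryMB: 16000 },
  { name: "llama-3.1-70b-instruct", minGpuMemoryMB: 140000 },
  { name: "mistral-7b-instruct", minGpuMemoryMB: 14000 },
];

function suggestModelsForGpu(gpu) {
  if (!gpu || !gpu.nimCapable) return [];
  // Keep only models that fit the detected VRAM, largest requirement first.
  const fits = MODELS
    .filter((m) => m.minGpuMemoryMB <= gpu.totalMemoryMB)
    .sort((a, b) => b.minGpuMemoryMB - a.minGpuMemoryMB)
    .map((m) => ({ ...m, recommended: false }));
  // Recommend the largest model that still leaves at least 10% VRAM headroom.
  const best = fits.find((m) => m.minGpuMemoryMB <= gpu.totalMemoryMB * 0.9);
  if (best) best.recommended = true;
  return fits;
}
```

On a 24 GB card this would list both smaller models and flag the 16000 MB one as recommended, since it fits under the 21600 MB (90%) headroom cutoff; on a card with exactly 16000 MB, the headroom rule demotes the exact-fit model in favor of the 14000 MB one.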

Test plan

Automated Tests

node --test test/nim.test.js

Tests cover VRAM-based model ranking, recommended flag assignment, edge cases (no GPU, non-nimCapable GPU, exact VRAM fit), and GPU detection with name enrichment.

@coderabbitai
Contributor

coderabbitai bot commented Mar 18, 2026

📝 Walkthrough

Introduce dependency-injectable GPU detection, richer GPU metadata, a new model-suggestion API that ranks and flags a recommended model for given GPU capabilities, integrate recommendations into onboarding selection flow, and extend tests to cover detection and suggestion logic.

Changes

GPU detection & model suggestion — bin/lib/nim.js
  Refactored detectGpu to detectGpu(opts) with injectable runCapture and platform; adds richer GPU fields (name, optional cores, spark); added suggestModelsForGpu(gpu) to filter/sort models and mark one recommended; additional JSDoc-wrapped helpers exported.

Onboarding integration — bin/lib/onboard.js
  Switched model candidate source to nim.suggestModelsForGpu(gpu); display now includes GPU name when present; interactive listings append (recommended); non-interactive defaults prefer the recommended model.

Tests — test/nim.test.js
  Added injected-runCapture tests exercising NVIDIA (single/multi/GB10), DGX Spark handling, macOS discrete and unified-memory detection, no-GPU path, and comprehensive suggestModelsForGpu behavior (filtering, sorting, single recommended model, ≤90% constraint).

Sequence Diagram(s)

sequenceDiagram
  participant Onboard as Onboard CLI
  participant Nim as bin/lib/nim.js
  participant Runner as runCapture/runner
  participant User as User

  Onboard->>Nim: detectGpu(opts with runCapture, platform)
  Nim->>Runner: run platform-specific commands
  Runner-->>Nim: command outputs
  Nim-->>Onboard: gpu object (type,count,totalMemoryMB, name, cores?, spark?)
  Onboard->>Nim: suggestModelsForGpu(gpu)
  Nim-->>Onboard: models[] (one entry marked recommended)
  Onboard->>User: present model choices (default = recommended)
  User-->>Onboard: selected model
  Onboard->>Nim: startNimContainer(selectedModel)
  Nim->>Runner: pull/start container
  Runner-->>Nim: container status
  Nim-->>Onboard: nimStatus

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~22 minutes

Poem

🐰 I sniffed the GPUs beneath the moon,
counted memory, hummed a tune.
I picked a model snug and tight,
flagged the one that fits just right. 🥕

🚥 Pre-merge checks | ✅ 3

✅ Passed checks (3 passed)
  • Title check — Passed: the pull request title directly and clearly summarizes the main change: introducing GPU-aware model recommendation based on detected VRAM.
  • Docstring Coverage — Passed: docstring coverage is 84.62%, which meets the required threshold of 80.00%.
  • Description Check — Passed: check skipped because CodeRabbit's high-level summary is enabled.


@brianwtaylor force-pushed the feat/gpu-model-preselector branch 2 times, most recently from e59d8d0 to 746b69b (March 18, 2026 20:36)
Contributor

@coderabbitai bot left a comment

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
bin/lib/onboard.js (1)

448-476: ⚠️ Potential issue | 🟠 Major

Use the recommended model as the default selection path.

nim.suggestModelsForGpu(gpu) computes a recommended candidate, but the current defaults still pick models[0] (largest fit) in both interactive and non-interactive flows. That can bypass the intended VRAM headroom and increase startup/OOM failures.

Suggested fix
       } else {
         let sel;
+        const defaultModelIndex = Math.max(0, models.findIndex((m) => m.recommended));
         if (isNonInteractive()) {
           if (requestedModel) {
             sel = models.find((m) => m.name === requestedModel);
             if (!sel) {
               console.error(`  Unsupported NEMOCLAW_MODEL for NIM: ${requestedModel}`);
               process.exit(1);
             }
           } else {
-            sel = models[0];
+            sel = models[defaultModelIndex];
           }
           console.log(`  [non-interactive] NIM model: ${sel.name}`);
         } else {
           console.log("");
           console.log("  Models that fit your GPU:");
           models.forEach((m, i) => {
             const tag = m.recommended ? " (recommended)" : "";
             console.log(`    ${i + 1}) ${m.name} (min ${m.minGpuMemoryMB} MB)${tag}`);
           });
           console.log("");

-          const modelChoice = await prompt(`  Choose model [1]: `);
-          const midx = parseInt(modelChoice || "1", 10) - 1;
-          sel = models[midx] || models[0];
+          const defaultChoice = String(defaultModelIndex + 1);
+          const modelChoice = await prompt(`  Choose model [${defaultChoice}]: `);
+          const midx = parseInt(modelChoice || defaultChoice, 10) - 1;
+          sel = models[midx] || models[defaultModelIndex];
         }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@bin/lib/onboard.js` around lines 448 - 476, The selection logic for NIM
models currently defaults to models[0] in both interactive and non-interactive
flows; change it to prefer the model marked recommended by
nim.suggestModelsForGpu(gpu). Update the non-interactive branch in the sel
assignment so that when requestedModel is not provided it picks models.find(m =>
m.recommended) || models[0]; in the interactive branch, set the default choice
index (used when modelChoice is empty) to the index of the recommended model
(const defaultIndex = models.findIndex(m => m.recommended); use defaultIndex >=
0 ? defaultIndex : 0) and then compute midx relative to that default; keep the
requestedModel validation (models.find) and fallback behavior intact.
bin/lib/nim.js (1)

75-94: ⚠️ Potential issue | 🟡 Minor

Include name in the GB10 fallback return object.

The fallback detects GB10 from nameOutput but drops it from the returned GPU object. This causes NVIDIA name display to be missing in onboarding for Spark systems.

Suggested fix
   try {
     const nameOutput = runCmd(
       "nvidia-smi --query-gpu=name --format=csv,noheader,nounits",
       { ignoreError: true }
     );
     if (nameOutput && nameOutput.includes("GB10")) {
+      const name = nameOutput.split("\n")[0].trim();
       // GB10 has 128GB unified memory shared with Grace CPU — use system RAM
       let totalMemoryMB = 0;
       try {
         const memLine = runCmd("free -m | awk '/Mem:/ {print $2}'", { ignoreError: true });
         if (memLine) totalMemoryMB = parseInt(memLine.trim(), 10) || 0;
       } catch {}
       return {
         type: "nvidia",
+        name,
         count: 1,
         totalMemoryMB,
         perGpuMB: totalMemoryMB,
         nimCapable: true,
         spark: true,
       };
     }
   } catch {}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@bin/lib/nim.js` around lines 75 - 94, The GB10 special-case branch detects
the GPU via nameOutput but omits the GPU name in the returned object; update the
return in that branch (the block using runCmd and memLine) to include a name
property set from nameOutput (e.g., trimmed/first-line value) so the GPU name is
preserved for onboarding display (refer to nameOutput, runCmd, memLine in the
nim.js GB10 branch).
🧹 Nitpick comments (1)
bin/lib/nim.js (1)

141-143: Remove the stray JSDoc block above suggestModelsForGpu.

The first JSDoc line describes pullNimImage(model) but is attached to suggestModelsForGpu(gpu), which makes generated docs misleading.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@bin/lib/nim.js` around lines 141 - 143, The JSDoc for pullNimImage(model) is
mistakenly placed immediately above suggestModelsForGpu(gpu), causing incorrect
documentation; move the JSDoc block so it directly precedes the pullNimImage
function or delete the stray JSDoc above suggestModelsForGpu. Locate the comment
block and either relocate it to the pullNimImage declaration or remove it
entirely so suggestModelsForGpu only has its correct JSDoc (or none).

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 8ae6ec2a-40d1-4187-b422-7b7a3d97d285

📥 Commits

Reviewing files that changed from the base of the PR and between 2b5febe and 746b69b.

📒 Files selected for processing (3)
  • bin/lib/nim.js
  • bin/lib/onboard.js
  • test/nim.test.js

@wscurran added the Getting Started (setup, installation, or onboarding issues) and Local Models (running NemoClaw with local models) labels Mar 18, 2026
@wscurran
Contributor

Good suggestion. Other users are reporting the same experience with models that don't fit their GPU.

@wscurran added the priority: medium (should be addressed in upcoming releases) label Mar 18, 2026
Add dependency injection to detectGpu() via an optional opts parameter,
enabling deterministic tests for all 4 code paths: standard NVIDIA,
DGX Spark GB10 unified memory, Apple Silicon, and no-GPU fallback.

Signed-off-by: Brian Taylor <brian@briantaylor.xyz>
Signed-off-by: Brian Taylor <brian.taylor818@gmail.com>

- Add test for the sysctl hw.memsize fallback when system_profiler
  reports no VRAM (the actual Apple Silicon code path)
- Rename existing Apple test to clarify it covers the discrete VRAM
  parsing branch
- Use more specific mock pattern "query-gpu=name" to avoid substring
  collisions

Aligns runtime behavior with JSDoc contract (cores?: number).
When system_profiler does not report core count, the property is
now omitted entirely rather than set to null.

Adds suggestModelsForGpu() that ranks NIM models by VRAM fit and
marks the optimal model as recommended. Also surfaces GPU name
in NVIDIA detection for better display during onboarding.

…e GB10 name, remove stray JSDoc

- Default interactive and non-interactive model selection to the
  recommended model instead of models[0]
- Include GPU name in the GB10 fallback return so Spark users see their
  GPU name during onboarding
- Remove stray pullNimImage JSDoc attached to suggestModelsForGpu
- Add name assertion to DGX Spark GB10 test
@brianwtaylor force-pushed the feat/gpu-model-preselector branch from 3d4833b to a990641 (March 19, 2026 01:12)
@brianwtaylor closed this by deleting the head repository Mar 20, 2026
@wscurran removed the priority: medium (should be addressed in upcoming releases) label Mar 23, 2026

Labels

  • Getting Started — setup, installation, or onboarding issues
  • Local Models — running NemoClaw with local models
