feat: recommend optimal NIM model based on detected GPU VRAM #270

brianwtaylor wants to merge 7 commits into NVIDIA:main
Conversation
📝 Walkthrough

Introduces dependency-injectable GPU detection, richer GPU metadata, and a new model-suggestion API that ranks models and flags a recommended one for the given GPU's capabilities; integrates the recommendation into the onboarding selection flow and extends tests to cover detection and suggestion logic.
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant Onboard as Onboard CLI
    participant Nim as bin/lib/nim.js
    participant Runner as runCapture/runner
    participant User as User

    Onboard->>Nim: detectGpu(opts with runCapture, platform)
    Nim->>Runner: run platform-specific commands
    Runner-->>Nim: command outputs
    Nim-->>Onboard: gpu object (type, count, totalMemoryMB, name, cores?, spark?)
    Onboard->>Nim: suggestModelsForGpu(gpu)
    Nim-->>Onboard: models[] (one entry marked recommended)
    Onboard->>User: present model choices (default = recommended)
    User-->>Onboard: selected model
    Onboard->>Nim: startNimContainer(selectedModel)
    Nim->>Runner: pull/start container
    Runner-->>Nim: container status
    Nim-->>Onboard: nimStatus
```
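The suggestion step in the diagram can be sketched as follows. The field names (`minGpuMemoryMB`, `recommended`, `nimCapable`, `totalMemoryMB`) and the 90%-of-VRAM headroom rule come from this PR's description; the catalog entries and exact ranking details are hypothetical stand-ins, not the repository's actual implementation.

```javascript
// Sketch of suggestModelsForGpu() as described in the PR: keep only the
// models that fit the detected VRAM, rank them largest-first, and flag the
// largest one that still leaves ~10% headroom as "recommended".
// The catalog below is fabricated for illustration.
const MODEL_CATALOG = [
  { name: "llama-3.1-70b-instruct", minGpuMemoryMB: 140000 },
  { name: "llama-3.1-8b-instruct", minGpuMemoryMB: 16000 },
  { name: "mistral-7b-instruct", minGpuMemoryMB: 14000 },
];

function suggestModelsForGpu(gpu) {
  // No suggestions for missing or non-NIM-capable GPUs.
  if (!gpu || !gpu.nimCapable || !gpu.totalMemoryMB) return [];
  const fits = MODEL_CATALOG
    .filter((m) => m.minGpuMemoryMB <= gpu.totalMemoryMB)
    .sort((a, b) => b.minGpuMemoryMB - a.minGpuMemoryMB)
    .map((m) => ({ ...m, recommended: false }));
  // Recommend the largest model whose floor is within 90% of total VRAM.
  const rec = fits.find((m) => m.minGpuMemoryMB <= gpu.totalMemoryMB * 0.9);
  if (rec) rec.recommended = true;
  return fits;
}

// A 24 GB GPU fits the 8B and 7B entries; the 8B one gets the flag.
const models = suggestModelsForGpu({ nimCapable: true, totalMemoryMB: 24000 });
```

Onboarding would then default the prompt to the entry with `recommended: true` rather than `models[0]`, which is exactly the fix requested in the review below.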
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~22 minutes
🚥 Pre-merge checks: ✅ 3 passed
Force-pushed e59d8d0 to 746b69b
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
bin/lib/onboard.js (1)
448-476: ⚠️ Potential issue | 🟠 Major: Use the recommended model as the default selection path.
`nim.suggestModelsForGpu(gpu)` computes a `recommended` candidate, but the current defaults still pick `models[0]` (largest fit) in both interactive and non-interactive flows. That can bypass the intended VRAM headroom and increase startup/OOM failures.

Suggested fix
```diff
  } else {
    let sel;
+   const defaultModelIndex = Math.max(0, models.findIndex((m) => m.recommended));
    if (isNonInteractive()) {
      if (requestedModel) {
        sel = models.find((m) => m.name === requestedModel);
        if (!sel) {
          console.error(`  Unsupported NEMOCLAW_MODEL for NIM: ${requestedModel}`);
          process.exit(1);
        }
      } else {
-       sel = models[0];
+       sel = models[defaultModelIndex];
      }
      console.log(`  [non-interactive] NIM model: ${sel.name}`);
    } else {
      console.log("");
      console.log("  Models that fit your GPU:");
      models.forEach((m, i) => {
        const tag = m.recommended ? " (recommended)" : "";
        console.log(`    ${i + 1}) ${m.name} (min ${m.minGpuMemoryMB} MB)${tag}`);
      });
      console.log("");
-     const modelChoice = await prompt(`  Choose model [1]: `);
-     const midx = parseInt(modelChoice || "1", 10) - 1;
-     sel = models[midx] || models[0];
+     const defaultChoice = String(defaultModelIndex + 1);
+     const modelChoice = await prompt(`  Choose model [${defaultChoice}]: `);
+     const midx = parseInt(modelChoice || defaultChoice, 10) - 1;
+     sel = models[midx] || models[defaultModelIndex];
    }
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@bin/lib/onboard.js` around lines 448 - 476, The selection logic for NIM models currently defaults to models[0] in both interactive and non-interactive flows; change it to prefer the model marked recommended by nim.suggestModelsForGpu(gpu). Update the non-interactive branch in the sel assignment so that when requestedModel is not provided it picks models.find(m => m.recommended) || models[0]; in the interactive branch, set the default choice index (used when modelChoice is empty) to the index of the recommended model (const defaultIndex = models.findIndex(m => m.recommended); use defaultIndex >= 0 ? defaultIndex : 0) and then compute midx relative to that default; keep the requestedModel validation (models.find) and fallback behavior intact.

bin/lib/nim.js (1)
75-94: ⚠️ Potential issue | 🟡 Minor: Include `name` in the GB10 fallback return object.

The fallback detects GB10 from `nameOutput` but drops it from the returned GPU object. This causes the NVIDIA name display to be missing in onboarding for Spark systems.

Suggested fix
```diff
  try {
    const nameOutput = runCmd(
      "nvidia-smi --query-gpu=name --format=csv,noheader,nounits",
      { ignoreError: true }
    );
    if (nameOutput && nameOutput.includes("GB10")) {
+     const name = nameOutput.split("\n")[0].trim();
      // GB10 has 128GB unified memory shared with Grace CPU — use system RAM
      let totalMemoryMB = 0;
      try {
        const memLine = runCmd("free -m | awk '/Mem:/ {print $2}'", { ignoreError: true });
        if (memLine) totalMemoryMB = parseInt(memLine.trim(), 10) || 0;
      } catch {}
      return {
        type: "nvidia",
+       name,
        count: 1,
        totalMemoryMB,
        perGpuMB: totalMemoryMB,
        nimCapable: true,
        spark: true,
      };
    }
  } catch {}
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@bin/lib/nim.js` around lines 75 - 94, The GB10 special-case branch detects the GPU via nameOutput but omits the GPU name in the returned object; update the return in that branch (the block using runCmd and memLine) to include a name property set from nameOutput (e.g., trimmed/first-line value) so the GPU name is preserved for onboarding display (refer to nameOutput, runCmd, memLine in the nim.js GB10 branch).
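The suggested fix boils down to keeping only the first line of the `nvidia-smi` name query output. A minimal sketch of that extraction; the sample output string is fabricated for illustration:

```javascript
// nvidia-smi --query-gpu=name emits one line per GPU; the GB10 branch only
// needs the first line, trimmed of the trailing newline.
// Fabricated sample output:
const nameOutput = "NVIDIA GB10\n";
const name = nameOutput.split("\n")[0].trim();
console.log(name); // → NVIDIA GB10
```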
🧹 Nitpick comments (1)
bin/lib/nim.js (1)
141-143: Remove the stray JSDoc block above `suggestModelsForGpu`.

The first JSDoc line describes `pullNimImage(model)` but is attached to `suggestModelsForGpu(gpu)`, which makes generated docs misleading.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@bin/lib/nim.js` around lines 141 - 143, The JSDoc for pullNimImage(model) is mistakenly placed immediately above suggestModelsForGpu(gpu), causing incorrect documentation; move the JSDoc block so it directly precedes the pullNimImage function or delete the stray JSDoc above suggestModelsForGpu. Locate the comment block and either relocate it to the pullNimImage declaration or remove it entirely so suggestModelsForGpu only has its correct JSDoc (or none).
ℹ️ Review info

⚙️ Run configuration
- Configuration used: .coderabbit.yaml
- Review profile: CHILL
- Plan: Pro
- Run ID: 8ae6ec2a-40d1-4187-b422-7b7a3d97d285
📒 Files selected for processing (3)
- bin/lib/nim.js
- bin/lib/onboard.js
- test/nim.test.js
Good suggestion. Other users have reported the same experience of models being mismatched to their GPU.
Commits:

- Add dependency injection to detectGpu() via an optional opts parameter, enabling deterministic tests for all 4 code paths: standard NVIDIA, DGX Spark GB10 unified memory, Apple Silicon, and no-GPU fallback. Signed-off-by: Brian Taylor <brian@briantaylor.xyz> Signed-off-by: Brian Taylor <brian.taylor818@gmail.com>
- Add a test for the sysctl hw.memsize fallback when system_profiler reports no VRAM (the actual Apple Silicon code path); rename the existing Apple test to clarify it covers the discrete-VRAM parsing branch; use the more specific mock pattern "query-gpu=name" to avoid substring collisions.
- Align runtime behavior with the JSDoc contract (cores?: number): when system_profiler does not report a core count, the property is now omitted entirely rather than set to null.
- Add suggestModelsForGpu(), which ranks NIM models by VRAM fit and marks the optimal model as recommended; also surface the GPU name in NVIDIA detection for better display during onboarding.
- …e GB10 name, remove stray JSDoc: default interactive and non-interactive model selection to the recommended model instead of models[0]; include the GPU name in the GB10 fallback return so Spark users see their GPU name during onboarding; remove the stray pullNimImage JSDoc attached to suggestModelsForGpu; add a name assertion to the DGX Spark GB10 test.
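The dependency-injection commit above can be illustrated with a small sketch. The option names (`runCapture`, `platform`) come from the PR walkthrough; the command string, parsing, and return shape are simplified assumptions, not the repository's actual code:

```javascript
// Sketch of detectGpu() with injectable dependencies, so tests can feed
// canned command output instead of shelling out to nvidia-smi.
function detectGpu(opts = {}) {
  const platform = opts.platform || process.platform;
  const runCapture = opts.runCapture || (() => "");

  if (platform === "linux") {
    const out = runCapture(
      "nvidia-smi --query-gpu=name,memory.total --format=csv,noheader,nounits"
    );
    if (out) {
      // One CSV line per GPU: "<name>, <memory MB>".
      const lines = out.trim().split("\n");
      const [name, memStr] = lines[0].split(",").map((s) => s.trim());
      const perGpuMB = parseInt(memStr, 10) || 0;
      return {
        type: "nvidia",
        name,
        count: lines.length,
        perGpuMB,
        totalMemoryMB: perGpuMB * lines.length,
        nimCapable: perGpuMB > 0,
      };
    }
  }
  // No-GPU fallback path.
  return { type: "none", count: 0, totalMemoryMB: 0, nimCapable: false };
}

// Deterministic test path: no real nvidia-smi needed.
const gpu = detectGpu({
  platform: "linux",
  runCapture: () => "NVIDIA RTX 6000 Ada, 49140\n",
});
```

Injecting `runCapture` and `platform` this way is what lets the test suite exercise all four detection branches without GPU hardware.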
Force-pushed 3d4833b to a990641
Summary
Adds suggestModelsForGpu(), which ranks NIM models by VRAM fit and marks the optimal model (the largest using <= 90% of VRAM) as recommended.

Addresses #66: users select models that don't fit their GPU, leading to pull failures or OOM crashes. This PR guides them to the right model based on detected VRAM.
Test plan
Automated Tests
Tests cover VRAM-based model ranking, recommended flag assignment, edge cases (no GPU, non-nimCapable GPU, exact VRAM fit), and GPU detection with name enrichment.