
feat: recommend optimal NIM model based on detected GPU VRAM#542

Closed
brianwtaylor wants to merge 1 commit into NVIDIA:main from brianwtaylor:feat/gpu-model-preselector

Conversation

@brianwtaylor
Contributor

@brianwtaylor brianwtaylor commented Mar 21, 2026

Summary

  • Add suggestModelsForGpu(gpu) to nim.js — ranks NIM models by fit for detected GPU, marks the largest model using ≤90% VRAM as recommended
  • Integrate model suggestions into the onboarding flow — default selection is now the recommended model (not the largest)
  • Add dependency-injectable GPU detection (detectGpu(opts)) for testability
  • Include GPU name in DGX Spark GB10 fallback path
  • Add comprehensive tests: GPU detection (NVIDIA, Apple Silicon, unified memory, DGX Spark, fallback) and model suggestion (filtering, sorting, recommendation logic)
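The filter/sort/recommend rule described above can be sketched as follows. This is a simplified illustration, not the actual nim.js code: the MODELS list and its minGpuMemoryMB values are made-up placeholders, and only the field names mentioned in this PR (totalMemoryMB, nimCapable, recommended) come from the description.

```javascript
// Hypothetical model catalog; real NIM model entries and memory floors differ.
const MODELS = [
  { id: "llama-3.1-70b", minGpuMemoryMB: 140000 },
  { id: "llama-3.1-8b", minGpuMemoryMB: 24000 },
  { id: "mistral-7b", minGpuMemoryMB: 16000 },
];

// Sketch of the ranking logic: keep models that fit in VRAM, sort largest
// first, and mark the largest model using at most 90% of VRAM as recommended.
function suggestModelsForGpu(gpu, models = MODELS) {
  if (!gpu || !gpu.nimCapable) return [];
  const fitting = models
    .filter((m) => m.minGpuMemoryMB <= gpu.totalMemoryMB)
    .sort((a, b) => b.minGpuMemoryMB - a.minGpuMemoryMB)
    .map((m) => ({ ...m, recommended: false }));
  const rec = fitting.find((m) => m.minGpuMemoryMB <= gpu.totalMemoryMB * 0.9);
  if (rec) rec.recommended = true;
  return fitting;
}

// For a 32 GB GPU, the 70B model is filtered out and the 8B model
// (24000 MB <= 0.9 * 32000 MB) ends up recommended.
const ranked = suggestModelsForGpu({ totalMemoryMB: 32000, nimCapable: true });
```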

Motivation

Users frequently select NIM models too large for their GPU, causing OOM failures during startup. This is especially common on DGX Spark where the unified memory architecture makes VRAM limits non-obvious.

Continuation of closed #270. wscurran noted: "Other users are reporting the same experience in misfitting model to gpu."

Related to #404 — Jetson Orin Nano detectGpu() returns null on unified memory; the DI refactor and fallback path improvements here help address this
Related to #511 — Jetson AGX (aarch64) installation failures include GPU detection as a contributing factor

Test plan

Automated Tests

npm test -- --grep "suggestModelsForGpu|detectGpu"
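The dependency-injectable detectGpu(opts) mentioned in the summary is what makes tests like the one above possible without real hardware. The sketch below is a hedged illustration of that pattern: the stand-in detectGpu and its parsing are assumptions for demonstration, not the real bin/lib/nim.js implementation.

```javascript
// Illustrative stand-in for the DI-style detectGpu described in this PR;
// the real bin/lib/nim.js implementation may differ.
function detectGpu({ runCapture } = {}) {
  const out = runCapture(
    "nvidia-smi --query-gpu=name,memory.total --format=csv,noheader,nounits"
  );
  if (!out) return null;
  // Parse the first CSV line, e.g. "NVIDIA RTX 4090, 24564"
  const [name, mem] = out.trim().split("\n")[0].split(",").map((s) => s.trim());
  return { name, totalMemoryMB: Number(mem), nimCapable: true };
}

// Inject a fake runCapture so the test needs no GPU or nvidia-smi binary.
const gpu = detectGpu({ runCapture: () => "NVIDIA RTX 4090, 24564" });
```

Injecting `runCapture` this way is also what lets the suite below cover the Apple Silicon, DGX Spark, and error-handling paths deterministically.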

@coderabbitai
Contributor

coderabbitai bot commented Mar 21, 2026

📝 Walkthrough


This PR introduces GPU model recommendation by refactoring GPU detection to support dependency injection, adding GPU name enrichment for NVIDIA devices, creating a new suggestModelsForGpu() function that filters and ranks models by VRAM fit, and integrating model recommendations into the onboarding flow with visual indicators and smart defaults.

Changes

GPU Detection & Recommendation (bin/lib/nim.js)
Refactored detectGpu() to accept options for dependency injection (runCapture, platform), added a GPU name field for NVIDIA GPUs and DGX Spark GB10 detection, changed macOS cores handling to a conditional spread, and introduced the new suggestModelsForGpu(gpu) function, which filters models by VRAM, sorts them in descending order, and marks one model as recommended (≤90% VRAM utilization); both functions are exported. JSDoc type annotations were added for multiple functions.

Onboarding Integration (bin/lib/onboard.js)
Updated GPU logging to conditionally include the GPU model name, replaced direct model filtering with a suggestModelsForGpu() call, added (recommended) tags in the interactive model selection display, and changed the default model selection logic to use the first recommended model's index in both interactive and non-interactive modes instead of always defaulting to the first model.

Test Coverage (test/nim.test.js)
Added comprehensive test suites for detectGpu(opts) covering NVIDIA detection with name extraction, multi-GPU parsing, DGX Spark GB10 detection (including the free -m fallback), macOS discrete and unified memory detection, and error handling; added a test suite for suggestModelsForGpu() covering null and non-capable GPU handling, VRAM-based filtering and sorting, and recommended-flag assignment constraints.

Sequence Diagram

sequenceDiagram
    participant User
    participant Onboarding as Onboarding Flow
    participant GpuDetect as detectGpu(opts)
    participant ModelSuggest as suggestModelsForGpu()
    participant ModelDB as Model Database

    User->>Onboarding: Start NIM setup
    Onboarding->>GpuDetect: Detect GPU (with injected runCapture)
    GpuDetect->>GpuDetect: Query nvidia-smi for name & memory
    GpuDetect-->>Onboarding: {totalMemoryMB, nimCapable, name?, spark?}
    
    Onboarding->>ModelSuggest: suggestModelsForGpu(gpu)
    ModelSuggest->>ModelDB: Fetch available NIM models
    ModelSuggest->>ModelSuggest: Filter by minGpuMemoryMB ≤ totalMemoryMB
    ModelSuggest->>ModelSuggest: Sort by memory descending
    ModelSuggest->>ModelSuggest: Mark first ≤90% fit as recommended
    ModelSuggest-->>Onboarding: [models] with recommended flags
    
    alt Interactive Mode
        Onboarding->>User: Show models with (recommended) tags
        User->>Onboarding: Select model or use default
    else Non-Interactive Mode
        Onboarding->>Onboarding: Auto-select recommended model
    end
    
    Onboarding-->>User: Model selected & NIM starts

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Poem

🐰 With twitching whiskers, I hop through VRAM,
Recommending models—"This one's gram!"
No more mismatches, no crashing sprees,
Just perfect fits floating through memory trees ✨

🚥 Pre-merge checks | ✅ 5

✅ Passed checks (5 passed)

Title check: Passed. The title clearly and concisely summarizes the main feature: recommending optimal NIM models based on detected GPU VRAM.
Linked Issues check: Passed. All linked issue #270 requirements are met: suggestModelsForGpu() implemented, GPU name surfaced, model recommendation integrated into onboarding, and comprehensive tests added.
Out of Scope Changes check: Passed. All changes are directly aligned with the linked issue objectives; no unrelated modifications detected outside the scope of GPU-based model recommendation.
Docstring Coverage: Passed. Docstring coverage is 84.62%, which is sufficient; the required threshold is 80.00%.
Description Check: Passed. Check skipped because CodeRabbit's high-level summary is enabled.


Contributor

@coderabbitai coderabbitai bot left a comment


🧹 Nitpick comments (1)
test/nim.test.js (1)

200-246: Consider adding an explicit “no recommended model” edge-case test.

You already verify the positive recommendation path well; adding one case where all fitting models exceed 90% VRAM would lock in fallback semantics and prevent future regressions.

Suggested test addition
   it("recommended model fits within 90% VRAM", () => {
     const vram = 32000;
     const models = nim.suggestModelsForGpu({ totalMemoryMB: vram, nimCapable: true });
     const rec = models.find((m) => m.recommended);
     if (rec) {
       assert.ok(rec.minGpuMemoryMB <= vram * 0.9,
         `recommended model (${rec.minGpuMemoryMB} MB) should fit within 90% of ${vram} MB`);
     }
   });

+  it("can return no recommended model when all fitting models exceed 90% VRAM", () => {
+    const models = nim.suggestModelsForGpu({ totalMemoryMB: 8000, nimCapable: true });
+    const recommended = models.filter((m) => m.recommended);
+    assert.equal(recommended.length, 0);
+  });
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test/nim.test.js` around lines 200 - 246, Add a test that verifies the edge
case where no model should be marked recommended: call nim.suggestModelsForGpu
with nimCapable: true and a totalMemoryMB value chosen so that the function
returns one or more fitting models but every returned model has minGpuMemoryMB >
0.9 * totalMemoryMB, then assert that models.filter(m => m.recommended).length
=== 0 (and optionally assert the returned list is non-empty) to lock in the
fallback semantics when no candidate fits within 90% VRAM; reference the tested
function nim.suggestModelsForGpu and the returned model fields minGpuMemoryMB
and recommended when adding this test.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 3c8e8f6a-6cf9-47bc-8a66-b4c7638b79ae

📥 Commits

Reviewing files that changed from the base of the PR and between 7d4c8e4 and 2741afd.

📒 Files selected for processing (3)
  • bin/lib/nim.js
  • bin/lib/onboard.js
  • test/nim.test.js

@brianwtaylor
Contributor Author

Consolidated into #537.

@wscurran wscurran added the labels enhancement: feature, NemoClaw CLI, and priority: medium on Mar 23, 2026
@wscurran
Contributor

Thanks for adding the suggestModelsForGpu function to nim.js and integrating model suggestions into the onboarding flow, which should improve the user experience by defaulting to the recommended model.
