feat: recommend optimal NIM model based on detected GPU VRAM #542
brianwtaylor wants to merge 1 commit into NVIDIA:main from
Conversation
📝 Walkthrough
This PR introduces GPU model recommendation by refactoring GPU detection to support dependency injection, adding GPU name enrichment for NVIDIA devices, and creating a new model-suggestion helper.
Changes
Sequence Diagram
sequenceDiagram
participant User
participant Onboarding as Onboarding Flow
participant GpuDetect as detectGpu(opts)
participant ModelSuggest as suggestModelsForGpu()
participant ModelDB as Model Database
User->>Onboarding: Start NIM setup
Onboarding->>GpuDetect: Detect GPU (with injected runCapture)
GpuDetect->>GpuDetect: Query nvidia-smi for name & memory
GpuDetect-->>Onboarding: {totalMemoryMB, nimCapable, name?, spark?}
Onboarding->>ModelSuggest: suggestModelsForGpu(gpu)
ModelSuggest->>ModelDB: Fetch available NIM models
ModelSuggest->>ModelSuggest: Filter by minGpuMemoryMB ≤ totalMemoryMB
ModelSuggest->>ModelSuggest: Sort by memory descending
ModelSuggest->>ModelSuggest: Mark first ≤90% fit as recommended
ModelSuggest-->>Onboarding: [models] with recommended flags
alt Interactive Mode
Onboarding->>User: Show models with (recommended) tags
User->>Onboarding: Select model or use default
else Non-Interactive Mode
Onboarding->>Onboarding: Auto-select recommended model
end
Onboarding-->>User: Model selected & NIM starts
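The filter → sort → recommend steps in the diagram can be sketched as follows. This is a minimal illustration, not the PR's actual implementation; the field names `minGpuMemoryMB` and `recommended` are taken from the test snippets in this review, while the example catalog is hypothetical:

```javascript
// Minimal sketch of the suggestion logic described in the sequence diagram.
// Each catalog entry needs a `minGpuMemoryMB`; the GPU object mirrors the
// detectGpu() return shape shown above.
function suggestModelsForGpu(gpu, catalog) {
  if (!gpu || !gpu.nimCapable) return [];

  // 1. Keep only models whose minimum VRAM fits the detected GPU.
  const fitting = catalog.filter((m) => m.minGpuMemoryMB <= gpu.totalMemoryMB);

  // 2. Sort by memory requirement, largest first.
  fitting.sort((a, b) => b.minGpuMemoryMB - a.minGpuMemoryMB);

  // 3. Mark the first model that also fits within 90% of VRAM as recommended,
  //    leaving headroom for runtime overhead.
  let recommendedSet = false;
  return fitting.map((m) => {
    const recommended =
      !recommendedSet && m.minGpuMemoryMB <= gpu.totalMemoryMB * 0.9;
    if (recommended) recommendedSet = true;
    return { ...m, recommended };
  });
}

// Example: a 24 GB GPU against a hypothetical catalog.
const models = suggestModelsForGpu(
  { totalMemoryMB: 24000, nimCapable: true },
  [
    { name: "small", minGpuMemoryMB: 8000 },
    { name: "medium", minGpuMemoryMB: 16000 },
    { name: "large", minGpuMemoryMB: 23000 }, // fits, but exceeds 90% of 24000
  ]
);
console.log(models.map((m) => `${m.name}${m.recommended ? " (recommended)" : ""}`));
// → [ 'large', 'medium (recommended)', 'small' ]
```

Note how the largest fitting model (`large`) is listed first but not recommended, because it would use more than 90% of VRAM; the recommendation falls to the next model down.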
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~20 minutes
🚥 Pre-merge checks: ✅ Passed checks (5 passed)
🧹 Nitpick comments (1)
test/nim.test.js (1)
200-246: Consider adding an explicit "no recommended model" edge-case test. You already verify the positive recommendation path well; adding one case where all fitting models exceed 90% VRAM would lock in fallback semantics and prevent future regressions.
Suggested test addition
```diff
 it("recommended model fits within 90% VRAM", () => {
   const vram = 32000;
   const models = nim.suggestModelsForGpu({ totalMemoryMB: vram, nimCapable: true });
   const rec = models.find((m) => m.recommended);
   if (rec) {
     assert.ok(
       rec.minGpuMemoryMB <= vram * 0.9,
       `recommended model (${rec.minGpuMemoryMB} MB) should fit within 90% of ${vram} MB`
     );
   }
 });
+it("can return no recommended model when all fitting models exceed 90% VRAM", () => {
+  const models = nim.suggestModelsForGpu({ totalMemoryMB: 8000, nimCapable: true });
+  const recommended = models.filter((m) => m.recommended);
+  assert.equal(recommended.length, 0);
+});
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@test/nim.test.js` around lines 200 - 246, Add a test that verifies the edge case where no model should be marked recommended: call nim.suggestModelsForGpu with nimCapable: true and a totalMemoryMB value chosen so that the function returns one or more fitting models but every returned model has minGpuMemoryMB > 0.9 * totalMemoryMB, then assert that models.filter(m => m.recommended).length === 0 (and optionally assert the returned list is non-empty) to lock in the fallback semantics when no candidate fits within 90% VRAM; reference the tested function nim.suggestModelsForGpu and the returned model fields minGpuMemoryMB and recommended when adding this test.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 3c8e8f6a-6cf9-47bc-8a66-b4c7638b79ae
📒 Files selected for processing (3)
bin/lib/nim.js, bin/lib/onboard.js, test/nim.test.js
Consolidated into #537.

Thanks for adding the
Summary
- Add `suggestModelsForGpu(gpu)` to nim.js — ranks NIM models by fit for the detected GPU, marks the largest model using ≤90% VRAM as recommended
- Refactor GPU detection for dependency injection (`detectGpu(opts)`) for testability

Motivation
Users frequently select NIM models too large for their GPU, causing OOM failures during startup. This is especially common on DGX Spark where the unified memory architecture makes VRAM limits non-obvious.
- Related to #404 — Jetson Orin Nano: `detectGpu()` returns null on unified memory; the DI refactor and fallback-path improvements here help address this
- Related to #511 — Jetson AGX (aarch64): installation failures include GPU detection as a contributing factor
Test plan
Automated Tests