
fix: onboard fails on GPUs with insufficient VRAM for local NIM#836

Merged
cv merged 2 commits into NVIDIA:main from
CalebDeLeeuwMisfits:fix/nimcapable-vram-check
Mar 30, 2026

Conversation

Contributor

@CalebDeLeeuwMisfits CalebDeLeeuwMisfits commented Mar 24, 2026

Problem

nemoclaw onboard fails at sandbox creation with:

GPU sandbox requested, but the active gateway has no allocatable GPUs.

This happens because detectGpu() sets nimCapable: true for any NVIDIA GPU, regardless of VRAM. The --gpu flag is then passed to both openshell gateway start and openshell sandbox create — even when no NIM model fits in the available VRAM and the user has selected cloud inference.

The sandbox silently fails to create (the error is piped through awk), but the CLI reports success. Then step 7 (policy presets) fails with "sandbox not found", leaving the user stuck.

Affects all consumer NVIDIA GPUs with <40GB VRAM (RTX 3060, 4060, 4070, etc.) and WSL2 environments where GPU passthrough to nested k3s containers is unavailable.

Fix

bin/lib/nim.js — nimCapable now checks whether at least one NIM model in nim-images.json has minGpuMemoryMB <= totalMemoryMB:

const canRunNim = nimImages.models.some((m) => m.minGpuMemoryMB <= totalMemoryMB);
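To make the check concrete, here is a minimal, self-contained sketch of the capability test. The model entries below are hypothetical examples (only the 8192 MB minimum is taken from this PR); the real detectGpu() in bin/lib/nim.js reads VRAM from the driver and models from nim-images.json.

```javascript
// Hypothetical model catalog shaped like nim-images.json; names and the
// 40960 MB entry are illustrative, not the project's actual catalog.
const nimImages = {
  models: [
    { name: "example-small-model", minGpuMemoryMB: 8192 },
    { name: "example-large-model", minGpuMemoryMB: 40960 },
  ],
};

// Capable only if at least one model's minimum VRAM fits in the detected total.
function isNimCapable(totalMemoryMB, models) {
  return models.some((m) => m.minGpuMemoryMB <= totalMemoryMB);
}

console.log(isNimCapable(8188, nimImages.models));  // 8188 < 8192 → false
console.log(isNimCapable(24576, nimImages.models)); // 24576 >= 8192 → true
```

An 8 GB consumer card typically reports slightly under 8192 MB (8188 MB in the author's test), which is exactly why the strict `<=` comparison matters here.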

bin/lib/onboard.js — Added an informational message during preflight when GPU is detected but too small for local NIM:

✓ NVIDIA GPU detected: 1 GPU(s), 8188 MB VRAM
ⓘ GPU VRAM too small for local NIM — will use cloud inference
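The preflight branch can be sketched as follows. Function and field names here are assumptions for illustration; the actual implementation lives in bin/lib/onboard.js and its shape may differ.

```javascript
// Hypothetical preflight reporter; `gpu` mirrors the detectGpu() result
// described in this PR ({ count, totalMemoryMB, nimCapable }).
function reportGpuPreflight(gpu, log = console.log) {
  if (!gpu) return;
  log(`✓ NVIDIA GPU detected: ${gpu.count} GPU(s), ${gpu.totalMemoryMB} MB VRAM`);
  if (!gpu.nimCapable) {
    // No NIM model fits in VRAM: onboard skips --gpu and uses cloud inference.
    log("ⓘ GPU VRAM too small for local NIM — will use cloud inference");
  }
}

reportGpuPreflight({ count: 1, totalMemoryMB: 8188, nimCapable: false });
```

Emitting the note during preflight, rather than failing later at sandbox creation, is what turns the silent step-3 failure into an expected, explained fallback.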

Testing

Tested on WSL2 (Ubuntu) with an 8GB VRAM RTX GPU where the smallest NIM model requires 8192 MB.

Before: Onboard fails every time at step 3 — sandbox never created, policies can't apply.
After: Onboard completes successfully — gateway starts without --gpu, sandbox creates as CPU-only, cloud API inference works, agent responds.

Summary by CodeRabbit

  • Bug Fixes

    • GPU capability detection now correctly validates VRAM requirements against model specifications, rather than assuming any detected GPU is suitable for local execution.
  • New Features

    • Informational message now displays when a GPU lacks sufficient VRAM, notifying users that cloud inference will be used instead.

…Capable

detectGpu() unconditionally set nimCapable=true for any NVIDIA GPU,
even when no NIM model fits in the available VRAM. This caused
onboard to pass --gpu to gateway and sandbox creation, which fails
on systems where GPU passthrough is unavailable (e.g. WSL2) or
the GPU has insufficient memory for any local model.

Now nimCapable is only true when at least one NIM model in
nim-images.json has minGpuMemoryMB <= totalMemoryMB. When false,
onboard skips --gpu flags and auto-selects cloud inference,
with a clear message explaining why.

Tested on an 8GB VRAM GPU (RTX) where the smallest NIM model
requires 8192MB — onboard now completes successfully using
NVIDIA Cloud API.
@coderabbitai
Contributor

coderabbitai bot commented Mar 30, 2026

Caution

Review failed

The pull request is closed.

ℹ️ Recent review info
⚙️ Run configuration
  • Configuration used: Path: .coderabbit.yaml
  • Review profile: CHILL
  • Plan: Pro
  • Run ID: 7803fc0a-8b62-4471-996c-d22962b6cf7e

📥 Commits

Reviewing files that changed from the base of the PR and between bc509b9 and 2b5603d.

📒 Files selected for processing (2)
  • bin/lib/nim.js
  • bin/lib/onboard.js

📝 Walkthrough

Walkthrough

Updated GPU capability detection in NVIDIA path to conditionally set nimCapable based on whether available VRAM meets model requirements. Added informational logging in preflight flow when local NIM is unsuitable due to insufficient VRAM.

Changes

Cohort / File(s) | Summary
  • GPU Capability Detection — bin/lib/nim.js: Modified detectGpu() NVIDIA path to compute nimCapable by checking whether at least one model in nimImages.models fits in available VRAM (minGpuMemoryMB <= totalMemoryMB), replacing the unconditional true assignment.
  • Preflight Logging — bin/lib/onboard.js: Added an informational log in the GPU detection flow when gpu.nimCapable is false, noting that local NIM is unsuitable due to insufficient VRAM and cloud inference will be used.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~12 minutes

Poem

🐰 GPU memory checked with care,
VRAM requirements laid bare,
When models won't fit locally tight,
Cloud inference shines so bright!
A rabbit's code, precise and fair ✨


@cv cv merged commit 805a958 into NVIDIA:main Mar 30, 2026
1 check was pending
quanticsoul4772 pushed a commit to quanticsoul4772/NemoClaw that referenced this pull request Mar 30, 2026
…Capable (NVIDIA#836)

Co-authored-by: Caleb de Leeuw <cdeleeuw@users.noreply.github.com>
Co-authored-by: Carlos Villela <cvillela@nvidia.com>
realkim93 added a commit to realkim93/NemoClaw that referenced this pull request Apr 1, 2026
Merge origin/main to resolve conflicts from recent changes:
- NVIDIA#1208 core blocker lifecycle regressions
- NVIDIA#1200 Prettier formatting
- NVIDIA#836 GPU VRAM checks

Jetson detection now leverages main's UNIFIED_MEMORY_GPU_TAGS
(Orin/Thor/Xavier) with added jetson flag and /proc/device-tree
fallback. All 118 tests pass.
realkim93 added a commit to realkim93/NemoClaw that referenced this pull request Apr 1, 2026
Merge origin/main into feat/jetson-orin-nano-support to resolve
conflicts from recent changes (NVIDIA#1208, NVIDIA#1200, NVIDIA#836, NVIDIA#1221, NVIDIA#1223).

Jetson detection now leverages main's UNIFIED_MEMORY_GPU_TAGS
with added jetson flag and /proc/device-tree fallback.

All 116 tests pass.
laitingsheng pushed a commit that referenced this pull request Apr 2, 2026
…Capable (#836)

lakamsani pushed a commit to lakamsani/NemoClaw that referenced this pull request Apr 4, 2026
…Capable (NVIDIA#836)

gemini2026 pushed a commit to gemini2026/NemoClaw that referenced this pull request Apr 14, 2026
…Capable (NVIDIA#836)


3 participants