fix: onboard fails on GPUs with insufficient VRAM for local NIM#836
Merged
cv merged 2 commits into NVIDIA:main on Mar 30, 2026
Conversation
…Capable detectGpu() unconditionally set nimCapable=true for any NVIDIA GPU, even when no NIM model fits in the available VRAM. This caused onboard to pass --gpu to gateway and sandbox creation, which fails on systems where GPU passthrough is unavailable (e.g. WSL2) or the GPU has insufficient memory for any local model. Now nimCapable is only true when at least one NIM model in nim-images.json has minGpuMemoryMB <= totalMemoryMB. When false, onboard skips --gpu flags and auto-selects cloud inference, with a clear message explaining why. Tested on an 8GB VRAM GPU (RTX) where the smallest NIM model requires 8192MB — onboard now completes successfully using NVIDIA Cloud API.
cv approved these changes on Mar 30, 2026 (Contributor)
quanticsoul4772 pushed a commit to quanticsoul4772/NemoClaw that referenced this pull request on Mar 30, 2026: …Capable (NVIDIA#836)
Co-authored-by: Caleb de Leeuw <cdeleeuw@users.noreply.github.com>
Co-authored-by: Carlos Villela <cvillela@nvidia.com>
realkim93 added a commit to realkim93/NemoClaw that referenced this pull request on Apr 1, 2026:
Merge origin/main to resolve conflicts from recent changes:
- NVIDIA#1208 core blocker lifecycle regressions
- NVIDIA#1200 Prettier formatting
- NVIDIA#836 GPU VRAM checks
Jetson detection now leverages main's UNIFIED_MEMORY_GPU_TAGS (Orin/Thor/Xavier) with an added jetson flag and /proc/device-tree fallback. All 118 tests pass.
realkim93 added a commit to realkim93/NemoClaw that referenced this pull request on Apr 1, 2026:
Merge origin/main into feat/jetson-orin-nano-support to resolve conflicts from recent changes (NVIDIA#1208, NVIDIA#1200, NVIDIA#836, NVIDIA#1221, NVIDIA#1223). Jetson detection now leverages main's UNIFIED_MEMORY_GPU_TAGS with an added jetson flag and /proc/device-tree fallback. All 116 tests pass.
laitingsheng pushed a commit that referenced this pull request on Apr 2, 2026: …Capable (#836)
lakamsani pushed a commit to lakamsani/NemoClaw that referenced this pull request on Apr 4, 2026: …Capable (NVIDIA#836)
gemini2026 pushed a commit to gemini2026/NemoClaw that referenced this pull request on Apr 14, 2026: …Capable (NVIDIA#836)
Problem
`nemoclaw onboard` fails at sandbox creation. This happens because `detectGpu()` sets `nimCapable: true` for any NVIDIA GPU, regardless of VRAM. The `--gpu` flag is then passed to both `openshell gateway start` and `openshell sandbox create`, even when no NIM model fits in the available VRAM and the user selects cloud inference. The sandbox silently fails to create (the error is piped through awk), but the CLI reports success. Then step 7 (policy presets) fails with "sandbox not found", leaving the user stuck.
Affects all consumer NVIDIA GPUs with <40GB VRAM (RTX 3060, 4060, 4070, etc.) and WSL2 environments where GPU passthrough to nested k3s containers is unavailable.
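The pre-fix behavior can be sketched as follows. This is a hypothetical reconstruction, not the actual `detectGpu()` from `bin/lib/nim.js`; the field and parameter shapes here are assumptions for illustration:

```javascript
// Hypothetical sketch of the pre-fix detectGpu() behavior described above.
// Field names beyond nimCapable/totalMemoryMB are illustrative assumptions.
function detectGpuOld(vendor, totalMemoryMB) {
  return {
    vendor,
    totalMemoryMB,
    // The bug: VRAM is never consulted, so any NVIDIA GPU is marked capable
    // and onboard goes on to pass --gpu to gateway and sandbox creation.
    nimCapable: vendor === "nvidia",
  };
}

const gpu = detectGpuOld("nvidia", 8188); // 8 GB card, below every NIM model's minimum
console.log(gpu.nimCapable); // true, even though no local model fits
```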
Fix
- `bin/lib/nim.js`: `nimCapable` now checks whether at least one NIM model in `nim-images.json` has `minGpuMemoryMB <= totalMemoryMB`.
- `bin/lib/onboard.js`: added an informational message during preflight when a GPU is detected but too small for local NIM.

Testing
Tested on WSL2 (Ubuntu) with an 8GB VRAM RTX GPU where the smallest NIM model requires 8192 MB.
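The new check can be sketched and exercised against the 8 GB case above. This is a hypothetical reconstruction: the `nim-images.json` entries and the reported VRAM figure below are assumptions (real cards typically report slightly under their nominal size, which is why an "8GB" GPU fails an 8192 MB minimum):

```javascript
// Hypothetical sketch of the fixed nimCapable check: true only when at least
// one NIM model in nim-images.json fits in the reported VRAM.
// The model list below is illustrative, not the real registry contents.
const nimImages = [
  { name: "nim-model-small", minGpuMemoryMB: 8192 },
  { name: "nim-model-large", minGpuMemoryMB: 24576 },
];

function isNimCapable(totalMemoryMB, models = nimImages) {
  return models.some((m) => m.minGpuMemoryMB <= totalMemoryMB);
}

// An "8GB" card usually reports slightly under 8192 MB, so nothing fits and
// onboard skips --gpu and falls back to cloud inference.
console.log(isNimCapable(8188));  // false
console.log(isNimCapable(24576)); // true: the large model fits exactly
```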
Before: Onboard fails every time at step 3 — sandbox never created, policies can't apply.
After: Onboard completes successfully. The gateway starts without `--gpu`, the sandbox is created as CPU-only, cloud API inference works, and the agent responds.