fix: skip --gpu on WSL2 where GPU passthrough to k3s is unsupported #209
mattezell wants to merge 1 commit into NVIDIA:main from
Conversation
On WSL2, `nvidia-smi` works at the host layer but the GPU cannot be passed through to the k3s cluster inside the OpenShell gateway container (a Docker Desktop limitation). This causes `nemoclaw onboard` to create dead sandboxes that immediately report "not found".

- Add `isWSL2()` detection via `/proc/version` in `bin/lib/nim.js`
- Set `nimCapable: false` when WSL2 is detected (GPU visible but unusable)
- Add a WSL2 info message during onboard preflight
- Fix the `scripts/setup.sh` legacy path with the same WSL2 check
- Add `test/wsl2.test.js`

Tested on WSL2 Ubuntu 24.04 + Docker Desktop + RTX 5090 Laptop. No behavior change on native Linux, macOS, or DGX.

Fixes NVIDIA#208

Signed-off-by: Matt Ezell <ezell.matt@gmail.com>
Noticed #140 describes the same WSL2 symptoms. PR #229 fixes the error-masking side (an awk pipe swallowing exit codes), which is a solid improvement for all platforms. This PR fixes the upstream root cause on WSL2 specifically: it prevents `--gpu` from being passed when the GPU can't actually reach the container runtime. The two fixes are complementary: #229 ensures failures surface clearly, while this PR prevents the failure from occurring on WSL2 in the first place.
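The masking that #229 addresses can be reproduced with a generic pipeline (an illustrative demo, not the project's actual script):

```shell
# A pipeline's exit status is the LAST command's status, so awk at the
# end of the pipe hides the failure of the command feeding it:
false | awk '{ print }'
echo "without pipefail: $?"

# With pipefail (bash/zsh/ksh), the failing producer surfaces instead:
( set -o pipefail; false | awk '{ print }'; echo "with pipefail: $?" )
```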
Nice catch on the WSL2 GPU passthrough issue, @mattezell! That's a real pain point for folks on that platform. Just wanted to flag that the codebase has changed a fair amount since this was opened: we've introduced CI checks and landed several new features. When you get a chance, would you be able to rebase against the latest main? That'll make it much easier for us to review and merge. Thanks!
@cv thanks. It looks like the issue I filed has been closed, with a later-submitted PR selected as the fix, so I will go ahead and close this out.
…l inference (NVIDIA#209)

* feat(inference): add sandbox-system inference route for platform-level inference

  Add a separate system-level inference route that the sandbox supervisor can use in-process for platform functions (e.g., an embedded agent harness for policy analysis), distinct from the user-facing `inference.local` endpoint. The system route is accessed via an in-process API on the supervisor, ensuring userland code in the sandbox netns cannot reach it.

  - Extend proto with `route_name` fields on Set/Get inference messages
  - Add `ResolvedRoute.name` field to the router for route segregation
  - Server resolves both user and sandbox-system routes in bundles
  - Sandbox partitions routes into user/system caches on refresh
  - Expose `InferenceContext::system_inference()` in-process API
  - CLI `--sandbox` flag targets the system route on set/get/update
  - Integration tests using `mock://` routes for the full in-process path

  Closes NVIDIA#207

* refactor(cli): rename `--sandbox` flag to `--system` for inference commands

  The `--sandbox` flag could be misread as targeting a user-level sandbox operation. Rename to `--system` to clearly indicate it configures the platform-level system inference route.

* style(cli): collapse short if-else per rustfmt 1.94

Co-authored-by: John Myers <johntmyers@users.noreply.github.com>
Problem
`nemoclaw onboard` forces `--gpu` on both `openshell gateway start` and `openshell sandbox create` whenever `nvidia-smi` is detected. On WSL2 with Docker Desktop, `nvidia-smi` works at the host layer but the GPU cannot be passed through to the k3s cluster inside the OpenShell gateway container. The result is a dead sandbox that immediately reports "not found" on every subsequent command.

There is no `--no-gpu` flag, environment variable, or config option to override this.

Affected users: everyone on WSL2 with any NVIDIA GPU.
Confirmed on: WSL2 Ubuntu 24.04 + Docker Desktop + RTX 5090 Laptop
Fixes #208
Solution
Detect WSL2 via `/proc/version` (which contains "microsoft" or "WSL" on WSL2 kernels) and set `nimCapable: false`. The existing `if (gpu && gpu.nimCapable)` guards in `onboard.js` (lines 116 and 187) then automatically skip `--gpu`. Cloud inference continues to work normally through the OpenShell proxy.

Changes
- `bin/lib/nim.js`: add `isWSL2()` helper; set `nimCapable: false` and `wsl2: true` on WSL2
- `bin/lib/onboard.js`: WSL2 info message during onboard preflight
- `scripts/setup.sh`: same WSL2 check for the legacy path
- `test/wsl2.test.js`: cover `isWSL2()` and `detectGpu()` WSL2 awareness

What does NOT change
- Native Linux, macOS, and DGX: `nimCapable` stays `true`, no behavior change
- Hosts already detected as `nimCapable: false`: no behavior change

Testing
```
node --test test/preflight.test.js test/wsl2.test.js   # 18 pass, 0 fail
```

Manual verification on WSL2 Ubuntu 24.04 + RTX 5090:
`nemoclaw onboard` completes, the sandbox stays alive, and `openclaw tui` connects via cloud inference.