
[debug] Instrument ViewerGL.get_frame for fully-black diagnosis #5540

Closed
hujc7 wants to merge 2 commits into isaac-sim:develop from hujc7:jichuanh/viz-h3-instrument-getframe

hujc7 (Collaborator) commented May 8, 2026

Diagnostic-only PR. Adds a conftest.py that logs FBO/PBO/CUDA-GL state during the failing viewergl_rgb_motion test.

The viewergl test has been failing deterministically on develop with 'Viewer frame appears fully black' since 2026-05-07 17:45 UTC. The hypothesis that PR #5521 caused the failure was disproven by #5539. Within the same Kit session, tiled_camera (Kit RTX) passes while viewergl (Newton pyglet/EGL) fails, so the bug appears specific to Newton ViewerGL after Kit init.

This patch adds a direct CPU readback of the FBO (bypassing PBO/CUDA) to tell whether the FBO is empty (rendering did not happen) or the CUDA-GL interop silently zeros the readback.

DO NOT MERGE - diagnostic only.

Adds a conftest that monkey-patches Newton's ViewerGL.get_frame to log:
- env (python/warp/newton/pyglet)
- FBO id, PBO id, viewport size
- glGetError before/after
- direct CPU readback of the FBO (independent of the PBO/CUDA path)
- result of the PBO/CUDA path
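
The monkey-patching pattern described above can be sketched roughly as follows. This is not the PR's actual conftest: `FakeViewer`, `summarize`, `instrument`, and the `to_bytes` converter are hypothetical names introduced for illustration, and the sketch covers only the result-logging step, not the GL state queries or the direct FBO readback.

```python
import functools


def summarize(buf):
    """Return (nonzero_count, total, max_value) for a bytes-like pixel buffer."""
    data = bytes(buf)
    return len(data) - data.count(0), len(data), max(data, default=0)


def instrument(cls, method_name, to_bytes, log=print):
    """Monkey-patch cls.method_name so each call logs stats of its result."""
    original = getattr(cls, method_name)

    @functools.wraps(original)
    def wrapper(self, *args, **kwargs):
        result = original(self, *args, **kwargs)
        nonzero, total, peak = summarize(to_bytes(result))
        log(f"[VIZDIAG] PBO-result: nonzero={nonzero}/{total}  max={peak}")
        return result  # the patched method's return value is unchanged

    setattr(cls, method_name, wrapper)
    return original  # keep a handle so the patch can be undone


class FakeViewer:
    """Hypothetical stand-in for ViewerGL: returns an all-black 600x600 RGB frame."""

    def get_frame(self):
        return bytes(600 * 600 * 3)


lines = []
instrument(FakeViewer, "get_frame", to_bytes=bytes, log=lines.append)
frame = FakeViewer().get_frame()
```

In a real conftest the wrapper would presumably be installed on the actual viewer class before the test runs, with `to_bytes` replaced by whatever conversion the real frame type needs; both of those details are assumptions here.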

hujc7 (Collaborator, Author) commented May 8, 2026

Data captured. The diagnostic confirmed that the FBO itself is empty: a direct CPU glReadPixels of the FBO returns all zeros, with GL_NO_ERROR before and after, identical to the result of the PBO/CUDA path. This rules out CUDA-GL interop, sync, and context currency as the root cause:

[VIZDIAG] fbo=c_uint(8)  pbo=None  size=600x600
[VIZDIAG] glGetError before: GL_NO_ERROR
[VIZDIAG] CPU-readback: nonzero=0/1080000  max=0  err=GL_NO_ERROR
[VIZDIAG] PBO-result: nonzero=0/1080000  max=0

The Newton pyglet/EGL surface is attached but inert under Kit 110.1.1: render() deposits no pixels into the FBO. Investigation continues without this PR. The skip is shipping in #5538.
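
The reasoning applied to the log above can be written down as a small decision rule. The sketch below is illustrative only: `parse_nonzero`, `classify`, and the verdict strings are hypothetical names, and the log text is copied from the captured output.

```python
import re

LOG = """\
[VIZDIAG] fbo=c_uint(8)  pbo=None  size=600x600
[VIZDIAG] glGetError before: GL_NO_ERROR
[VIZDIAG] CPU-readback: nonzero=0/1080000  max=0  err=GL_NO_ERROR
[VIZDIAG] PBO-result: nonzero=0/1080000  max=0
"""


def parse_nonzero(log, tag):
    """Extract (nonzero, total) from the '[VIZDIAG] <tag>: nonzero=a/b' line."""
    match = re.search(rf"{re.escape(tag)}: nonzero=(\d+)/(\d+)", log)
    if match is None:
        raise ValueError(f"no '{tag}' line found in log")
    return int(match.group(1)), int(match.group(2))


def classify(cpu_nonzero, pbo_nonzero, gl_error="GL_NO_ERROR"):
    """Map the two readback results onto the hypotheses under test."""
    if gl_error != "GL_NO_ERROR":
        return "gl-error"  # the readback itself cannot be trusted
    if cpu_nonzero == 0 and pbo_nonzero == 0:
        return "fbo-empty"  # nothing was rendered; interop is not at fault
    if cpu_nonzero > 0 and pbo_nonzero == 0:
        return "interop-zeroed"  # pixels exist but the PBO/CUDA path lost them
    return "ok"


cpu_nonzero, total = parse_nonzero(LOG, "CPU-readback")
pbo_nonzero, _ = parse_nonzero(LOG, "PBO-result")
verdict = classify(cpu_nonzero, pbo_nonzero)
```

With the captured values both paths report zero nonzero bytes out of 1080000 (600 x 600 x 3), which is the "fbo-empty" branch.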

hujc7 closed this May 8, 2026
AntoineRichard pushed a commit that referenced this pull request May 8, 2026
#5538)

## Summary

Two unrelated CI breakages on develop, bundled here so develop turns
green in one PR.

### 1. Skip the failing viewergl test

`test_cartpole_newton_visualizer_viewergl_rgb_motion[physx,newton]`
started returning all-black frames on develop after
`nvcr.io/nvidian/isaac-sim:latest-develop` flipped to a Kit 110.1.1 +
USD 25.11 base. The failure has been deterministic across multiple PRs
(#5523, #5495, #5408, …).

Investigation so far has ruled out:
- PR #5521 (the revert in #5539 still failed)
- Newton 1.0 → 1.2.0rc2 viewer code regression (the change is only a
7-line addition; ViewerGL on its own yields 1.08M nonzero pixels)
- warp 1.12 → 1.13 RegisteredGLBuffer ABI (byte-identical)
- Module-load side effects of `isaaclab_physx.renderers`
- CUDA-GL interop (PR #5540 diagnostic confirms direct CPU FBO readback
also returns zeros, with `GL_NO_ERROR`)
- GL context-currency (PR #5541 H6 attempt: still fails)
- GL/CUDA sync (PR #5542 H4 attempt: still fails)

Diagnostic output (PR #5540 v2):
```
[VIZDIAG] fbo=c_uint(8)  pbo=None  size=600x600
[VIZDIAG] glGetError before: GL_NO_ERROR
[VIZDIAG] CPU-readback: nonzero=0/1080000  max=0  err=GL_NO_ERROR
[VIZDIAG] PBO-result: nonzero=0/1080000  max=0
```

The FBO itself is empty: Newton's pyglet/EGL renderer is not depositing
pixels under Kit 110.1.1, even though `tiled_camera_rgb_non_black` (Kit
RTX path) on the same env passes. The underlying root cause is still
being investigated; this PR ships the skip to unblock develop.

### 2. Fix warp intersphinx 404 in docs build

`https://nvidia.github.io/warp/objects.inv` started returning 404:
Warp's `objects.inv` now lives only at `/stable/` and `/latest/`. With
Sphinx's `warnings_treated_as_errors`, the broken intersphinx fetch
fails the docs build on every PR. This PR pins the URL to `/stable/`,
matching the existing PyTorch `/docs/2.11/` workaround pattern in the
same file.

Verified `https://nvidia.github.io/warp/stable/objects.inv` returns 200.
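
For illustration, a pinned intersphinx entry in a Sphinx `conf.py` might look roughly like the excerpt below. The `"warp"` key is an assumed name, not taken from the repository; only the `/stable/` base URL comes from the description above.

```python
# Sphinx conf.py excerpt (sketch). The "warp" key is an assumed name.
intersphinx_mapping = {
    # With None as the second element, Sphinx fetches objects.inv from the
    # base URL, i.e. https://nvidia.github.io/warp/stable/objects.inv.
    "warp": ("https://nvidia.github.io/warp/stable/", None),
}
```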

## Test plan

- [x] CI `isaaclab_visualizers` on this branch — was passing earlier
with the skip; will re-verify with the bundled docs fix
- [ ] CI `Build Latest Docs` on this branch — must turn green (was
failing on every recent PR before this fix)

## Re-enable plan

Once the underlying viewergl bug is identified and fixed, drop the
`@pytest.mark.skip` decorator and remove the
`jichuanh-disable-viewergl-flaky.skip` fragment.
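
As a rough sketch of what gets dropped on re-enable, the skip presumably looks something like the snippet below. The reason string, the empty signature, and the placeholder body are illustrative assumptions; the real test is parametrized over `[physx,newton]`.

```python
import pytest


# Illustrative reason text; the wording in the actual repository may differ.
@pytest.mark.skip(reason="ViewerGL frames are fully black under Kit 110.1.1")
def test_cartpole_newton_visualizer_viewergl_rgb_motion():
    ...  # placeholder body; the real test renders frames and checks motion
```

Removing the decorator line is enough for pytest to collect and run the test again.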

Labels

isaac-lab Related to Isaac Lab team
