[BUG FIX] vis/rasterizer: defer OffscreenRenderer creation until first render (#31)
Open
gpinkert wants to merge 2 commits into amd-integration from lazy-offscreen-renderer
Conversation
Previously, `Rasterizer.build()` eagerly constructed a
`pyrender.OffscreenRenderer` whenever no interactive viewer was active,
which is the path taken by every `scene.build()` call with
`show_viewer=False`. Constructing the OffscreenRenderer immediately calls
`EGLPlatform.init_context()` (genesis/ext/pyrender/platforms/egl.py), which
in turn calls `eglInitialize` on a Mesa/`radeonsi` display.
This is fine when a single test process drives the GPU sequentially, but
it has two problems on the AMD/ROCm test setup:
1. It creates an EGL/GL context for *every* `scene.build()`, even when the
test never renders. The vast majority of the rigid-physics test suite
never instantiates a camera or calls `scene.render()` / `cam.render()`,
so the GL context, FBO, depth buffer, and shader compiles are pure
per-test overhead.
2. When the test runner uses `pytest-xdist` with multiple workers on a
single AMD GPU, the concurrent `eglInitialize` / `radeonsi` context
creations contend on the driver and reliably fail with:
       radeonsi: error: can't create eop_bug_scratch
       radeonsi: error: Failed to create a context.
followed by a SIGSEGV inside `eglInitialize` (egl.py:223). The crash
manifests under any combination of `-n N` and `-n N --forked` because
`pytest-forked` runs each test inside a fresh fork that re-enters
`scene.build()` and races against the other workers' GL initialisations.
The fix moves the OffscreenRenderer construction out of `build()` and
into a new `_ensure_renderer()` helper that is invoked on the first
`render_camera()` call. With this change:
- Tests that never render (the dominant case) never touch EGL, so they
cannot hit the `radeonsi` context-creation race. This unblocks
`pytest -n N` parallel execution on a single AMD GPU.
- Tests that do render are unaffected behaviourally: the same
`OffscreenRenderer` is created the first time `render_camera()` runs,
and reused on subsequent calls. `make_current()` / `make_uncurrent()`,
resize handling, and `destroy()` all already null-check `self._renderer`,
so no further refactor was required.
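The deferred-construction pattern described above can be sketched as follows. This is a simplified illustration, not the actual `Rasterizer` implementation: `ExpensiveRenderer` is a hypothetical stand-in for `pyrender.OffscreenRenderer`, counting constructions where the real class would create an EGL/GL context.

```python
class ExpensiveRenderer:
    """Hypothetical stand-in for pyrender.OffscreenRenderer."""

    instances = 0  # counts constructions, where the real class would init EGL

    def __init__(self, width, height):
        ExpensiveRenderer.instances += 1
        self.width, self.height = width, height


class Rasterizer:
    """Simplified sketch of the lazy-renderer pattern from this PR."""

    def __init__(self, width=640, height=480):
        self._width, self._height = width, height
        self._renderer = None  # previously constructed eagerly in build()

    def build(self):
        # No GL/EGL work here any more: build() stays cheap for tests
        # that never render.
        pass

    def _ensure_renderer(self):
        # Construct the offscreen renderer on first use, then reuse it.
        if self._renderer is None:
            self._renderer = ExpensiveRenderer(self._width, self._height)
        return self._renderer

    def render_camera(self):
        return self._ensure_renderer()

    def destroy(self):
        # Null-safe teardown: works whether or not a renderer was ever made.
        self._renderer = None


rast = Rasterizer()
rast.build()
assert ExpensiveRenderer.instances == 0  # build() no longer touches the renderer
first = rast.render_camera()
second = rast.render_camera()
assert first is second  # created once on first render, then reused
```

The key property is that a test which only calls `build()` never constructs the expensive object at all, while rendering tests see the same create-once/reuse behaviour as before.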
The visualizer/rasterizer wiring is otherwise unchanged: cameras are
still added in `add_camera()`, `pyrender.Renderer` (per-camera target) is
still allocated up-front (it only touches the JIT context, not GL), and
`destroy()` continues to clean up whichever renderer was actually
materialised.
Result: `pytest tests/test_rigid_physics.py -n 8 -m required` now runs
to completion on a single-GPU AMD/ROCm host. End-to-end wall-time on the
required set drops from ~30 minutes (sequential `-n 0 --forked`) to
roughly 8 minutes, with the speedup limited by per-process kernel
compilation rather than EGL or GPU contention.
The CI tester scripts that invoke the Genesis test suite live in another
repo and pass `-n 0` (sequential) explicitly. Now that EGL is initialised
lazily and concurrent `scene.build()` calls no longer race on `radeonsi`
context creation (see prior commit), it is safe — and substantially
faster — to run the suite in parallel on a single AMD GPU.
This change rewrites `-n 0` to `-n 8` from inside the existing
`pytest_cmdline_main` hook so the parallel default takes effect without
any modification to the external tester scripts. The override is gated
on:
* `os.path.exists("/dev/kfd")` — the AMDGPU kernel driver device file,
so non-AMD hosts (NVIDIA, Apple, CPU-only) keep the user-supplied
value.
* `not show_viewer` — the immediately-preceding block already pins
`numprocesses` to 0 when the interactive viewer is requested, and
that decision must win.
* `config.option.numprocesses == 0` — explicit `-n N` for any non-zero
`N` is preserved verbatim, so debugging with `-n 1` or experimenting
with other worker counts still works.
`pytest-xdist` is already installed in `genesis:amd-integration` (via
the `[dev]` extras pulled in by `Dockerfile.rocm`), and the existing
`-n 0` invocations already depend on it being present, so no packaging
changes are needed for this hook to take effect.
Net effect: `pytest -n 0 --forked` issued from the upstream tester
scripts now runs with eight workers on AMD/ROCm, dropping the
required-test wall-time on `tests/test_rigid_physics.py` from roughly
30 minutes to under 10 minutes on a single-GPU host.
Collaborator: /run-ci
yaoliu13 (Collaborator) requested changes on Apr 27, 2026:
This PR is not based on the latest amd-integration: lazy-offscreen-renderer...ROCm:Genesis:amd-integration