Skip to content

Rendering correctness test determinism#5353

Merged
huidongc merged 3 commits into
isaac-sim:developfrom
huidongc:rendering-correctness-test-determinism
Apr 22, 2026
Merged

Rendering correctness test determinism#5353
huidongc merged 3 commits into
isaac-sim:developfrom
huidongc:rendering-correctness-test-determinism

Conversation

@huidongc
Copy link
Copy Markdown
Collaborator

@huidongc huidongc commented Apr 22, 2026

Description

For the rendering correctness test, we don't really need the env.reset() call to fill the camera output buffers. Instead, if we remove env.reset(), the camera output buffers will be filled on the invocation of camera.data:

    @property
    def data(self) -> CameraData:
        # update sensors if needed
        self._update_outdated_buffers()
        # return the data
        return self._data

This means we can remove env.reset() calls in the rendering correctness test to avoid non-deterministic initial pose. Articulation bodies will be at their default pose for rendering for all combos. With this removal I can set the max pixel diff threshold to smaller (stricter) values:

    "cartpole": 1.0,      # decreased from 2.0
    "shadow_hand": 3.0,   # decreased from 8.0
    "dexsuite_kuka": 4.0, # decreased from 10.0

The test will become more sensitive to capture rendering changes but hopefully it can still tolerate minor pixel noise.

Fixes # (issue)

Type of change

  • Test-only change

Checklist

  • I have read and understood the contribution guidelines
  • I have run the pre-commit checks with ./isaaclab.sh --format
  • I have made corresponding changes to the documentation
  • My changes generate no new warnings
  • I have added tests that prove my fix is effective or that my feature works
  • I have updated the changelog and the corresponding version in the extension's config/extension.toml file
  • I have added my name to the CONTRIBUTORS.md or my name already exists there

@github-actions github-actions Bot added the isaac-lab Related to Isaac Lab team label Apr 22, 2026
@greptile-apps
Copy link
Copy Markdown
Contributor

greptile-apps Bot commented Apr 22, 2026

Greptile Summary

This PR improves rendering correctness test determinism by removing the per-fixture env_cfg.seed and env.reset(seed=_ENV_SEED) calls (which introduced non-determinism by perturbing state before capture), centralizing per-environment pixel-difference thresholds in _MAX_DIFFERENT_PIXELS_PERCENTAGE_BY_ENV_NAME, and regenerating the golden images under the new deterministic baseline. The registered-task list is refactored from a flat list of strings to a list of (task_id, env_name) tuples so each task can pick the correct threshold.

Confidence Score: 5/5

Safe to merge — changes are test-only and tighten correctness thresholds under a more deterministic baseline.

All findings are P2 style suggestions. The core logic change (removing seed/reset calls and centralising thresholds) is sound and well-motivated. Golden images have been regenerated. No production code is affected.

No files require special attention; all changes are confined to the test suite and its golden images.

Important Files Changed

Filename Overview
source/isaaclab_tasks/test/test_rendering_correctness.py Removes env seed and reset calls from all fixtures; centralizes pixel-diff thresholds via _MAX_DIFFERENT_PIXELS_PERCENTAGE_BY_ENV_NAME; refactors _RENDER_CORRECTNESS_TASK_IDS to tuples; tightens thresholds. Minor KeyError risk if a new env_name tuple is added without updating the threshold dict.

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[Test fixture created\ne.g. cartpole_env] --> B[env constructed\nno seed / no reset]
    B --> C[Camera.data.output accessed\nbuffers updated on access]
    C --> D[_validate_camera_outputs called\nwith env_name threshold lookup]
    D --> E{env_name in\n_MAX_DIFFERENT_PIXELS_PERCENTAGE_BY_ENV_NAME?}
    E -- Yes --> F[Compare against golden image\nusing per-env threshold]
    E -- No --> G[KeyError at runtime]
    F --> H{diff_pct <= threshold?}
    H -- Pass --> I[Test passes]
    H -- Fail --> J[pytest.fail with diff details]
Loading

Reviews (2): Last reviewed commit: "smaller (stricter) threshold" | Re-trigger Greptile

Comment on lines 792 to 793
camera_outputs_nested_dict = _collect_camera_outputs(env)
num_camera_outputs = len(camera_outputs_nested_dict)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Camera data may be empty without a prior reset

gym.make() constructs the environment but, per the Gymnasium contract, sensor/observation data is not guaranteed to be populated until the first reset() call. If Isaac Sim defers filling camera buffers to reset(), then _collect_camera_outputs will return an empty dict and the assertion on line 794 (assert num_camera_outputs == 1) will fail with a confusing message rather than indicating the root cause.

The same pattern was removed from the other three env fixtures (shadow_hand_env, cartpole_env, dexsuite_kuka_allegro_lift_env). If those fixtures also rely on camera data being populated at construction time, any future environment that defers rendering until reset() will silently break here.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is fine. we will check whether camera outputs exist and contain non-zero pixels inside _validate_camera_outputs.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Camera output buffers will be updated on invocation of Camera.data():

    @property
    def data(self) -> CameraData:
        # update sensors if needed
        self._update_outdated_buffers()
        # return the data
        return self._data

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Got it, thanks for clarifying! I see now that _validate_camera_outputs has the assertion assert len(camera_outputs) > 0 plus the per-tensor non-zero check, so empty or blank outputs will fail clearly rather than silently passing. That addresses my concern. 👍

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the clarification! You're right — the Camera.data property's lazy update via _update_outdated_buffers() ensures buffers are populated on access, making the explicit reset() unnecessary for camera data retrieval in this context.

This is a nice pattern that decouples buffer population from the reset lifecycle. My concern about the Gymnasium contract was misplaced here since Isaac Lab's camera sensor implementation handles this internally. 👍

Comment thread source/isaaclab_tasks/test/test_rendering_correctness.py Outdated
@huidongc huidongc marked this pull request as draft April 22, 2026 06:17
@huidongc huidongc force-pushed the rendering-correctness-test-determinism branch from 55d512c to c308e1b Compare April 22, 2026 08:27
# needs to be large enough to tolerate minor rendering noise while small enough to catch unexpected changes.
_MAX_DIFFERENT_PIXELS_PERCENTAGE_BY_ENV_NAME = {
"cartpole": 1.0,
"shadow_hand": 3.5,
Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it was 8.0 before

# The value is set case by case based on the screen space taken up by the env in camera output images. It
# needs to be large enough to tolerate minor rendering noise while small enough to catch unexpected changes.
_MAX_DIFFERENT_PIXELS_PERCENTAGE_BY_ENV_NAME = {
"cartpole": 1.0,
Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it was 2.0 before

_MAX_DIFFERENT_PIXELS_PERCENTAGE_BY_ENV_NAME = {
"cartpole": 1.0,
"shadow_hand": 3.5,
"dexsuite_kuka": 4.5,
Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it was 10.0 before

@huidongc huidongc marked this pull request as ready for review April 22, 2026 09:15
@huidongc huidongc requested a review from pbarejko April 22, 2026 10:10
Copy link
Copy Markdown

@isaaclab-review-bot isaaclab-review-bot Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🤖 Isaac Lab Review Bot

Summary

This PR removes env.reset() calls from rendering correctness tests to eliminate non-deterministic initial articulation poses. Instead, camera buffers are filled via the lazy _update_outdated_buffers() mechanism when accessing camera.data. This allows tighter pixel difference thresholds (cartpole: 2.0→1.0, shadow_hand: 8.0→3.0, dexsuite_kuka: 10.0→4.0). The implementation is correct and well-reasoned.

Architecture Impact

Changes are self-contained to the test file. No cross-module impact — the _MAX_DIFFERENT_PIXELS_PERCENTAGE_BY_ENV_NAME dictionary centralizes thresholds and is only consumed within this test module.

Implementation Verdict

Ship it — Clean implementation with good code organization. The approach of relying on default articulation poses instead of randomized reset states is sound for rendering correctness tests.

Test Coverage

This PR is the test. The changes improve test reliability by removing a source of non-determinism. The lowered thresholds make the tests more sensitive to actual rendering regressions while the removal of env.reset() eliminates false positives from initial pose variance. The 71 updated golden images reflect the new deterministic default poses.

CI Status

CI shows 3 test failures, but none are caused by this PR:

  • Isaac-Cartpole-RGB-TheiaTiny-v0 failures in test_environments.py and test_environments_with_stage_in_memory.py — these are Theia vision model tests, unrelated to rendering correctness tests
  • Training benchmark failures (duration/reward thresholds) — flaky performance benchmarks, not related to this PR

The rendering correctness tests themselves (shard 3/3) passed ✅

Findings

🔵 Improvement: test_rendering_correctness.py:775-777 — Consider documenting the env_name mapping rationale
The _RENDER_CORRECTNESS_TASK_IDS list now includes env_name mappings (e.g., Isaac-Repose-Cube-Shadow-Vision-Direct-v0 maps to shadow_hand). A brief comment explaining why Shadow Hand is used for the Repose-Cube task would help future maintainers understand this is based on screen space coverage, not task similarity.

Overall this is a well-thought-out improvement that makes the tests more deterministic and the thresholds more meaningful. The code organization with the centralized threshold dictionary is cleaner than the previous inline values.

@huidongc huidongc merged commit 481d7ee into isaac-sim:develop Apr 22, 2026
28 of 31 checks passed
@huidongc huidongc deleted the rendering-correctness-test-determinism branch April 22, 2026 22:27
bdilinila pushed a commit to bdilinila/IsaacLab that referenced this pull request Apr 29, 2026
# Description

For the rendering correctness test, we don't really need the
`env.reset()` call to fill the camera output buffers. Instead, if we
remove `env.reset()`, the camera output buffers will be filled on the
invocation of camera.data:
```python
    @Property
    def data(self) -> CameraData:
        # update sensors if needed
        self._update_outdated_buffers()
        # return the data
        return self._data
```

This means we can remove `env.reset()` calls in the rendering
correctness test to avoid non-deterministic initial pose. Articulation
bodies will be at their default pose for rendering for all combos. With
this removal I can set the max pixel diff threshold to smaller
(stricter) values:
```python
    "cartpole": 1.0,      # decreased from 2.0
    "shadow_hand": 3.0,   # decreased from 8.0
    "dexsuite_kuka": 4.0, # decreased from 10.0
```

The test will become more sensitive to capture rendering changes but
hopefully it can still tolerate minor pixel noise.

Fixes # (issue)

<!-- As a practice, it is recommended to open an issue to have
discussions on the proposed pull request.
This makes it easier for the community to keep track of what is being
developed or added, and if a given feature
is demanded by more than one party. -->

## Type of change

- Test-only change


## Checklist

- [x] I have read and understood the [contribution
guidelines](https://isaac-sim.github.io/IsaacLab/main/source/refs/contributing.html)
- [x] I have run the [`pre-commit` checks](https://pre-commit.com/) with
`./isaaclab.sh --format`
- [ ] I have made corresponding changes to the documentation
- [x] My changes generate no new warnings
- [x] I have added tests that prove my fix is effective or that my
feature works
- [ ] I have updated the changelog and the corresponding version in the
extension's `config/extension.toml` file
- [ ] I have added my name to the `CONTRIBUTORS.md` or my name already
exists there

<!--
As you go through the checklist above, you can mark something as done by
putting an x character in it

For example,
- [x] I have done this task
- [ ] I have not done this task
-->
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

isaac-lab Related to Isaac Lab team

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants