[Newton] Rename per-env labels in physics replication (depends on #5523)#5433
Conversation
ea456bc to
500ab8a
Compare
500ab8a to
99f7d7b
Compare
There was a problem hiding this comment.
🤖 Isaac Lab Review Bot
Summary
This PR fixes a MuJoCo tendon duplication bug in Newton physics replication by correctly separating schema resolvers between the main builder and proto builders. The fix excludes SchemaResolverMjc from the main builder (which loads only scene-level prims) and includes it only in proto builders where register_custom_attributes is called first. The implementation is correct and well-documented, with comprehensive unit tests.
Architecture Impact
Self-contained within isaaclab_newton module. The changes affect only _build_newton_builder_from_mapping() which is called by newton_physics_replicate() and newton_visualizer_prebuild(). The fix aligns IsaacLab's cloner with Newton's expected N×T tendon entry design where SolverMuJoCo filters by tendon_world == template_world. No downstream callers need modification since the function signature is unchanged.
Implementation Verdict
Ship it — The fix is architecturally sound and correctly addresses the root cause.
Test Coverage
The PR adds 8 unit tests that cover the key invariants:
- Main builder has no MJC custom frequencies (3 tests)
- Proto builder has MJC custom frequencies registered (3 tests)
SchemaResolverMjcvalidation behavior (4 tests)- Tendon propagation with correct world indices (6 tests)
However, the tests use synthetic tendon injection rather than actually exercising the _build_newton_builder_from_mapping() function. This is acceptable given the Newton/USD dependencies would require integration tests, but there's no direct regression test for the actual fixed code path.
CI Status
No CI checks available yet — cannot verify tests pass in CI.
Findings
🔵 Improvement: source/isaaclab_newton/changelog.d/5433.rst:31-33 — Missing newline before final paragraph
The changelog file lacks a blank line before the final paragraph "The resulting N×T tendon entries..." which may cause RST rendering issues. The ^^^^^ section header style is correct but the final two lines run together with the preceding paragraph.
🔵 Improvement: source/isaaclab_newton/test/cloner/test_mjc_tendon_cloner.py:1-300 — Filename mismatch with PR description
The test file is named test_mjc_tendon_cloner.py but the PR description references test_tendon_deduplication.py. This is a documentation inconsistency, not a bug, but the PR description should be updated to match the actual filename.
🔵 Improvement: source/isaaclab_newton/test/cloner/test_mjc_tendon_cloner.py:97-98 — Accessing private Newton API
The test helper _inject_tendon_entries() accesses builder._custom_frequency_counts which is a private attribute (underscore prefix). While necessary for testing, this creates a coupling to Newton's internal implementation that may break if Newton refactors. Consider adding a comment noting this dependency.
# NOTE: Accessing private _custom_frequency_counts; may break if Newton refactors
builder._custom_frequency_counts[_TENDON_FREQ] = ...🟡 Warning: source/isaaclab_newton/isaaclab_newton/cloner/newton_replicate.py:69-73 — Proto resolver list created once, instances shared across all protos
_proto_resolvers = [SchemaResolverMjc(), SchemaResolverNewton(), SchemaResolverPhysx()]The resolver instances are created once and passed to every p.add_usd() call in the loop. If any resolver maintains mutable state that persists across calls, this could cause cross-contamination between protos. Verify Newton's resolvers are stateless, or move instantiation inside the loop:
for src_path in sources:
p = NewtonManager.create_builder(up_axis=up_axis)
solvers.SolverMuJoCo.register_custom_attributes(p)
p.add_usd(
stage,
root_path=src_path,
load_visual_shapes=True,
skip_mesh_approximation=True,
schema_resolvers=[SchemaResolverMjc(), SchemaResolverNewton(), SchemaResolverPhysx()],
)🔵 Improvement: source/isaaclab_newton/changelog.d/5433.rst — PR description mentions version bump to 0.5.16 not reflected in changed files
The PR description states "Bumps isaaclab_newton to 0.5.16" but no version file changes are included in the diff. Either the version bump should be added, or the PR description should be corrected.
🔵 Improvement: source/isaaclab_newton/test/cloner/test_mjc_tendon_cloner.py:233 — Test doesn't exercise actual fix code path
The tests verify Newton builder behavior in isolation but don't test that _build_newton_builder_from_mapping() correctly uses separate resolver lists. A minimal integration test (even if skipped in CI due to dependencies) that calls the fixed function would provide stronger regression coverage.
There was a problem hiding this comment.
🤖 Isaac Lab Review Bot
Summary
Follow-up review on commit fce4ebdf. The diff shows the same implementation as the previous review - no new changes since the last "Ship it" verdict. The dependency bumps (Newton v1.2.0rc1, warp-lang>=1.13.0, mujoco 3.8.0), the _rename_builder_labels fix with boundary-terminated prefix matching, the wp.transform_to_matrix API update, and the comprehensive test suite are all unchanged.
CI Status
Most jobs still pending. Pre-commit passed ✅. The "Check changelog fragments" failure needs attention - verify the changelog fragment format meets project requirements (the .rst extension and location look correct, but the CI may have specific validation rules).
Implementation Verdict
Ship it — Code is ready. Address the changelog fragment CI failure (likely a formatting or validation issue) and merge once remaining CI jobs pass.
fce4ebd to
6b09354
Compare
There was a problem hiding this comment.
🤖 Isaac Lab Review Bot
Summary
Follow-up review on commit 6b093544. No code changes since the previous "Ship it" verdict - the diff is identical. The changelog fragment CI failure from the previous review remains and needs to be addressed.
CI Status
The "Check changelog fragments" job is still failing ❌. The fragment file exists at source/isaaclab_newton/changelog.d/jichuanh-newton-replicate-tendon-fix.rst but may have a formatting issue - the file appears to be missing a trailing newline (the last line cross-contaminate during the rename. has no newline after it based on the diff showing no final newline marker).
Implementation Verdict
Ship it — Fix the changelog fragment (likely just needs a trailing newline) and merge once CI passes.
089dad9 to
2ab361e
Compare
There was a problem hiding this comment.
🤖 Isaac Lab Review Bot
Summary
Follow-up review on commit 2ab361e6. The code changes remain identical to the previous "Ship it" verdict. The changelog fragment file now includes proper formatting based on the full file content shown (ends with proper newline after the last sentence).
CI Status
The "Check changelog fragments" job is still showing as failed ❌, but this may be a stale status or a different validation issue (perhaps the .rst extension on a Python-comment-style file, or scriv configuration expecting a different format). The fragment content itself looks correct.
Implementation Verdict
Ship it — The code is correct. Investigate the changelog CI failure separately (may be a tooling config issue rather than content issue).
4047e42 to
14f21bf
Compare
|
@AntoineRichard @StafaH could you help review this one? |
6c088fa to
45dec31
Compare
After Newton's add_builder copies proto rows into the merged builder
for each environment, every row's label still references the source
proto path. Newton tracks env identity in *_world int companion
columns; IsaacLab keys data flow off USD prim paths, so each row's
label needs to be rewritten to its per-env destination path.
_rename_builder_labels now walks two kinds of label-bearing columns:
* Built-in label arrays (body, joint, shape, articulation,
constraint_mimic, equality_constraint), each paired with its
matching *_world int column.
* String-typed custom-attribute columns (e.g. mujoco:tendon_label)
paired with a references="world" companion at the same frequency.
Any future solver-registered string column at a frequency that has
a references="world" companion is handled automatically.
The prefix match uses a path-separator boundary
(src_path.rstrip("/") + "/") so a source path that is a string
prefix of another (/Sources/protoA vs /Sources/protoAB) does not
cross-contaminate when both feed the same envs.
The function raises ValueError if the parallel arrays of labels and
worlds have mismatched lengths, instead of silently truncating.
Adds test_rename_builder_labels.py with 10 cases covering both
passes, multi-source prefix-overlap regression, sparse env ids,
multi-frequency, multiple string columns at one frequency, and
empty-values pass-through.
Stacks on top of isaac-sim#5523 (Newton v1.2.0rc2 bump). The Newton 1.2
release also includes upstream newton-physics/newton#2659 which
scopes parse_usd's custom-frequency walk natively, removing the
cross-source MjcTendon contamination IsaacLab previously had to
work around at the framework layer.
45dec31 to
9902c18
Compare
There was a problem hiding this comment.
🤖 Isaac Lab Review Bot
Summary
Fixes per-environment string identifiers (e.g., mujoco:tendon_label) retaining their source proto path after physics replication. The fix extends _rename_builder_labels with a generalized Pass 2 that walks any string-typed custom-attribute column paired with a references="world" companion. Also adds constraint_mimic and equality_constraint to the built-in pass, and fixes a prefix-boundary bug preventing /Sources/protoA from incorrectly matching /Sources/protoAB.
Design Assessment
Design is sound. The two-pass architecture (Pass 1: built-in label arrays, Pass 2: custom-attribute string columns indexed by frequency) is clean, O(n) in attribute count, and automatically handles future solver-registered string columns without code changes. The extracted _rename_pair helper with a length-mismatch guard is a good defensive pattern. Tests are comprehensive (279 lines covering 10 scenarios including the critical prefix-overlap regression).
Findings
No critical or warning-level issues found. The code is correct and well-tested.
🔵 Suggestion: Empty-values guard could distinguish None from [] — source/isaaclab_newton/isaaclab_newton/cloner/newton_replicate.py
The guard if not values or not worlds: continue treats both None (never populated) and [] (populated but empty) the same way. If values is [] but worlds is [0, 1, 2] (an asymmetric corruption), the early return prevents _rename_pair's length-mismatch check from catching it. Consider: if values is None or worlds is None: continue to let the length guard handle the asymmetric case. Low practical risk since Newton's add_builder maintains lockstep invariants.
Test Coverage
- Bug fix regression test: ✅
test_prefix_overlap_does_not_cross_contaminatewould fail without the boundary fix - Pass 1 (built-ins): ✅ All 5 types tested with world-scoped rewrites
- Pass 2 (custom attributes): ✅ Tendon labels, multiple frequencies, multiple columns per frequency, empty columns
- Guard conditions: ✅ Non-path strings, unknown worlds, sparse env_ids
- Coverage adequate: Yes
CI Status
- Build Wheel: ✅ pass
- pre-commit: ✅ pass
- Most other checks: ⏳ pending
Verdict
Ship it ✅
Clean, correct, well-tested bug fix with thoughtful generalization for future columns. The prefix-boundary improvement and extracted helper function improve both correctness and readability.
Update (3fcefb2): ✅ Improvements look good. Extracted _BUILTIN_LABEL_TYPES constant reduces code duplication, tests now import it directly. Refactored rename logic using removeprefix is cleaner and more readable. Added edge-case tests for exact-root matches, trailing slash canonicalization, and large world IDs. LGTM.
Update (d76a21a): Incremental diff contains 231 changed files from what appears to be a rebase/merge (RSL-RL configs, task environments, visualizers, teleop, docs, etc.). The core files for this PR (newton_replicate.py, test_mjc_tendon_cloner.py) are unchanged from the previously-reviewed commit. Original review findings and verdict still apply.
Update (e5a2af2): ✅ Good refinement. Replaced removeprefix with explicit startswith check + slicing in _rename_pair. This correctly prevents labels already rewritten in an earlier source-iteration from being incorrectly processed by subsequent iterations (since removeprefix conflates "no match" with "match with empty suffix"). Clearer comments documenting the boundary-bleed sibling handling. LGTM.
Update (9b78b9d): Incremental diff contains 30 changed files from a rebase incorporating various fixes from the base branch: degenerate-pose handling in create_rotation_matrix_from_view/quat_from_matrix, camera sensor safety improvements, AppLauncher CUDA startup fix, PVA visualizer zero-acceleration fix, and LEAPP export script refactoring. The core PR files (newton_replicate.py, test_mjc_tendon_cloner.py) are unchanged from the previously-reviewed commit. Original review findings and verdict still apply.
* newton_replicate.py: use ``is None`` instead of ``or`` when picking between ``*_label`` and ``*_key`` so an empty-but-defined ``*_label`` list is recognized (the ``or`` fallback over-matched it as falsy). * test_rename_builder_labels.py: add ``equality_constraint`` to the Pass 1 fixture, both per-type loops, and the module docstring so the test surface matches the impl's tuple at line 167.
|
@greptileai Review |
…#5437) ## Summary Adds Newton backend support for \`Isaac-Shadow-Hand-Over-Direct-v0\` (multi-agent MAPPO/IPPO). Selectable via \`--preset newton\` / Hydra preset resolution; PhysX path unchanged. The headline calibration finding: Newton needs \`ImplicitActuatorCfg(stiffness=20.0, damping=2.0)\` on the Shadow Hand fingers — vs PhysX's \`1.0/0.1\` on fingers and \`5.0/0.5\` on wrists — because PhysX layers \`fixed_tendons_props(limit_stiffness=30, damping=0.1)\` and runs \`solver_position_iteration_count=8\` per substep. Both amplify the effective torque per unit nominal gain. Newton's MJWarp implicit-PD path has neither, so larger nominal gains are needed for comparable joint authority. With the bump, MAPPO mean reward at iter 200 / 2048 envs goes from ~27 (no catch learned) to ~777, vs the PhysX baseline of ~247. ## Stack Builds on top of: - **#5433** [Newton] Rename per-env labels in physics replication — required for Shadow Hand fixed tendons to parse correctly under Newton. After #5433 merges, this PR's diff vs develop drops to the 3 shadow-hand-specific files. ## What this PR does ### Newton port wiring (\`shadow_hand_over_env_cfg.py\`) The Newton variant of the Shadow Hand articulation is built as a **delta of the single-agent \`ShadowHandRobotCfg.newton_mjwarp\`** (cross-task import from \`direct/shadow_hand\`), parameterized per-robot \`prim_path\` / \`init_pos\` / \`init_rot\`. Reuses the single-agent's USD path, \`rot\` reapplication workaround, effort limits, and joint regex. Two \`ImplicitActuatorCfg\` overrides on top of the single-agent cfg: * **\`fingers\`** (wrist + per-finger joints): \`stiffness=20.0\` / \`damping=2.0\`. The catch-task gain calibration fix. * **\`distal_passive\`** (the four \`robot0_(FF|MF|RF|LF)J0\` joints): \`stiffness=10.0\` / \`damping=0.1\`. The Newton USD bakes \`stiffness=286 / damping=57\` on these joints from the MJCF→USD translation (which fights the \`MjcTendon\` coupling and bounces the ball). \`stiffness=10\` keeps the joints near-passive while the tendon constraint dominates. PhysX uses tendon coupling on these joints directly and does not need an analogous override. \`PresetCfg\` subclasses follow the established \`physx\`/\`newton_mjwarp\`/\`default=physx\` pattern — same shape as the single-agent Shadow Hand port already on develop. Newton \`ObjectCfg\` drops PhysX-only \`rigid_props\` knobs (per-shape solver iterations, sleep thresholds, max depenetration velocity, custom physics material). Newton scene cloning sets \`clone_in_fabric=False\`. ### Backend portability fix (\`shadow_hand_over_env.py\`) One line: \`self.right_hand.root_view.get_dof_limits()\` → \`self.right_hand.data.joint_limits\`. \`root_view\` is PhysX-only; \`data.joint_limits\` is the backend-portable accessor available on both PhysX and Newton articulations. ### Drift alignment with develop * Newton-preset slot name \`newton_mjwarp\` (matches develop's current convention). * \`PhysxCfg(bounce_threshold_velocity=0.2, gpu_max_rigid_contact_count=2**23, gpu_max_rigid_patch_count=2**23)\` — matches single-agent Shadow Hand's contact-buffer sizing for 2048-env scale. ## Numbers (200 iter / 2048 envs / seed 42 / MAPPO) Captured from tfevents in \`~/workspaces/IsaacLab/logs/skrl/shadow_hand_over/\`: | Setting | Stiffness | Damping | Reward (mean) | Catch learned? | |---|---:|---:|---:|:---:| | PhysX baseline | 1.0 (5.0 wrist) | 0.1 (0.5 wrist) + tendon=30/0.1 | **246.7** | yes | | Newton, develop default | 1.0 | 0.1 | 23.4 | **no** | | Newton, pinned default | 1.0 | 0.1 | 27.7 | **no** | | Newton, h1 probe | 50.0 | 5.0 | 617.8 | yes | | **Newton, this PR** | **20.0** | **2.0** | **777.1** | yes | 20/2 was chosen because it stays closer to the nominal effort-limit budget while still providing enough control authority. Anything in roughly \`[10, 50]\` works. ## Test plan - [x] PhysX 200-iter baseline at 2048 envs (mean 246.7). - [x] Newton 200-iter at stiffness 20/2 — mean 777.1, catch learned. - [x] Newton 200-iter at stiffness 50/5 cross-check — mean 617.8, catch learned. - [x] \`./isaaclab.sh -f\` clean (pre-commit hooks). - [x] Changelog fragment under \`source/isaaclab_tasks/changelog.d/jichuanh-shadow-hand-newton-parity.minor.rst\` (CI-driven version bump on merge; no per-PR \`extension.toml\` edit). ## Out of scope (follow-ups) - **\`EventCfg\` not wired into the env class.** The multi-agent's \`EventCfg\` is defined but never referenced by \`ShadowHandOverEnvCfg.events\` — single-agent's pattern (\`events: ShadowHandEventCfg = ShadowHandEventCfg()\` with PhysX/Newton variants) hasn't been ported. Independent feature add; not a parity blocker. - **Migrating Shadow Hand Newton USD to the \`mujoco-usd-converter\`-produced asset in \`newton-physics/newton-assets/shadow_hand/usd_structured/\`.** That asset uses \`MjcActuator + MjcTendon\` natively (no baked stiffness=286 problem). Switching would let the \`distal_passive\` override be deleted. Requires Nucleus/S3 asset migration + matching the right-hand asset (newton-assets has only left); separate work. - **Behavior-level parity** beyond shaped reward (catch rate, drop rate, ball-trajectory smoothness) is left for a follow-up evaluation.
StafaH
left a comment
There was a problem hiding this comment.
LGTM, just left minor comments
| # Use ``is None`` (not ``or``) so an empty-but-defined ``*_label`` column | ||
| # is recognized — falling through to ``*_key`` would over-match a | ||
| # builder that legitimately exposes both attributes. | ||
| for t in ("body", "joint", "shape", "articulation", "constraint_mimic", "equality_constraint"): |
There was a problem hiding this comment.
Let's pull these out as a constant _BUILTIN_LABEL_TYPES? That way they can be re-used especially for tests
| _DST = "/World/envs/env_{}" | ||
|
|
||
|
|
||
| # ─── helpers ───────────────────────────────────────────────────────────────── |
There was a problem hiding this comment.
Let's remove the excessive comment decoration
| return b | ||
|
|
||
|
|
||
| # ─── tests ─────────────────────────────────────────────────────────────────── |
There was a problem hiding this comment.
Same as above and elsewhere
| for i, src_path in enumerate(sources): | ||
| src_prefix_len = len(src_path.rstrip("/")) | ||
| # Boundary-terminated prefix prevents over-matching when one source path is a | ||
| # prefix of another (e.g. ``/Sources/protoA`` vs ``/Sources/protoAB``). |
There was a problem hiding this comment.
What about the behavior for when the label matches exactly the source root? The old code used to cover that case but the new code does not
* newton_replicate.py:
- Hoist the built-in label types as the module-level
``_BUILTIN_LABEL_TYPES`` constant so tests can import it instead of
duplicating the tuple.
- Replace the ``src_prefix`` / ``src_prefix_len`` / ``swap`` plumbing
with a single ``removeprefix`` step inside ``_rename_pair``. The
new form accepts both an exact source-root match (label value
equals ``src_path``) and a child path (``src_path + "/..."``),
while still rejecting boundary-bleed siblings like
``/Sources/protoAB/x`` when ``src_path`` is ``/Sources/protoA``.
Fixes a regression where a label whose value equals the source
root was silently dropped after the trailing-slash boundary fix.
* test_rename_builder_labels.py:
- Import ``_BUILTIN_LABEL_TYPES`` and collapse the three duplicated
inline tuples.
- Drop the box-drawing banner comments (``# ─── helpers ───``,
``# ─── tests ───``) and the in-class ``# Pass 1`` / ``# Pass 2`` /
``# Cross-pass consistency`` / ``# Guards`` separators — function
and class names already partition the file.
- Add ``test_label_equal_to_source_root_is_rewritten`` (the
exact-match regression).
- Add ``test_trailing_slash_on_source_path_is_canonicalized`` (a
``sources=["/X/"]`` argument must rewrite identically to ``"/X"``).
- Add ``test_large_world_ids`` ([0, 1_000_000, 2_147_000_000])
documenting that dispatch is dict-keyed (constant cost) and the
only hard cap is Newton's int32 ``*_world`` column type.
# Conflicts: # source/isaaclab_newton/isaaclab_newton/cloner/newton_replicate.py # source/isaaclab_newton/test/cloner/test_rename_builder_labels.py
The previous ``removeprefix`` form had a hidden trap: when the source
root is not a prefix of the value, ``str.removeprefix`` returns the
*original string unchanged*. For an already-rewritten label like
``/World/envs/env_0/body``, that returned string still starts with
``/``, so the boundary check ``suffix and not suffix.startswith("/")``
treated it as a legitimate child and re-prepended the dst root, yielding
``/World/envs/env_0/World/envs/env_0/body`` on the second outer-loop
iteration.
Gate on ``startswith(src_root)`` before slicing so "no prefix at all"
exits cleanly. Boundary-bleed siblings (e.g. ``/Sources/protoAB/x``
when ``src_root`` is ``/Sources/protoA``) are still rejected by the
slash-boundary check on the resulting suffix.
Caught by ``TestRenameMultiSource::test_prefix_overlap_does_not_cross_contaminate``.
Summary
Extends
_rename_builder_labelsinisaaclab_newton.cloner.newton_replicateso that every label-bearing column on the merged NewtonModelBuilderis rewritten to per-env USD paths after replication. Previously, only the built-in body/joint/shape/articulation columns were rewritten; tendon labels (and any other string-typed custom-attribute column) kept the source proto path on every replicated environment.Stack / dependencies
The Newton 1.2 release (newton-physics/newton#2659) also includes the upstream tendon-scoping fix that obsoletes the IsaacLab-side `_scope_custom_frequencies` workaround a previous version of this PR carried — that workaround has been removed in favor of relying on the Newton bump in #5523.
Why the rename is needed
Newton's `add_builder` copies each proto's bodies, joints, shapes, articulations, etc. into the merged builder verbatim, and tags each row with a `_world` integer column to track env identity. Labels (path strings) are copied as-is. So after cloning N environments from one proto, the merged builder has N copies of every row, all with the same proto-path string label, distinguished only by the integer `_world` column.
IsaacLab keys most of its data flow off USD prim paths (sensor binding, event-term scope, visualization, logging). It needs labels to be unique per-env paths so a body called `/World/envs/env_3/Robot/Forearm` is reachable by path lookup. The rename function is the bridge: it walks every label-bearing column post-replication and rewrites the source-root prefix to the per-env destination root using each row's `*_world` value.
Until this PR, the rename only walked 5 built-in label arrays. Tendon labels and any string-typed custom-attribute column were missed, so e.g. `mujoco:tendon_label` showed `/World/envs/env_0/...` for every env — surfaced on Shadow Hand fixed tendons.
What this PR changes
`_rename_builder_labels` extension
Tests
New `source/isaaclab_newton/test/cloner/test_rename_builder_labels.py` with 10 cases covering:
Test plan