Fix silent ABI mismatch in cubric Python shim #5444

Merged
kellyguo11 merged 10 commits into isaac-sim:develop from jmart-nv:jmart/cubric-abi
May 15, 2026

Conversation

@jmart-nv jmart-nv commented Apr 29, 2026

Description

Background: The _cubric.py ctypes shim was pinned to IAdapter v0.1 vtable offsets, but newer Kit builds ship v0.2 — compute calls were silently landing on unbind, disabling cubric's GPU transform hierarchy propagation. carb accepts the version mismatch with only a stderr warning.

Originally, this change updated offsets to the v0.2 layout, requested v0.2 from the framework, and added a runtime InterfaceDesc check that refused to acquire on any unexpected version.

The Kit team is fixing the ABI-breaking semver contract violation upstream, so the v0.2 layout won't actually make it into a release. Isaac Lab therefore stays pinned to v0.1 but keeps the validation code as a safety net.

This problem will go away once we have official python bindings for cubric in a future kit release.

If usdrt eventually exposes the required eRigidBody options via the IFabricHierarchy API then that would massively simplify the implementation of newton manager. Will pursue a feature request.

Type of change

  • Bug fix (non-breaking change which fixes an issue)

Checklist

  • I have read and understood the contribution guidelines
  • I have run the pre-commit checks with ./isaaclab.sh --format
  • I have made corresponding changes to the documentation (N/A)
  • My changes generate no new warnings (New warnings on ABI mismatch are intentional)
  • I have added tests that prove my fix is effective or that my feature works (N/A - spoofing kit versions for unit test would be non-trivial; verified manually)
  • I have updated the changelog and the corresponding version in the extension's config/extension.toml file
  • I have added my name to the CONTRIBUTORS.md or my name already exists there

@github-actions github-actions Bot added the isaac-lab Related to Isaac Lab team label Apr 29, 2026

@isaaclab-review-bot isaaclab-review-bot Bot left a comment

🤖 Isaac Lab Review Bot

Summary

This PR fixes a silent ABI mismatch in the cubric Python ctypes shim where v0.1 vtable offsets were being used against v0.2 Kit builds, causing compute() calls to land on unbind() instead. The fix updates the offsets, requests v0.2 from carb, and adds a runtime version verification step that refuses to proceed on mismatch. The approach is sound and addresses a real silent-failure bug.

Architecture Impact

Self-contained to _cubric.py. The module is an internal implementation detail of isaaclab_newton.physics — callers already handle the fallback path when CubricBindings.initialize() returns False. No public API changes. The new runtime check adds a safe-fail path that didn't exist before.

Implementation Verdict

Minor fixes needed — the core logic is correct, but there are a few edge cases and clarity improvements worth addressing.

Test Coverage

The PR author explicitly notes testing is N/A due to the difficulty of spoofing Kit versions. This is a reasonable judgment call for a ctypes shim that depends on native carb infrastructure. The runtime verification is effectively a "test" that runs in production. However, the lack of any unit test for the version parsing logic in _verify_iadapter_version is a minor gap — the string/offset parsing could be tested with mock memory layouts.
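As a rough illustration of the "mock memory layouts" idea, the sketch below builds a fake interface table in ordinary ctypes memory and scans it with raw address reads. The entry layout (name pointer followed by 32-bit major and minor), the `FakeInterfaceDesc` type, and the `find_version` helper are assumptions made for this example; they are not taken from `_cubric.py` or from the carb headers.

```python
import ctypes


class FakeInterfaceDesc(ctypes.Structure):
    # Assumed entry layout: name pointer, then 32-bit major and minor.
    _fields_ = [
        ("iface_name", ctypes.c_char_p),
        ("major", ctypes.c_uint32),
        ("minor", ctypes.c_uint32),
    ]


ENTRY_SIZE = ctypes.sizeof(FakeInterfaceDesc)
NAME_OFF = FakeInterfaceDesc.iface_name.offset
MAJOR_OFF = FakeInterfaceDesc.major.offset
MINOR_OFF = FakeInterfaceDesc.minor.offset


def find_version(entries_addr, count, target):
    """Scan count entries with raw reads; return (major, minor) or None."""
    for i in range(count):
        base = entries_addr + i * ENTRY_SIZE
        name_addr = ctypes.c_void_p.from_address(base + NAME_OFF).value
        if not name_addr:
            continue
        # Bounded read: never scans past len(target) bytes.
        if ctypes.string_at(name_addr, len(target)) != target:
            continue
        major = ctypes.c_uint32.from_address(base + MAJOR_OFF).value
        minor = ctypes.c_uint32.from_address(base + MINOR_OFF).value
        return major, minor
    return None


# Two-entry table in Python-owned memory; both names are at least as long
# as the target so the bounded read stays inside each string.
table = (FakeInterfaceDesc * 2)(
    FakeInterfaceDesc(b"omni::example::IOtherInterface", 1, 0),
    FakeInterfaceDesc(b"omni::cubric::IAdapter", 0, 2),
)
found = find_version(ctypes.addressof(table), len(table), b"omni::cubric::IAdapter")
print(found)
```

A test along these lines exercises only the parsing arithmetic. It says nothing about whether the assumed offsets match a real Kit build, which is the part the author describes as hard to spoof.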

CI Status

No CI checks available yet — cannot assess pass/fail status.

Findings

🟡 Warning: _cubric.py:259 — interface_count is read as u64 but PluginDesc::interfaceCount is likely size_t or uint32_t

The code reads interface_count as a full 64-bit value at offset 48:

interface_count = _read_u64(plugin_desc_ptr + _PD_OFF_INTERFACE_COUNT)

If the native struct packs interfaceCount as a 32-bit value followed by another 32-bit field, you'll read garbage into the high bits. On little-endian x86_64 this may work by accident if the next 4 bytes are zero, but it's fragile. Consider reading as c_uint64 only if you've verified the struct layout, or use c_size_t which matches platform conventions.
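The width concern can be reproduced in isolation. The sketch below uses a hypothetical two-field struct (`FakeDesc`, not the real `PluginDesc`) to show what a 64-bit read returns when the underlying field is only 32 bits wide.

```python
import ctypes
import sys


class FakeDesc(ctypes.Structure):
    # Hypothetical layout: a 32-bit count followed by another 32-bit field.
    _fields_ = [("count", ctypes.c_uint32), ("other", ctypes.c_uint32)]


desc = FakeDesc(count=3, other=7)
addr = ctypes.addressof(desc)

# Same address, two different read widths.
as_u32 = ctypes.c_uint32.from_address(addr).value
as_u64 = ctypes.c_uint64.from_address(addr).value

print(as_u32)       # the intended field
print(hex(as_u64))  # the neighbouring field leaks into the other half
```

Whether the real field is 32 or 64 bits wide is not settled in this thread; the sketch only shows why the answer matters.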

🟡 Warning: _cubric.py:263-273 — Loop doesn't handle potential out-of-bounds read if interface_count is corrupted

The loop trusts interface_count from memory. If the plugin descriptor is malformed or the offset constants are wrong, this could iterate far beyond valid memory. Consider adding a sanity cap:

for i in range(min(interface_count, 64)):  # reasonable upper bound
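One alternative to clamping with `min()` is to refuse an implausible count outright, so the loop never walks entries that may not exist. The helper below is a minimal sketch of that variant; the name `checked_interface_count` and the choice to treat zero as implausible are assumptions for illustration, and the limit of 64 is simply the reviewer's suggested bound.

```python
MAX_INTERFACES = 64  # assumed ceiling, taken from the reviewer's suggestion


def checked_interface_count(raw_count, limit=MAX_INTERFACES):
    """Return raw_count if it looks plausible, else None so the caller can fall back."""
    if 0 < raw_count <= limit:
        return raw_count
    return None


print(checked_interface_count(3))
print(checked_interface_count(0x700000003))
```

Clamping still iterates up to 64 entries of possibly unrelated memory when the count is garbage, whereas returning `None` lets the caller skip the scan and take the fallback path.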

🔵 Improvement: _cubric.py:269 — ctypes.string_at() without a max length is risky

if ctypes.string_at(name_addr) != b"omni::cubric::IAdapter":

If name_addr points to invalid/non-null-terminated memory, this will read until it finds a null byte or crashes. Consider using ctypes.string_at(name_addr, 256) with a reasonable max length, then strip nulls.
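A bounded comparison might look like the sketch below. It reads one byte more than the target so the terminating NUL is part of the comparison; `name_matches` is a hypothetical helper, and the buffers are plain Python-owned memory rather than anything returned by carb.

```python
import ctypes

TARGET = b"omni::cubric::IAdapter"


def name_matches(name_addr, target=TARGET):
    """Compare the C string at name_addr to target, reading len(target) + 1 bytes."""
    # Requiring the trailing NUL rejects longer names that merely start
    # with the target.
    return ctypes.string_at(name_addr, len(target) + 1) == target + b"\x00"


# create_string_buffer appends a NUL terminator after the given bytes.
exact = ctypes.create_string_buffer(b"omni::cubric::IAdapter")
longer = ctypes.create_string_buffer(b"omni::cubric::IAdapterExtra")

print(name_matches(ctypes.addressof(exact)))
print(name_matches(ctypes.addressof(longer)))
```

A fixed-length read bounds how far the scan can run, but it still assumes the pointer is valid for that many bytes. A pointer to a shorter string near the end of a mapping could in principle still fault.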

🔵 Improvement: _cubric.py:241 — Consider logging the actual acquired version on success

The info log says "cubric IAdapter bindings ready" but doesn't confirm which version was validated. For debugging purposes:

logger.info("cubric IAdapter v%d.%d bindings ready", _IA_EXPECTED_MAJOR, _IA_EXPECTED_MINOR)

🔵 Improvement: _cubric.py:44-54 — The v0.2 layout comment is helpful but could note what changed from v0.1

The comment documents v0.2 layout but doesn't explicitly show what v0.1 was (the diff shows bindToStageWithListener was inserted at offset 48). A one-line note like # v0.2 added bindToStageWithListener at offset 48, shifting unbind and compute would help future maintainers understand the ABI break.

🔵 Improvement: _cubric.py:246-248 — _verify_iadapter_version could benefit from a docstring explaining the "why"

The method has a one-liner docstring but the critical context — that carb's 0.x negotiation is permissive and emits only stderr warnings — is in a comment at line 58. Moving that rationale into the docstring would help future readers understand why this verification exists.

@greptile-apps
Contributor

greptile-apps Bot commented Apr 29, 2026

Greptile Summary

This PR fixes a silent ABI mismatch in the _cubric.py ctypes shim where IAdapter::compute calls were landing on unbind because the shim was pinned to v0.1 vtable offsets while newer Kit ships v0.2 (which inserts bindToStageWithListener at offset 48, shifting compute from 56 → 64). The fix updates all affected offsets, bumps the requested interface version to v0.2, and adds a _verify_iadapter_version runtime check that inspects the plugin's PluginDesc and refuses to proceed on any unexpected version.
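The offset shift described above can be modelled without any native code. The sketch below treats the interface as a table of 8-byte function-pointer slots (an assumption for a 64-bit build) and uses the offsets quoted in this thread; slot names of the form `slot_N` are placeholders for entries the thread does not name.

```python
PTR_SIZE = 8  # assumed pointer size on a 64-bit build

# v0.1 ordering reconstructed from the offsets quoted in this thread;
# "slot_N" entries are placeholders, not real method names.
V01_SLOTS = ["slot_0", "create", "slot_2", "slot_3", "release", "bind", "unbind", "compute"]


def offsets(slots):
    """Map each method name to its byte offset in the function-pointer table."""
    return {name: index * PTR_SIZE for index, name in enumerate(slots)}


def insert_before(slots, new_name, before):
    """Return a new slot list with new_name inserted ahead of an existing entry."""
    index = slots.index(before)
    return slots[:index] + [new_name] + slots[index:]


V02_SLOTS = insert_before(V01_SLOTS, "bindToStageWithListener", "unbind")

v01 = offsets(V01_SLOTS)
v02 = offsets(V02_SLOTS)
by_offset_v02 = {off: name for name, off in v02.items()}

# A caller still using the v0.1 offset for compute hits a different
# method in the v0.2 table.
stale_target = by_offset_v02[v01["compute"]]
print(v01["compute"], v02["compute"], stale_target)
```

Because the call goes through a raw offset rather than a name, nothing in the call itself signals the mismatch, which is why the failure was silent.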

  • _verify_iadapter_version iterates range(interface_count) where interface_count is read with _read_u64 from a hardcoded struct offset. If the offset assumption is ever wrong, the loop will access unmapped memory and SIGSEGV in CPython. A simple upper-bound guard (e.g. > 256) would catch a struct layout mismatch before the crash.

Confidence Score: 3/5

Safe to merge for the core bug fix, but the new verification path has a potential crash on struct layout mismatch that should be guarded.

The vtable offset correction (56→64) and version bump are clearly correct and the fallback-to-CPU strategy is safe. However, _verify_iadapter_version iterates range(interface_count) without an upper bound; a single wrong offset assumption would turn a graceful warning into a segfault. That P1 finding combined with the undocumented Framework vtable offset at 96 (P2) pulls the score below the P1 ceiling of 4.

source/isaaclab_newton/isaaclab_newton/physics/_cubric.py — specifically _verify_iadapter_version (lines 258-265)

Important Files Changed

| Filename | Overview |
| --- | --- |
| source/isaaclab_newton/isaaclab_newton/physics/_cubric.py | Core fix: updates IAdapter vtable offsets to v0.2 layout and adds _verify_iadapter_version runtime check; missing sanity bound on interface_count creates a potential crash path. |
| source/isaaclab_newton/config/extension.toml | Bumps extension version from 0.5.25 → 0.5.26 in line with the bug fix. |
| source/isaaclab_newton/docs/CHANGELOG.rst | Adds 0.5.26 changelog entry describing the ABI fix and CPU fallback behaviour. |

Sequence Diagram

sequenceDiagram
    participant C as CubricBindings.initialize()
    participant FW as carb Framework (libcarb.so)
    participant IA as omni::cubric::IAdapter

    C->>FW: acquireFramework("isaaclab.cubric", v0.0)
    FW-->>C: fw_ptr

    C->>FW: tryAcquireInterfaceWithClient(offset 24)<br/>desc={name="omni::cubric::IAdapter", v0.2}
    FW-->>C: ia_ptr (or null → CPU fallback)

    C->>C: _verify_iadapter_version(fw_ptr, ia_ptr)
    C->>FW: getInterfacePluginDesc(offset 96, ia_ptr)
    FW-->>C: plugin_desc_ptr
    C->>C: parse PluginDesc.interfaces[]<br/>find IAdapter entry, check version == (0,2)

    alt version mismatch or null
        C-->>C: log warning, return False → CPU fallback
    else version OK
        C->>IA: read vtable fn-ptrs at offsets 8,32,40,64
        C-->>C: bind _create_fn, _release_fn, _bind_fn, _compute_fn
        Note over C,IA: compute() now calls offset 64 (v0.2)<br/>previously called offset 56 (unbind in v0.2)
    end

Reviews (1): Last reviewed commit: "OMPE-91419: Fix silent ABI mismatch in c..."

Comment thread source/isaaclab_newton/isaaclab_newton/physics/_cubric.py
Comment thread source/isaaclab_newton/isaaclab_newton/physics/_cubric.py Outdated
@pbarejko pbarejko self-requested a review April 29, 2026 22:34

@isaaclab-review-bot isaaclab-review-bot Bot left a comment

🤖 Isaac Lab Review Bot

Summary

All previous findings have been addressed in this revision.

Previous Concerns Status

  • interface_count sanity cap: Added check if interface_count > 64 with appropriate warning
  • string_at length limit: Now uses ctypes.string_at(name_addr, len(target_name)) with explicit length
  • Docstring improvement: _verify_iadapter_version now has a proper docstring explaining the carb 0.x permissive negotiation rationale
  • v0.2 layout comment: Full vtable layout is now documented with clear offset annotations
  • Version logging: The info message already includes context; the warning messages now explicitly log version numbers

Implementation Verdict

Ship it — the code is defensive, well-documented, and handles the edge cases appropriately.


@isaaclab-review-bot isaaclab-review-bot Bot left a comment

🤖 Isaac Lab Review Bot

Summary

All previous findings were addressed in the prior commit. The new revision (4cfff73) contains no additional code changes beyond what was already reviewed — the diff is identical to the previously approved state.

Implementation Verdict

Ship it — the ABI mismatch fix is correct, defensive, and well-documented. The version verification logic properly guards against silent vtable miscalls.

CI Status

The "Check for Broken Links" failure appears unrelated to this PR's changes (no documentation links were modified in the cubric shim). All code-related checks (pre-commit, Build Wheel) pass.

Comment thread source/isaaclab_newton/isaaclab_newton/physics/_cubric.py

@isaaclab-review-bot isaaclab-review-bot Bot left a comment

🤖 Isaac Lab Review Bot

Summary

The new commit (71b6fbf) is identical to the previously reviewed state — no code changes since the last approval. All previous findings remain addressed.

Implementation Verdict

Ship it — the ABI mismatch fix is correct and the version verification guards against the silent vtable miscall failure mode.

CI Status

"Check for Broken Links" failure is unrelated to this PR (no docs modified). Code checks pass.


@isaaclab-review-bot isaaclab-review-bot Bot left a comment

🤖 Isaac Lab Review Bot

Summary

The new commit (0c82fe3) is identical to the previously reviewed state — no code changes since the last review. The ABI mismatch fix remains correct with proper v0.2 vtable offsets and runtime version verification.

Implementation Verdict

Ship it.

CI Status

"Check for Broken Links" failure is unrelated to this PR (documentation link checker, no docs modified in this change).

@jmart-nv jmart-nv force-pushed the jmart/cubric-abi branch from 0c82fe3 to b827264 Compare May 1, 2026 22:02
@jmart-nv
Author

jmart-nv commented May 1, 2026

kit will be un-breaking the semver minor version compatibility contract violation upstream, so the breaking ABI change will never actually make it into a kit release. Reverted our shim back to requesting v0.1, relaxed the warning to use the logger instead, but kept the validation code as a safeguard.


@isaaclab-review-bot isaaclab-review-bot Bot left a comment

🤖 Isaac Lab Review Bot

Summary

The new commit (b827264) reverts the fix back to v0.1 offsets while keeping the version verification infrastructure. This contradicts the PR description which claims to update offsets to v0.2.

Implementation Verdict

Needs rework — critical inconsistency between PR description and code.

Findings

🔴 Critical: _cubric.py:47-64 — PR description/code mismatch

The PR description states:

"Update offsets to the v0.2 layout, request v0.2 from the framework"

But the code pins to v0.1:

_IA_EXPECTED_MAJOR = 0
_IA_EXPECTED_MINOR = 1  # Should be 2 per PR description

And the vtable comment says "v0.1 layout" (line 47). If v0.2 has different offsets (which the PR title implies), these offsets are wrong and compute calls will still land on the wrong vtable slot.

Either:

  1. The PR description is wrong and this is just adding version verification for v0.1, or
  2. The code is wrong and should have _IA_EXPECTED_MINOR = 2 with updated offsets

Please clarify and fix the inconsistency.

@jmart-nv
Author

jmart-nv commented May 1, 2026

@isaaclab-review-bot updated the PR description

@jmart-nv jmart-nv force-pushed the jmart/cubric-abi branch from b827264 to da9a41b Compare May 6, 2026 21:06
jmart-nv added 6 commits May 12, 2026 13:13
The _cubric.py ctypes shim was pinned to IAdapter v0.1 vtable offsets, but newer Kit builds ship v0.2 — compute calls were silently landing on unbind, disabling cubric's GPU transform hierarchy propagation. carb accepts the version mismatch with only a stderr warning.

Update offsets to the v0.2 layout, request v0.2 from the framework, and add a runtime InterfaceDesc check that refuses to acquire on any unexpected version.
- Capped interface_count at 64 as a sanity check.
- Bound string_at read length to expected value.
- Clarified some comments.
…n higher minor version mismatch, fall back to CPU (non-cubric) path on lower minor or major version mismatch.
…to logger - kit is un-breaking the semver compatibility violation upstream.
@jmart-nv jmart-nv force-pushed the jmart/cubric-abi branch from da9a41b to 0e5ef21 Compare May 12, 2026 18:15

@isaaclab-review-bot isaaclab-review-bot Bot left a comment

🤖 Isaac Lab Review Bot

Summary

The latest commit (0e5ef21) adds a changelog fragment (jmart-cubric-abi.rst). The code in _cubric.py is unchanged since the v0.1 revert (6272ce2).

Previous Concern Resolved

The v0.1/v0.2 inconsistency raised in the prior review is now addressed by the updated PR description, which explains that the Kit team is fixing the ABI-breaking semver violation upstream, so Isaac Lab remains pinned to v0.1 while keeping the runtime version verification as defense-in-depth. This is the correct decision — the validation code will catch any future ABI shifts without unnecessarily targeting a version that won't ship.

Final Review

Changelog (new file): Well-written, concise, accurately describes the change as adding runtime verification with CPU path fallback.

Code (_cubric.py):

  • ✅ Version constants pinned to v0.1 (correct per updated scope)
  • ✅ _verify_iadapter_version properly validates via getInterfacePluginDesc
  • ✅ Interface count sanity-capped at 64
  • ✅ string_at bounded with explicit length
  • ✅ Graceful fallback to update_world_xforms() on any mismatch
  • ✅ Warning on newer minor version (semver-compatible, but flags potential vtable drift)
  • ✅ All null-pointer paths handled with appropriate logging
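The version policy summarized in this list can be read as a small decision function. The sketch below is one interpretation of the thread, assuming a newer minor version proceeds with a warning while an older minor or a different major falls back to the CPU path; the `Decision` enum, `decide` function, and logger name are hypothetical and do not come from `_cubric.py`.

```python
import logging
from enum import Enum

logger = logging.getLogger("cubric_sketch")  # hypothetical logger name

EXPECTED = (0, 1)  # the pinned IAdapter version per this thread


class Decision(Enum):
    USE_CUBRIC = "use_cubric"
    USE_CUBRIC_WITH_WARNING = "use_cubric_with_warning"
    CPU_FALLBACK = "cpu_fallback"


def decide(found, expected=EXPECTED):
    """Classify a reported (major, minor) pair against the pinned version."""
    if found is None:
        # Version could not be determined (null pointer, implausible count).
        return Decision.CPU_FALLBACK
    major, minor = found
    exp_major, exp_minor = expected
    if major != exp_major or minor < exp_minor:
        logger.warning("IAdapter v%d.%d unsupported, expected v%d.%d",
                       major, minor, exp_major, exp_minor)
        return Decision.CPU_FALLBACK
    if minor > exp_minor:
        logger.warning("IAdapter v%d.%d is newer than pinned v%d.%d",
                       major, minor, exp_major, exp_minor)
        return Decision.USE_CUBRIC_WITH_WARNING
    return Decision.USE_CUBRIC


print(decide((0, 1)).value, decide((0, 2)).value, decide((1, 0)).value)
```

Keeping the decision separate from the memory reads makes this part easy to unit test, even though the reads themselves depend on a live Kit process.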

Implementation Verdict

Ship it. The fix is defensive, well-documented, and correctly scoped. The validation infrastructure will serve as an early-warning system against future ABI breaks without introducing unnecessary risk.


Update (0b790cd): Merge commit from develop. The only change to _cubric.py is a minor cleanup: the interface acquisition now uses b"isaaclab.cubric" as the client name instead of trying multiple strategies with carb.scripting-python.plugin and None. This is a reasonable simplification — the previous code was defensive but verbose. All other files in this commit are unrelated changes from the develop branch merge. ✅ Ship it.

@kellyguo11 kellyguo11 moved this to Ready to merge in Isaac Lab May 15, 2026
@kellyguo11 kellyguo11 changed the title OMPE-91419: Fix silent ABI mismatch in cubric Python shim Fix silent ABI mismatch in cubric Python shim May 15, 2026
@kellyguo11 kellyguo11 merged commit 21745e2 into isaac-sim:develop May 15, 2026
33 of 34 checks passed
@github-project-automation github-project-automation Bot moved this from Ready to merge to Done in Isaac Lab May 15, 2026
@isaaclab-review-bot isaaclab-review-bot Bot mentioned this pull request May 16, 2026
7 tasks
matthewtrepte pushed a commit to matthewtrepte/IsaacLab that referenced this pull request May 18, 2026
# Description

Background: The _cubric.py ctypes shim was pinned to IAdapter v0.1
vtable offsets, but newer Kit builds ship v0.2 — compute calls were
silently landing on unbind, disabling cubric's GPU transform hierarchy
propagation. carb accepts the version mismatch with only a stderr
warning.

Originally, this change updated offsets to the v0.2 layout, requested
v0.2 from the framework, and added a runtime InterfaceDesc check that
refused to acquire on any unexpected version.

The kit team is fixing the ABI-breaking semver contract violation
upstream, so it won't actually make it into a release. So the pinned
version in Isaac Lab remains on v0.1 but keeps the validation code as a
safety net.

This problem will go away once we have official python bindings for
cubric in a future kit release.

If usdrt eventually exposes the required `eRigidBody` options via the
`IFabricHierarchy` API then that would massively simplify the
implementation of newton manager. Will pursue a feature request.

## Type of change

- Bug fix (non-breaking change which fixes an issue)

## Checklist

- [x] I have read and understood the [contribution
guidelines](https://isaac-sim.github.io/IsaacLab/main/source/refs/contributing.html)
- [x] I have run the [`pre-commit` checks](https://pre-commit.com/) with
`./isaaclab.sh --format`
- [ ] I have made corresponding changes to the documentation *(N/A)*
- [ ] My changes generate no new warnings *(New warnings on ABI mismatch
are intentional)*
- [ ] I have added tests that prove my fix is effective or that my
feature works *(N/A - spoofing kit versions for unit test would be
non-trivial; verified manually)*
- [x] I have updated the changelog and the corresponding version in the
extension's `config/extension.toml` file
- [x] I have added my name to the `CONTRIBUTORS.md` or my name already
exists there

---------

Co-authored-by: Kelly Guo <kellyg@nvidia.com>

Labels

isaac-lab Related to Isaac Lab team

Projects

Status: Done


4 participants