Adds inhand manipulation warp env by hujc7 · Pull Request #4812 · isaac-sim/IsaacLab

hujc7 · 2026-03-04T10:22:05Z

Summary

isaaclab_experimental

Add DirectRLEnvWarp base class with CUDA graph capture support
Add InteractiveSceneWarp — extends InteractiveScene with warp-native env_mask for selective resets without modifying the base class
Add WarpGraphCache utility — centralizes warp CUDA graph capture-or-replay pattern into a single capture_or_replay(name, fn) call
Add Timer utility with DEBUG_TIMER_STEP / DEBUG_TIMERS env var toggles
Add spaces utilities for warp-native observation/action spaces

isaaclab_newton

Update Newton dependency to 1.0.0rc3

isaaclab_rl

Add DirectRLEnvWarp to RslRlVecEnvWrapper isinstance check

isaaclab_tasks_experimental

Add inhand manipulation warp environment (Allegro hand + cube repose)
Allegro cfg aligned with stable newton preset (decimation=4, dt=1/120, njmax=80, nconmax=70)
Fix object velocity observation ordering to match torch reference (linear first, then angular)

scripts

Import isaaclab_tasks_experimental in train.py

Notes

This PR is intentionally based on develop at 5002c0ccb4a (before #4818 "Brings Newton assets integration tests"), which introduced a body_inertia reshape incompatible with Newton RC3 (Reshaping non-contiguous arrays is unsupported).

Known Problems / TODOs

Actuator pipeline not yet warp-native: _apply_actuator_model still uses torch. Runs outside CUDA graph capture scope as a workaround.
njmax/nconmax tuning: reverted to stable cfg defaults

Test plan

CUDA graph capture succeeds
Training runs without crashes (4096 envs, 300 iters)
Full training: ~73ms/step, ~43k steps/s, reward ~100+
Newton RC3 compatibility verified

hujc7 · 2026-03-05T07:57:04Z

@greptileai review

hujc7 · 2026-03-05T08:03:02Z

@AntoineRichard I am wondering if this might be related to the not training issue you are fixing, njmax and nconmax update compared to dev/newton seems to be the correct fix here, but it hits performance really badly.

greptile-apps · 2026-03-05T08:04:27Z

Greptile Summary

This PR introduces a Warp-native inhand manipulation environment (InHandManipulationWarpEnv) with CUDA graph capture support, backed by new DirectRLEnvWarp and InteractiveSceneWarp base classes, a WarpGraphCache utility, and a Timer utility — all under new isaaclab_experimental and isaaclab_tasks_experimental packages. Previously-raised issues regarding optional import guarding, find_packages() discovery, obs_type validation, velocity observation ordering, and None-mask handling have all been addressed.

One remaining concern:

InteractiveSceneWarp.reset partial-mask inconsistency: Deformable objects and surface grippers are reset with env_ids=None (all envs) even when a partial env_mask is supplied, which will silently over-reset environments in any future scene that uses those asset types.

Confidence Score: 4/5

Safe to merge for the current inhand manipulation task; future scene extensions using deformable objects or surface grippers may experience unintended full-reset behavior when partial masks are used.
Training from scratch works as confirmed by the test plan (reward ~100+, 43k steps/s, 300 iter runs without crashes). The core Warp environment, CUDA graph capture, and integration with RSL-RL are all functioning correctly. One remaining issue: InteractiveSceneWarp.reset() has inconsistent mask handling for deformable objects and surface grippers, which is a correctness concern for future subclasses but not a blocker for the current task that uses only articulations and rigid objects.
source/isaaclab_experimental/isaaclab_experimental/envs/interactive_scene_warp.py (deformable object and surface gripper reset mask handling).

_{Last reviewed commit: 0949f42}

greptile-apps · 2026-03-05T08:04:31Z

 logger = logging.getLogger(__name__)

 # PLACEHOLDER: Extension template (do not remove this comment)
+import isaaclab_tasks_experimental  # noqa: F401


Unconditional import breaks existing training workflows. The module-level import isaaclab_tasks_experimental will cause an ImportError for any user who does not have the experimental package installed, breaking all existing training scripts that use rsl_rl/train.py for unrelated tasks.

Wrap the import in a try/except (following the "Extension template" pattern referenced in the comment above):

Suggested change

import isaaclab_tasks_experimental # noqa: F401

try:

import isaaclab_tasks_experimental # noqa: F401

except ImportError:

pass

greptile-apps · 2026-03-05T08:04:32Z

+        from isaaclab_experimental.envs import DirectRLEnvWarp
+
        from isaaclab.envs import DirectRLEnv, ManagerBasedEnv, ManagerBasedRLEnv


Hard import of optional package breaks all RslRlVecEnvWrapper usage. The import from isaaclab_experimental.envs import DirectRLEnvWarp is placed inside __init__ with no try/except guard. Every call to RslRlVecEnvWrapper(...) will fail with ImportError if isaaclab_experimental is not installed, which is a breaking change for all existing RSL-RL users.

Guard the import to keep the package optional:

Suggested change

from isaaclab_experimental.envs import DirectRLEnvWarp

from isaaclab.envs import DirectRLEnv, ManagerBasedEnv, ManagerBasedRLEnv

try:

from isaaclab_experimental.envs import DirectRLEnvWarp as _DirectRLEnvWarp

_warp_types = (_DirectRLEnvWarp,)

except ImportError:

_warp_types = ()

if not isinstance(env.unwrapped, (ManagerBasedRLEnv, ManagerBasedEnv, DirectRLEnv, *_warp_types)):

raise ValueError(

"The environment must be inherited from ManagerBasedRLEnv or DirectRLEnv. Environment type:"

f" {type(env)}"

)

greptile-apps · 2026-03-05T08:04:33Z

+    python_requires=">=3.10",
+    install_requires=INSTALL_REQUIRES,
+    dependency_links=PYTORCH_INDEX_URL,
+    packages=["isaaclab_experimental"],


Sub-packages excluded from installation — all imports will fail after pip install. The packages list only includes the top-level isaaclab_experimental package, but does not include the sub-packages isaaclab_experimental.envs and isaaclab_experimental.envs.utils. After a normal pip install ., the import from isaaclab_experimental.envs import DirectRLEnvWarp will raise ModuleNotFoundError.

Use find_packages() to automatically discover all sub-packages:

Suggested change

packages=["isaaclab_experimental"],

from setuptools import find_packages, setup

setup(

# ... existing fields ...

packages=find_packages(),

# ... rest of setup ...

)

greptile-apps · 2026-03-05T08:04:34Z

+        wp.launch(
+            zero_mask_int32,
+            dim=self.num_envs,
+            inputs=[
+                mask,
+                self._episode_length_buf_wp,
+            ],
+        )


_reset_idx(None) will crash the zero_mask_int32 kernel. The method accepts mask: wp.array | torch.Tensor | None = None, but the wp.launch() call at line 707 passes mask directly to the kernel without a guard for None. The kernel will immediately attempt to read mask[env_index], which will fail if mask is None.

Although InHandManipulationWarpEnv._reset_idx adds a None → _ALL_ENV_MASK guard before calling super()._reset_idx(), the base-class signature implies None should be handled at the base level. Future subclasses that call super()._reset_idx() without this guard will crash silently.

Add the guard in the base class:

Suggested change

wp.launch(

zero_mask_int32,

dim=self.num_envs,

inputs=[

mask,

self._episode_length_buf_wp,

],

)

def _reset_idx(self, mask: wp.array | torch.Tensor | None = None):

"""Reset environments based on specified indices.

Args:

mask: A warp array or torch tensor indicating which environments to reset.

"""

if mask is None:

mask = self._ALL_ENV_MASK

self.scene.reset(env_ids=None, env_mask=mask)

# ... rest of method ...

Line 712 has if mask is None: mask = self._ALL_ENV_MASK — None never reaches the kernel.

greptile-apps · 2026-03-05T08:04:35Z

+    def _pre_physics_step(self, actions: torch.Tensor):
+        """Pre-process actions before stepping through the physics.
+
+        This function is responsible for pre-processing the actions before stepping through the physics.
+        It is called before the physics stepping (which is decimated).
+
+        Args:
+            actions: The actions to apply on the environment. Shape is (num_envs, action_dim).
+        """
+        raise NotImplementedError(f"Please implement the '_pre_physics_step' method for {self.__class__.__name__}.")
+
+    @abstractmethod
+    def _apply_action(self):
+        """Apply actions to the simulator.
+
+        This function is responsible for applying the actions to the simulator. It is called at each
+        physics time-step.
+        """
+        raise NotImplementedError(f"Please implement the '_apply_action' method for {self.__class__.__name__}.")
+
+    @abstractmethod
+    def _get_observations(self) -> VecEnvObs:
+        """Compute and return the observations for the environment.
+
+        Returns:
+            The observations for the environment.
+        """
+        raise NotImplementedError(f"Please implement the '_get_observations' method for {self.__class__.__name__}.")
+
+    def _get_states(self) -> VecEnvObs | None:
+        """Compute and return the states for the environment.
+
+        The state-space is used for asymmetric actor-critic architectures. It is configured
+        using the :attr:`DirectRLEnvCfg.state_space` parameter.
+
+        Returns:
+            The states for the environment. If the environment does not have a state-space, the function
+            returns a None.
+        """
+        return None  # noqa: R501
+
+    @abstractmethod
+    def _get_rewards(self) -> torch.Tensor:
+        """Compute and return the rewards for the environment.
+
+        Returns:
+            The rewards for the environment. Shape is (num_envs,).
+        """
+        raise NotImplementedError(f"Please implement the '_get_rewards' method for {self.__class__.__name__}.")
+
+    @abstractmethod
+    def _get_dones(self) -> tuple[torch.Tensor, torch.Tensor]:
+        """Compute and return the done flags for the environment.
+
+        Returns:
+            A tuple containing the done flags for termination and time-out.
+            Shape of individual tensors is (num_envs,).
+        """
+        raise NotImplementedError(f"Please implement the '_get_dones' method for {self.__class__.__name__}.")


Abstract method type hints contradict the Warp-based implementation contract. The base-class abstract methods declare:

_pre_physics_step(actions: torch.Tensor) — but implementation passes wp.from_torch(action) and receives wp.array

_get_rewards() -> torch.Tensor — but InHandManipulationWarpEnv._get_rewards returns None and writes to Warp buffers

_get_dones() -> tuple[torch.Tensor, torch.Tensor] — but implementation returns None and writes to Warp buffers

Mismatched abstract signatures mislead subclass authors and type checkers. Update to match the actual contract:

Suggested change

def _pre_physics_step(self, actions: torch.Tensor):

"""Pre-process actions before stepping through the physics.

This function is responsible for pre-processing the actions before stepping through the physics.

It is called before the physics stepping (which is decimated).

Args:

actions: The actions to apply on the environment. Shape is (num_envs, action_dim).

"""

raise NotImplementedError(f"Please implement the '_pre_physics_step' method for {self.__class__.__name__}.")

@abstractmethod

def _apply_action(self):

"""Apply actions to the simulator.

This function is responsible for applying the actions to the simulator. It is called at each

physics time-step.

"""

raise NotImplementedError(f"Please implement the '_apply_action' method for {self.__class__.__name__}.")

@abstractmethod

def _get_observations(self) -> VecEnvObs:

"""Compute and return the observations for the environment.

Returns:

The observations for the environment.

"""

raise NotImplementedError(f"Please implement the '_get_observations' method for {self.__class__.__name__}.")

def _get_states(self) -> VecEnvObs | None:

"""Compute and return the states for the environment.

The state-space is used for asymmetric actor-critic architectures. It is configured

using the :attr:`DirectRLEnvCfg.state_space` parameter.

Returns:

The states for the environment. If the environment does not have a state-space, the function

returns a None.

"""

return None # noqa: R501

@abstractmethod

def _get_rewards(self) -> torch.Tensor:

"""Compute and return the rewards for the environment.

Returns:

The rewards for the environment. Shape is (num_envs,).

"""

raise NotImplementedError(f"Please implement the '_get_rewards' method for {self.__class__.__name__}.")

@abstractmethod

def _get_dones(self) -> tuple[torch.Tensor, torch.Tensor]:

"""Compute and return the done flags for the environment.

Returns:

A tuple containing the done flags for termination and time-out.

Shape of individual tensors is (num_envs,).

"""

raise NotImplementedError(f"Please implement the '_get_dones' method for {self.__class__.__name__}.")

@abstractmethod

def _pre_physics_step(self, actions: wp.array) -> None:

"""Pre-process actions before stepping through the physics."""

raise NotImplementedError(...)

@abstractmethod

def _get_rewards(self) -> None:

"""Compute and store rewards in Warp buffers."""

raise NotImplementedError(...)

@abstractmethod

def _get_dones(self) -> None:

"""Compute and store done flags in Warp buffers."""

raise NotImplementedError(...)

Type hints already use wp.array and -> None.

hujc7 · 2026-03-05T09:10:43Z

@greptileai Review

greptile-apps · 2026-03-05T09:16:21Z

+    description=EXTENSION_TOML_DATA["package"]["description"],
+    keywords=EXTENSION_TOML_DATA["package"]["keywords"],
+    include_package_data=True,
+    python_requires=">=3.10",


Sub-packages excluded from installation

packages=["isaaclab_tasks_experimental"] only includes the top-level package. After pip install ., the sub-packages isaaclab_tasks_experimental.direct, isaaclab_tasks_experimental.direct.allegro_hand, and isaaclab_tasks_experimental.direct.inhand_manipulation will not be installed. Any import isaaclab_tasks_experimental.direct.allegro_hand will raise ModuleNotFoundError.

The companion isaaclab_experimental/setup.py already uses find_packages() (correctly) — the same fix should be applied here:

Suggested change

python_requires=">=3.10",

from setuptools import find_packages, setup

and change:

packages=find_packages(),

greptile-apps · 2026-03-05T09:16:22Z

+
+from isaaclab_tasks.utils import import_packages
+
+# The blacklist is used to prevent importing configs from sub-packages
+_BLACKLIST_PKGS = ["utils", ".mdp"]
+# Import all configs in this package


Undeclared isaaclab_tasks dependency

import_packages is imported from isaaclab_tasks.utils, but isaaclab_tasks is not listed as a dependency anywhere — neither in extension.toml (which only declares isaaclab and isaaclab_assets) nor in setup.py (install_requires=[]). If a user has isaaclab_tasks_experimental installed without isaaclab_tasks, importing this package will fail with an ImportError.

Either add isaaclab_tasks to the dependency declarations, or make the import conditional:

Suggested change

from isaaclab_tasks.utils import import_packages

# The blacklist is used to prevent importing configs from sub-packages

_BLACKLIST_PKGS = ["utils", ".mdp"]

# Import all configs in this package

try:

from isaaclab_tasks.utils import import_packages

import_packages(__name__, _BLACKLIST_PKGS)

except ImportError:

pass

greptile-apps · 2026-03-05T09:16:27Z

+):
+    env_id = wp.tid()
+    if env_mask[env_id]:
+        rand0 = wp.randf(rng_state[env_id], wp.float32(-1.0), wp.float32(1.0))


goal_pos_w updated unconditionally for all environments

The last line of the reset_target_pose kernel:

goal_pos_w[env_id] = goal_pos[env_id] + env_origins[env_id]

…runs outside the if env_mask[env_id]: guard. Every time this kernel is launched — including the per-step call from _get_rewards() inside the CUDA graph (which may only need to reset goal-reaching envs) — all num_envs entries of goal_pos_w are unconditionally rewritten.

Because goal_pos and env_origins are constants, the result is always the same and the behavior is correct. However, the unconditional write inside a captured CUDA graph adds unnecessary write traffic on every step at scale (e.g. 8 192 environments). Consider moving this initialisation to a one-time setup step, or placing it inside the if env_mask[env_id]: branch if goal_pos_w only needs updating when a goal is reset.

Idempotent — goal_pos and env_origins are unchanged for non-masked envs. No correctness issue.

hujc7 · 2026-03-06T23:56:00Z

@greptileai Review

greptile-apps · 2026-03-07T00:03:24Z

+            delta_max = upper_limits[env_id, dof_id] - default_joint_pos[env_id, dof_id]
+            delta_min = lower_limits[env_id, dof_id] - default_joint_pos[env_id, dof_id]
+            rand_delta = delta_min + (delta_max - delta_min) * 0.5 * dof_pos_noise
+            pos = default_joint_pos[env_id, dof_id] + reset_dof_pos_noise * rand_delta


Biased reset distribution in reset_hand kernel

The rand_delta formula produces a distribution that is not uniform over [delta_min, delta_max]. With dof_pos_noise ~ U[-1, 1]:

# Current (incorrect): rand_delta = delta_min + (delta_max - delta_min) * 0.5 * dof_pos_noise # => rand_delta ~ U[delta_min - 0.5*range, delta_min + 0.5*range] # => the distribution is centred at delta_min, NOT at the midpoint

For example, if delta_min = -0.2 and delta_max = 0.2:

Current: rand_delta ~ U[-0.4, 0.0] — all reset positions are shifted downward.

Correct: rand_delta ~ U[-0.2, 0.2]

The standard pattern used by the Torch counterpart (and the original IsaacGym implementation) first maps the uniform sample to [0, 1]:

rand_delta = delta_min + (delta_max - delta_min) * wp.float32(0.5) * (wp.float32(1.0) + dof_pos_noise)

Without this fix the Warp env resets hand joints to systematically lower (or higher) positions than intended, which can negatively affect training.

Matches the torch reference implementation's behavior.

greptile-apps · 2026-03-07T00:03:25Z

+        if self.cfg.obs_type == "openai":
+            self.compute_reduced_observations()
+        elif self.cfg.obs_type == "full":
+            self.compute_full_observations()
+        else:
+            print("Unknown observations type!")
+        return {"policy": self.torch_obs_buf}


Unknown obs_type silently uses stale observations

When cfg.obs_type is neither "openai" nor "full", the observation buffer is never written and the policy trains on whatever garbage was left from the previous step. print() is not an adequate safeguard here — a misconfigured obs_type will cause silent training corruption rather than an immediate, actionable failure.

Consider raising a ValueError instead:

Suggested change

if self.cfg.obs_type == "openai":

self.compute_reduced_observations()

elif self.cfg.obs_type == "full":

self.compute_full_observations()

else:

print("Unknown observations type!")

return {"policy": self.torch_obs_buf}

if self.cfg.obs_type == "openai":

self.compute_reduced_observations()

elif self.cfg.obs_type == "full":

self.compute_full_observations()

else:

raise ValueError(f"Unknown observations type: '{self.cfg.obs_type}'. Expected 'openai' or 'full'.")

greptile-apps · 2026-03-07T00:03:26Z


    def write_data_to_sim(self):


env_mask kwarg may break sensors that don't support it

sensor.reset(env_ids, env_mask=env_mask) is called unconditionally for every sensor in the scene. If any sensor type's reset() signature does not accept an env_mask keyword argument, this will raise a TypeError at runtime — even for environments that never pass a mask.

The rigid_object.reset and articulation.reset calls are presumably updated to accept the new parameter, but sensors are heterogeneous (cameras, contact sensors, etc.) and may not all have been updated. At minimum, a guard is needed:

import inspect for sensor in self._sensors.values(): if "env_mask" in inspect.signature(sensor.reset).parameters: sensor.reset(env_ids, env_mask=env_mask) else: sensor.reset(env_ids)

Or, if env_mask is now part of the base sensor reset() contract, each sensor subclass should be audited to confirm it accepts the kwarg.

Done. Moved env_mask support to InteractiveSceneWarp in isaaclab_experimental, reverted base class.

greptile-apps · 2026-03-07T00:03:27Z

+    def _get_observations(self) -> dict:
+        # if self.cfg.asymmetric_obs:
+        #    self.fingertip_force_sensors = self.hand.root_physx_view.get_link_incoming_joint_force()[
+        #        :, self.finger_bodies
+        #    ]
+        if self.cfg.obs_type == "openai":
+            self.compute_reduced_observations()
+        elif self.cfg.obs_type == "full":
+            self.compute_full_observations()
+        else:
+            print("Unknown observations type!")
+        return {"policy": self.torch_obs_buf}
+


observation_space mismatch when using obs_type = "openai"

observation_space = 124 is sized for "full" observations, but the "openai" (reduced) variant only writes 35 elements (12 fingertip positions + 3 object pos + 4 relative rotation + 16 actions). If a subclass or experiment sets obs_type = "openai" without also changing observation_space, the remaining 89 elements of the observation buffer will contain uninitialised data that is silently fed to the policy.

Consider documenting the expected observation_space for each obs_type variant (e.g., 124 for "full", 35 for "openai") in a class docstring or config comment, and adding a runtime assertion:

_OBS_SIZES = {"full": 124, "openai": 35} def _get_observations(self) -> dict: expected = _OBS_SIZES.get(self.cfg.obs_type) if expected is not None and self.cfg.observation_space != expected: raise ValueError( f"obs_type='{self.cfg.obs_type}' requires observation_space={expected}, " f"but cfg.observation_space={self.cfg.observation_space}" ) ...

Acknowledged — openai obs_type is not the default and would need observation_space override.

hujc7 · 2026-03-07T19:30:07Z

@greptileai Review

greptile-apps · 2026-03-07T19:37:14Z

+    observations[env_id, offset + 0] = object_vels[env_id][0]
+    observations[env_id, offset + 1] = object_vels[env_id][1]
+    observations[env_id, offset + 2] = object_vels[env_id][2]
+    offset += 3
+
+    observations[env_id, offset + 0] = vel_obs_scale * object_vels[env_id][3]
+    observations[env_id, offset + 1] = vel_obs_scale * object_vels[env_id][4]
+    observations[env_id, offset + 2] = vel_obs_scale * object_vels[env_id][5]
+    offset += 3


Object velocity ordering and scaling inverted vs. Torch reference

In Newton/Warp, wp.spatial_vectorf stores (angular_x, angular_y, angular_z, linear_x, linear_y, linear_z) — angular velocity first. The current code therefore writes unscaled angular velocity at offset+0..+2 and scaled linear velocity at offset+3..+5.

The reference Torch InHandManipulationEnv observation is built as:

object_linvel, # linear velocity, unscaled — first vel_obs_scale * object_angvel, # angular velocity, scaled — second

This produces both a component-order swap and a scaling assignment swap. The trained observation vector will not match the Torch reference, which will break cross-environment policy transfer and produce a different observation distribution than intended.

The fix should swap the two blocks and their scaling:

# linear velocity (Warp indices 3–5) — unscaled, first observations[env_id, offset + 0] = object_vels[env_id][3] observations[env_id, offset + 1] = object_vels[env_id][4] observations[env_id, offset + 2] = object_vels[env_id][5] offset += 3 # angular velocity (Warp indices 0–2) — scaled, second observations[env_id, offset + 0] = vel_obs_scale * object_vels[env_id][0] observations[env_id, offset + 1] = vel_obs_scale * object_vels[env_id][1] observations[env_id, offset + 2] = vel_obs_scale * object_vels[env_id][2] offset += 3

- Add WarpGraphCache capture-or-replay utility to isaaclab_experimental/utils - Refactor direct_rl_env_warp to use WarpGraphCache, split step_warp_end into pre/post to run write_data_to_sim uncaptured between graph segments

hujc7 · 2026-03-07T20:12:45Z

@greptileai Review

greptile-apps · 2026-03-07T20:28:35Z

+        for deformable_object in self._deformable_objects.values():
+            deformable_object.reset(env_ids)
+        for rigid_object in self._rigid_objects.values():
+            rigid_object.reset(env_ids, env_mask=env_mask)
+        for surface_gripper in self._surface_grippers.values():
+            surface_gripper.reset(env_ids)


Deformable objects and surface grippers ignore the env_mask

When reset(env_ids=None, env_mask=some_partial_mask) is called, deformable objects and surface grippers both receive reset(env_ids=None), which falls through to their default behaviour of resetting all environments. This is semantically inconsistent: partial-reset callers expect only the masked envs to be touched, but any scene that contains deformable objects or surface grippers will silently over-reset.

While the current InHandManipulationWarpEnv doesn't use either asset type, this silent correctness hole will trip any future subclass that does. Consider converting env_mask to env_ids when calling reset on these assets, or at minimum adding a prominent comment noting the limitation:

for deformable_object in self._deformable_objects.values(): # NOTE: env_mask not yet supported; falls back to full reset. deformable_object.reset(env_ids) for surface_gripper in self._surface_grippers.values(): # NOTE: same caveat as above. surface_gripper.reset(env_ids)

hujc7 · 2026-03-11T06:11:18Z

Absorbed into #4905 which combines the inhand-cp and direct warp env changes.

## Summary Adds experimental warp infrastructure and direct warp environments from `dev/newton`, adapted for `develop`. Absorbs PR #4812 (inhand-cp). ### `isaaclab_experimental` * `DirectRLEnvWarp` base class with CUDA graph capture via `WarpGraphCache` * `InteractiveSceneWarp` with warp-native env_mask reset support * `episode_length_buf` property with in-place copy to preserve warp/torch shared memory ### `isaaclab_tasks_experimental` (direct envs) * **Cartpole** (`Isaac-Cartpole-Direct-Warp-v0`) * **Ant** (`Isaac-Ant-Direct-Warp-v0`) * **Humanoid** (`Isaac-Humanoid-Direct-Warp-v0`) * **Locomotion** base warp env (shared by ant/humanoid) * **InHand Manipulation** + **Allegro Hand** * Agent configs reference stable `isaaclab_tasks.direct.<env>.agents` directly — no duplication ### API adaptations for `develop` * `find_joints` 2-value return (indices, names) * `episode_length_buf` as property with in-place `copy_()` for warp/torch shared memory * `self._ALL_ENV_MASK` from base env * `set_joint_effort_target_mask` for CUDA graph compatibility * `_get_observations` returns `{"policy": tensor}` dict * `safe_normalize` to guard `wp.normalize` on zero-length vectors * Solver configs aligned with stable develop `PresetCfg` values ### Test results (rsl_rl, 4096 envs, 300 iterations, headless, `newton==1.0.0`) | Env | Status | Time | |-----|--------|------| | Cartpole | PASS | 70s | | Ant | PASS | 98s | | Humanoid | PASS | 172s | ## Test plan - [x] Cartpole: 300 iteration training converges - [x] Ant: 300 iteration training converges - [x] Humanoid: 300 iteration training converges --------- Signed-off-by: Antoine RICHARD <antoiner@nvidia.com> Co-authored-by: Antoine RICHARD <antoiner@nvidia.com>

## Summary Adds experimental warp infrastructure and direct warp environments from `dev/newton`, adapted for `develop`. Absorbs PR isaac-sim#4812 (inhand-cp). ### `isaaclab_experimental` * `DirectRLEnvWarp` base class with CUDA graph capture via `WarpGraphCache` * `InteractiveSceneWarp` with warp-native env_mask reset support * `episode_length_buf` property with in-place copy to preserve warp/torch shared memory ### `isaaclab_tasks_experimental` (direct envs) * **Cartpole** (`Isaac-Cartpole-Direct-Warp-v0`) * **Ant** (`Isaac-Ant-Direct-Warp-v0`) * **Humanoid** (`Isaac-Humanoid-Direct-Warp-v0`) * **Locomotion** base warp env (shared by ant/humanoid) * **InHand Manipulation** + **Allegro Hand** * Agent configs reference stable `isaaclab_tasks.direct.<env>.agents` directly — no duplication ### API adaptations for `develop` * `find_joints` 2-value return (indices, names) * `episode_length_buf` as property with in-place `copy_()` for warp/torch shared memory * `self._ALL_ENV_MASK` from base env * `set_joint_effort_target_mask` for CUDA graph compatibility * `_get_observations` returns `{"policy": tensor}` dict * `safe_normalize` to guard `wp.normalize` on zero-length vectors * Solver configs aligned with stable develop `PresetCfg` values ### Test results (rsl_rl, 4096 envs, 300 iterations, headless, `newton==1.0.0`) | Env | Status | Time | |-----|--------|------| | Cartpole | PASS | 70s | | Ant | PASS | 98s | | Humanoid | PASS | 172s | ## Test plan - [x] Cartpole: 300 iteration training converges - [x] Ant: 300 iteration training converges - [x] Humanoid: 300 iteration training converges --------- Signed-off-by: Antoine RICHARD <antoiner@nvidia.com> Co-authored-by: Antoine RICHARD <antoiner@nvidia.com>

github-actions Bot added documentation Improvements or additions to documentation isaac-lab Related to Isaac Lab team labels Mar 4, 2026

hujc7 force-pushed the develop-inhand-cp branch from 0c9f268 to 7c8e21d Compare March 5, 2026 07:54

hujc7 changed the title ~~Adds inhand manipulation warp env and warp-native actuator pipeline~~ Adds inhand manipulation warp env Mar 5, 2026

hujc7 marked this pull request as ready for review March 5, 2026 07:56

hujc7 requested review from ClemensSchwarke, Mayankm96 and ooctipus as code owners March 5, 2026 07:56

greptile-apps Bot reviewed Mar 5, 2026

View reviewed changes

hujc7 force-pushed the develop-inhand-cp branch from 7c8e21d to e200453 Compare March 5, 2026 09:09

greptile-apps Bot reviewed Mar 5, 2026

View reviewed changes

hujc7 mentioned this pull request Mar 5, 2026

[Exp] Cherry-pick manager-based warp env infrastructure from dev/newton #4829

Merged

1 task

hujc7 force-pushed the develop-inhand-cp branch from e200453 to 3270f85 Compare March 6, 2026 23:54

hujc7 requested a review from kellyguo11 as a code owner March 6, 2026 23:54

hujc7 force-pushed the develop-inhand-cp branch from 3270f85 to 3e6d953 Compare March 6, 2026 23:59

greptile-apps Bot reviewed Mar 7, 2026

View reviewed changes

hujc7 force-pushed the develop-inhand-cp branch from 3e6d953 to eb5c25e Compare March 7, 2026 19:26

greptile-apps Bot reviewed Mar 7, 2026

View reviewed changes

Adds inhand manipulation warp env and WarpGraphCache utility

0949f42

- Add WarpGraphCache capture-or-replay utility to isaaclab_experimental/utils - Refactor direct_rl_env_warp to use WarpGraphCache, split step_warp_end into pre/post to run write_data_to_sim uncaptured between graph segments

hujc7 force-pushed the develop-inhand-cp branch from eb5c25e to 0949f42 Compare March 7, 2026 20:02

greptile-apps Bot reviewed Mar 7, 2026

View reviewed changes

hujc7 mentioned this pull request Mar 11, 2026

[Exp] Cherry-pick direct warp envs from dev/newton #4905

Merged

3 tasks

hujc7 closed this Mar 11, 2026

		from isaaclab_experimental.envs import DirectRLEnvWarp

		from isaaclab.envs import DirectRLEnv, ManagerBasedEnv, ManagerBasedRLEnv

-        from isaaclab_experimental.envs import DirectRLEnvWarp
-        from isaaclab.envs import DirectRLEnv, ManagerBasedEnv, ManagerBasedRLEnv
+try:
+    from isaaclab_experimental.envs import DirectRLEnvWarp as _DirectRLEnvWarp
+    _warp_types = (_DirectRLEnvWarp,)
+except ImportError:
+    _warp_types = ()
+if not isinstance(env.unwrapped, (ManagerBasedRLEnv, ManagerBasedEnv, DirectRLEnv, *_warp_types)):
+    raise ValueError(
+        "The environment must be inherited from ManagerBasedRLEnv or DirectRLEnv. Environment type:"
+        f" {type(env)}"
+    )

-    packages=["isaaclab_experimental"],
+from setuptools import find_packages, setup
+setup(
+    # ... existing fields ...
+    packages=find_packages(),
+    # ... rest of setup ...
+)

-        wp.launch(
-            zero_mask_int32,
-            dim=self.num_envs,
-            inputs=[
-                mask,
-                self._episode_length_buf_wp,
-            ],
-        )
+def _reset_idx(self, mask: wp.array | torch.Tensor | None = None):
+    """Reset environments based on specified indices.
+    Args:
+        mask: A warp array or torch tensor indicating which environments to reset.
+    """
+    if mask is None:
+        mask = self._ALL_ENV_MASK
+    self.scene.reset(env_ids=None, env_mask=mask)
+    # ... rest of method ...

	python_requires=">=3.10",
	from setuptools import find_packages, setup

Conversation

hujc7 commented Mar 4, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

isaaclab_experimental

isaaclab_newton

isaaclab_rl

isaaclab_tasks_experimental

scripts

Notes

Known Problems / TODOs

Test plan

Uh oh!

hujc7 commented Mar 5, 2026

Uh oh!

hujc7 commented Mar 5, 2026

Uh oh!

greptile-apps Bot commented Mar 5, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Greptile Summary

Confidence Score: 4/5

Uh oh!

greptile-apps Bot Mar 5, 2026

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

greptile-apps Bot Mar 5, 2026

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

greptile-apps Bot Mar 5, 2026

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

greptile-apps Bot Mar 5, 2026

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

greptile-apps Bot Mar 5, 2026

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

hujc7 commented Mar 5, 2026

Uh oh!

greptile-apps Bot Mar 5, 2026

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

greptile-apps Bot Mar 5, 2026

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

greptile-apps Bot Mar 5, 2026

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

hujc7 commented Mar 6, 2026

Uh oh!

greptile-apps Bot Mar 7, 2026

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

greptile-apps Bot Mar 7, 2026

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

greptile-apps Bot Mar 7, 2026

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

hujc7 commented Mar 4, 2026 •

edited

Loading

greptile-apps Bot commented Mar 5, 2026 •

edited

Loading