From fc2c3215ee1811d43e2ba08118c0911254c30d4d Mon Sep 17 00:00:00 2001 From: jichuanh Date: Fri, 8 May 2026 21:48:58 +0000 Subject: [PATCH 1/4] Add warp environment docs MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Two new pages under experimental-features/newton-physics-integration/: - warp-environments.rst: overview of the experimental warp env path — the two workflows (direct, manager-based), task inventory, quick start, performance comparison vs the stable variants, which workflows benefit most, limitations (Newton-only physics, MDP coverage, kit sensor restrictions, capture-safety constraints, env_mask vs env_ids API delta), benchmarking how-to, and a checklist for adding new warp environments. - warp-env-migration.rst: pytorch -> warp migration guide. Covers the CUDA graph capture rationale, project layout, the kernel + launch pattern shared by all term types, observation dim resolution, the env_mask / env_ids switch for events, capture-safety rules, two-level parity testing (stable vs warp, and warp vs warp-captured), and the inventory of currently-implemented warp MDP terms. Both pages register in the section toctree. 
--- .../newton-physics-integration/index.rst | 3 + .../warp-env-migration.rst | 280 +++++++++++++++ .../warp-environments.rst | 331 ++++++++++++++++++ 3 files changed, 614 insertions(+) create mode 100644 docs/source/experimental-features/newton-physics-integration/warp-env-migration.rst create mode 100644 docs/source/experimental-features/newton-physics-integration/warp-environments.rst diff --git a/docs/source/experimental-features/newton-physics-integration/index.rst b/docs/source/experimental-features/newton-physics-integration/index.rst index afe783cc8716..52df7a68f512 100644 --- a/docs/source/experimental-features/newton-physics-integration/index.rst +++ b/docs/source/experimental-features/newton-physics-integration/index.rst @@ -38,6 +38,9 @@ For an overview of how the multi-backend architecture works, including how to ad :titlesonly: installation + warp-environments + training-environments + visualization limitations-and-known-bugs solver-transitioning using-kamino diff --git a/docs/source/experimental-features/newton-physics-integration/warp-env-migration.rst b/docs/source/experimental-features/newton-physics-integration/warp-env-migration.rst new file mode 100644 index 000000000000..b47c2c50282d --- /dev/null +++ b/docs/source/experimental-features/newton-physics-integration/warp-env-migration.rst @@ -0,0 +1,280 @@ +.. _warp-env-migration: + +Warp Environment Migration Guide +================================ + +This guide covers the key conventions and patterns used by the warp-first environment +infrastructure, useful for migrating existing stable environments or creating new ones +natively. For an overview of the warp env path itself (workflows, available envs, +performance, limitations, benchmarking), see :doc:`warp-environments`. + + +Design Rationale +~~~~~~~~~~~~~~~~ + +The warp environment path is built around `CUDA graph capture +`_. 
+A CUDA graph records a sequence of GPU operations (kernel launches, memory copies) during a +capture phase, then replays the entire sequence with a single launch. This eliminates per-kernel +CPU overhead — the parameter validation, kernel selection, and buffer setup that normally costs +20–200 μs per operation is performed once during graph instantiation and reused on every replay +(~10 μs total). All CPU-side code (Python logic, torch dispatching) executed during capture is +completely bypassed during replay. See the `Warp concurrency documentation +`_ for Warp's graph capture API +(``wp.ScopedCapture``). + +All design decisions in the warp infrastructure follow from this constraint: every operation in the +step loop must be a GPU kernel launch with stable memory pointers so that the captured graph can +be replayed without modification. + +Key consequences: + +- All buffers are **pre-allocated** — no dynamic allocation inside the step loop +- Data flows through **persistent ``wp.array`` pointers** — never replaced, only overwritten +- MDP terms are **pure ``@wp.kernel`` functions** — no Python branching on GPU data +- Reset uses **boolean masks** (``env_mask``) instead of index lists (``env_ids``) to avoid + variable-length indexing that changes graph topology + + +Project Structure +~~~~~~~~~~~~~~~~~ + +Warp-specific implementations that deviate from stable live in the ``_experimental`` packages: + +- ``isaaclab_experimental`` — warp managers, base env classes, warp MDP terms +- ``isaaclab_tasks_experimental`` — warp task configs and task-specific MDP terms + +Any new warp implementation that differs from the stable API belongs in these packages. +Warp task configs reference Newton physics directly (no ``PresetCfg``) since the warp path +is Newton-only. + + +Writing Warp MDP Terms +~~~~~~~~~~~~~~~~~~~~~~ + +Imports +^^^^^^^ + +Warp task configs import from the experimental packages: + +.. 
code-block:: python + + # Warp + from isaaclab_experimental.managers import ObservationTermCfg, RewardTermCfg, SceneEntityCfg + import isaaclab_experimental.envs.mdp as mdp + +The term config classes have the same interface — only the import path changes. + + +Common Pattern +^^^^^^^^^^^^^^ + +All warp MDP terms (observations, rewards, terminations, events, actions) follow the same +**kernel + launch** pattern. Stable terms use torch tensors and return results; warp terms +write into pre-allocated ``wp.array`` output buffers via ``@wp.kernel`` functions: + +.. code-block:: python + + # Stable — returns a tensor + def lin_vel_z_l2(env, asset_cfg) -> torch.Tensor: + return torch.square(asset.data.root_lin_vel_b[:, 2]) + + # Warp — writes into pre-allocated output + @wp.kernel + def _lin_vel_z_l2_kernel(vel: wp.array(...), out: wp.array(dtype=wp.float32)): + i = wp.tid() + out[i] = vel[i][2] * vel[i][2] + + def lin_vel_z_l2(env, out, asset_cfg) -> None: + wp.launch(_lin_vel_z_l2_kernel, dim=env.num_envs, inputs=[..., out]) + +The output buffer shapes differ by term type: + +- **Observations**: ``(num_envs, D)`` where D is the observation dimension +- **Rewards**: ``(num_envs,)`` +- **Terminations**: ``(num_envs,)`` with dtype ``bool`` +- **Events**: ``(num_envs,)`` mask — events don't produce output, they modify sim state + + +Observation Terms +^^^^^^^^^^^^^^^^^ + +Since warp terms write into pre-allocated buffers, the observation manager must know each +term's output dimension at initialization to allocate the correct ``(num_envs, D)`` output +array. This is resolved via a fallback chain (see +``ObservationManager._infer_term_dim_scalar`` in +``isaaclab_experimental/managers/observation_manager.py``): + +1. **Explicit ``out_dim`` in decorator** (preferred): + + .. code-block:: python + + @generic_io_descriptor_warp(out_dim=3, observation_type="RootState") + def base_lin_vel(env, out, asset_cfg) -> None: ... 
+ + ``out_dim`` can be an integer, or a string that resolves at initialization: + + - ``"joint"`` — number of selected joints from ``asset_cfg`` + - ``"body:N"`` — N components per selected body from ``asset_cfg`` + - ``"command"`` — dimension from command manager + - ``"action"`` — dimension from action manager + +2. **``axes`` metadata**: Dimension equals the number of axes listed: + + .. code-block:: python + + @generic_io_descriptor_warp(axes=["X", "Y", "Z"], observation_type="RootState") + def projected_gravity(env, out, asset_cfg) -> None: ... + # → dimension = 3 + +3. **Legacy params**: ``term_dim``, ``out_dim``, or ``obs_dim`` keys in ``term_cfg.params``. + +4. **Asset config fallback**: Count of ``asset_cfg.joint_ids`` (or ``joint_ids_wp``) for + joint-level terms. + + +Event Terms +^^^^^^^^^^^ + +Events use ``env_mask`` (boolean ``wp.array``) instead of ``env_ids``, and each kernel +checks the mask to skip non-selected environments: + +.. code-block:: python + + def reset_joints_by_offset(env, env_mask, ...): + wp.launch(_kernel, dim=env.num_envs, inputs=[env_mask, ...]) + + @wp.kernel + def _kernel(env_mask: wp.array(dtype=wp.bool), ...): + i = wp.tid() + if not env_mask[i]: + return + # ... modify state for selected envs only + +- RNG uses per-env ``env.rng_state_wp`` (``wp.uint32``) instead of ``torch.rand`` +- **Startup/prestartup** events use the stable convention ``(env, env_ids, **params)`` +- **Reset/interval** events use the warp convention ``(env, env_mask, **params)`` + + +Action Terms +^^^^^^^^^^^^ + +Actions follow a **two-stage execution**: ``process_actions`` (called once per env step) scales +and clips raw actions, and ``apply_actions`` (called once per sim step) writes targets to the +asset. Both stages use warp kernels with pre-allocated ``_raw_actions`` and ``_processed_actions`` +buffers. 
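The two-stage action flow described above can be sketched in plain Python. This is a minimal illustration only, not the Isaac Lab implementation: the class name ``JointPositionActionSketch``, the ``env_step`` helper, the ``decimation`` parameter, and the ``joint_targets`` list are hypothetical stand-ins, and nested lists take the place of the pre-allocated ``wp.array`` buffers that the real terms fill from Warp kernels.

```python
class JointPositionActionSketch:
    """Plain-Python stand-in for a two-stage action term (hypothetical)."""

    def __init__(self, num_envs, num_joints, scale, clip):
        self.scale = scale
        self.clip = clip  # (low, high) bounds applied after scaling
        # Buffers are allocated once here and only overwritten afterwards,
        # mirroring the "persistent pointer" rule for captured graphs.
        self._raw_actions = [[0.0] * num_joints for _ in range(num_envs)]
        self._processed_actions = [[0.0] * num_joints for _ in range(num_envs)]
        self.joint_targets = [[0.0] * num_joints for _ in range(num_envs)]
        self.apply_calls = 0

    def process_actions(self, actions):
        """Stage 1 (once per env step): store raw actions, then scale and clip."""
        low, high = self.clip
        for i, row in enumerate(actions):
            for j, value in enumerate(row):
                self._raw_actions[i][j] = value
                scaled = value * self.scale
                self._processed_actions[i][j] = min(max(scaled, low), high)

    def apply_actions(self):
        """Stage 2 (once per sim step): write processed actions to the targets."""
        for i, row in enumerate(self._processed_actions):
            for j, value in enumerate(row):
                self.joint_targets[i][j] = value
        self.apply_calls += 1


def env_step(term, actions, decimation):
    """One env step: process once, then apply on every simulated sub-step."""
    term.process_actions(actions)
    for _ in range(decimation):
        term.apply_actions()
```

Calling ``env_step(term, actions, decimation=4)`` runs ``process_actions`` once and ``apply_actions`` four times while every buffer keeps its identity, which is the property a captured graph relies on.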
+ + +Capture Safety +^^^^^^^^^^^^^^ + +When writing terms that run inside the captured step loop, keep in mind: + +- **No ``wp.to_torch``** or torch arithmetic — stay in warp throughout +- **No lazy-evaluated properties** — use sim-bound (Tier 1) data directly; if a derived + quantity is needed, compute it inline in the kernel +- **No dynamic allocation** — all buffers must be pre-allocated in ``__init__`` + + +Parity Testing +~~~~~~~~~~~~~~ + +Two levels of parity testing are used to validate warp terms: + +**1. Implementation parity (stable vs warp)** — verifies that the warp kernel produces the +same result as the stable torch implementation. This is optional for terms that have no stable +counterpart (e.g. new terms written directly in warp). + +.. code-block:: python + + import isaaclab.envs.mdp.observations as stable_obs + import isaaclab_experimental.envs.mdp.observations as warp_obs + + # Stable baseline + expected = stable_obs.joint_pos(stable_env, asset_cfg=cfg) + + # Warp (uncaptured) + out = wp.zeros((num_envs, num_joints), dtype=wp.float32, device=device) + warp_obs.joint_pos(warp_env, out, asset_cfg=cfg) + actual = wp.to_torch(out) + + torch.testing.assert_close(actual, expected) + +**2. Capture parity (warp vs warp-captured)** — verifies that the term produces identical +results when replayed from a CUDA graph vs launched directly. A mismatch here indicates capture-unsafe +code (e.g. stale pointers, dynamic allocation, or lazy property access that doesn't replay). +This test should always be run, even for terms without a stable counterpart. + +.. 
code-block:: python + + # Warp uncaptured + out_uncaptured = wp.zeros((num_envs, num_joints), dtype=wp.float32, device=device) + warp_obs.joint_pos(warp_env, out_uncaptured, asset_cfg=cfg) + + # Warp captured (graph replay) + out_captured = wp.zeros((num_envs, num_joints), dtype=wp.float32, device=device) + with wp.ScopedCapture() as cap: + warp_obs.joint_pos(warp_env, out_captured, asset_cfg=cfg) + wp.capture_launch(cap.graph) + + torch.testing.assert_close(wp.to_torch(out_captured), wp.to_torch(out_uncaptured)) + +See ``source/isaaclab_experimental/test/envs/mdp/`` for complete parity test examples. + + +Available Warp MDP Terms +~~~~~~~~~~~~~~~~~~~~~~~~ + +.. list-table:: + :header-rows: 1 + :widths: 20 80 + + * - Category + - Available Terms + * - Observations (11) + - | ``base_pos_z`` + | ``base_lin_vel`` + | ``base_ang_vel`` + | ``projected_gravity`` + | ``joint_pos`` + | ``joint_pos_rel`` + | ``joint_pos_limit_normalized`` + | ``joint_vel`` + | ``joint_vel_rel`` + | ``last_action`` + | ``generated_commands`` + * - Rewards (16) + - | ``is_alive`` + | ``is_terminated`` + | ``lin_vel_z_l2`` + | ``ang_vel_xy_l2`` + | ``flat_orientation_l2`` + | ``joint_torques_l2`` + | ``joint_vel_l1`` + | ``joint_vel_l2`` + | ``joint_acc_l2`` + | ``joint_deviation_l1`` + | ``joint_pos_limits`` + | ``action_rate_l2`` + | ``action_l2`` + | ``undesired_contacts`` + | ``track_lin_vel_xy_exp`` + | ``track_ang_vel_z_exp`` + * - Events (6) + - | ``reset_joints_by_offset`` + | ``reset_joints_by_scale`` + | ``reset_root_state_uniform`` + | ``push_by_setting_velocity`` + | ``apply_external_force_torque`` + | ``randomize_rigid_body_com`` + * - Terminations (4) + - | ``time_out`` + | ``root_height_below_minimum`` + | ``joint_pos_out_of_manual_limit`` + | ``illegal_contact`` + * - Actions (2) + - | ``JointPositionAction`` + | ``JointEffortAction`` + +Terms not listed here remain in stable only. When using an env that requires unlisted terms, +those terms must be implemented in warp first. 
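One way to act on the inventory above is to check an env's required terms against it before starting a port. The sketch below is a hypothetical helper, not part of Isaac Lab: ``WARP_TERMS`` is transcribed by hand from the table in this page, and ``missing_warp_terms`` and the example term names ``height_scan`` and ``feet_air_time`` are illustrative assumptions.

```python
# Inventory transcribed from the "Available Warp MDP Terms" table (assumed current).
WARP_TERMS = {
    "observations": {
        "base_pos_z", "base_lin_vel", "base_ang_vel", "projected_gravity",
        "joint_pos", "joint_pos_rel", "joint_pos_limit_normalized",
        "joint_vel", "joint_vel_rel", "last_action", "generated_commands",
    },
    "rewards": {
        "is_alive", "is_terminated", "lin_vel_z_l2", "ang_vel_xy_l2",
        "flat_orientation_l2", "joint_torques_l2", "joint_vel_l1",
        "joint_vel_l2", "joint_acc_l2", "joint_deviation_l1",
        "joint_pos_limits", "action_rate_l2", "action_l2",
        "undesired_contacts", "track_lin_vel_xy_exp", "track_ang_vel_z_exp",
    },
    "events": {
        "reset_joints_by_offset", "reset_joints_by_scale",
        "reset_root_state_uniform", "push_by_setting_velocity",
        "apply_external_force_torque", "randomize_rigid_body_com",
    },
    "terminations": {
        "time_out", "root_height_below_minimum",
        "joint_pos_out_of_manual_limit", "illegal_contact",
    },
    "actions": {"JointPositionAction", "JointEffortAction"},
}


def missing_warp_terms(required):
    """Return, per category, the required terms that have no warp version yet."""
    missing = {}
    for category, names in required.items():
        gap = sorted(set(names) - WARP_TERMS.get(category, set()))
        if gap:
            missing[category] = gap
    return missing
```

For example, passing ``{"observations": ["joint_pos", "height_scan"], "rewards": ["is_alive", "feet_air_time"]}`` reports ``height_scan`` and ``feet_air_time`` as the terms that would need a warp implementation first.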
diff --git a/docs/source/experimental-features/newton-physics-integration/warp-environments.rst b/docs/source/experimental-features/newton-physics-integration/warp-environments.rst new file mode 100644 index 000000000000..c1107741239b --- /dev/null +++ b/docs/source/experimental-features/newton-physics-integration/warp-environments.rst @@ -0,0 +1,331 @@ +.. _warp-environments: + +Warp Experimental Environments +============================== + +.. note:: + + The warp environment infrastructure lives in ``isaaclab_experimental`` and + ``isaaclab_tasks_experimental``. It's an experimental feature. + +The experimental extensions introduce **warp-first** environment infrastructure with CUDA graph capture +support. All environment-side computation (observations, rewards, resets, actions) runs as pure Warp +kernels, eliminating Python overhead and enabling CUDA graph capture for maximum throughput. + + +Workflows +~~~~~~~~~ + +Two environment workflows are supported: + +**Direct workflow** — ``DirectRLEnvWarp`` base class. You implement the step loop, observations, +rewards, and resets directly in your env class using Warp kernels. + +**Manager-based workflow** — ``ManagerBasedRLEnvWarp`` base class. You define MDP terms as +standalone Warp-kernel functions and compose them via configuration. 
+ + +Available Environments +~~~~~~~~~~~~~~~~~~~~~~ + +Direct Warp Environments +^^^^^^^^^^^^^^^^^^^^^^^^ + +- ``Isaac-Cartpole-Direct-Warp-v0`` — Cartpole balance +- ``Isaac-Ant-Direct-Warp-v0`` — Ant locomotion +- ``Isaac-Humanoid-Direct-Warp-v0`` — Humanoid locomotion +- ``Isaac-Repose-Cube-Allegro-Direct-Warp-v0`` — Allegro hand cube repose + + +Manager-Based Warp Environments +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +**Classic** + +- ``Isaac-Cartpole-Warp-v0`` +- ``Isaac-Ant-Warp-v0`` +- ``Isaac-Humanoid-Warp-v0`` + +**Locomotion (Flat)** + +- ``Isaac-Velocity-Flat-Anymal-B-Warp-v0`` +- ``Isaac-Velocity-Flat-Anymal-C-Warp-v0`` +- ``Isaac-Velocity-Flat-Anymal-D-Warp-v0`` +- ``Isaac-Velocity-Flat-Cassie-Warp-v0`` +- ``Isaac-Velocity-Flat-G1-Warp-v0`` +- ``Isaac-Velocity-Flat-G1-Warp-v1`` +- ``Isaac-Velocity-Flat-H1-Warp-v0`` +- ``Isaac-Velocity-Flat-Unitree-A1-Warp-v0`` +- ``Isaac-Velocity-Flat-Unitree-Go1-Warp-v0`` +- ``Isaac-Velocity-Flat-Unitree-Go2-Warp-v0`` + +**Manipulation** + +- ``Isaac-Reach-Franka-Warp-v0`` +- ``Isaac-Reach-UR10-Warp-v0`` + + +Quick Start +~~~~~~~~~~~ + +.. code-block:: bash + + # Direct workflow + ./isaaclab.sh -p scripts/reinforcement_learning/rsl_rl/train.py \ + --task Isaac-Cartpole-Direct-Warp-v0 --num_envs 4096 --headless + + # Manager-based workflow + ./isaaclab.sh -p scripts/reinforcement_learning/rsl_rl/train.py \ + --task Isaac-Velocity-Flat-Anymal-C-Warp-v0 --num_envs 4096 --headless + +All RL libraries with warp-compatible wrappers are supported: RSL-RL, RL Games, SKRL, and +Stable-Baselines3. + + +Performance Comparison +~~~~~~~~~~~~~~~~~~~~~~ + +Step time comparison between the stable (torch/manager) and warp (CUDA graph captured) variants, +both running on the Newton physics backend. Measured over 300 iterations with 4096 environments. + +.. note:: + + The warp migration is an ongoing effort. Several components (e.g. scene write, actuator models) + have not yet been migrated to Warp kernels and still run through torch. 
Further performance + improvements are expected as these components are migrated. + +.. list-table:: + :header-rows: 1 + :widths: 30 12 15 15 12 + + * - Env + - Type + - Stable Step (us) + - Warp Step (us) + - Change + * - Cartpole-Direct + - Direct + - 5,274 + - 4,331 + - -17.88% + * - Ant-Direct + - Direct + - 6,368 + - 3,128 + - -50.88% + * - Humanoid-Direct + - Direct + - 13,937 + - 10,783 + - -22.63% + * - Allegro-Direct + - Direct + - 82,950 + - 74,570 + - -10.10% + * - Cartpole + - Manager + - 7,971 + - 3,642 + - -54.31% + * - Ant + - Manager + - 9,781 + - 4,672 + - -52.23% + * - Humanoid + - Manager + - 17,653 + - 12,505 + - -29.16% + * - Reach-Franka + - Manager + - 11,458 + - 7,813 + - -31.83% + * - Anymal-B + - Manager + - 29,188 + - 21,781 + - -25.38% + * - Anymal-C + - Manager + - 30,938 + - 22,228 + - -28.15% + * - Anymal-D + - Manager + - 32,294 + - 23,977 + - -25.75% + * - Cassie + - Manager + - 17,320 + - 10,706 + - -38.19% + * - G1 + - Manager + - 34,487 + - 27,300 + - -20.84% + * - H1 + - Manager + - 22,202 + - 15,864 + - -28.55% + * - A1 + - Manager + - 15,257 + - 9,907 + - -35.07% + * - Go1 + - Manager + - 16,515 + - 11,869 + - -28.13% + * - Go2 + - Manager + - 15,221 + - 9,966 + - -34.52% + + +Which Workflows Benefit Most +~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +The savings come from eliminating Python / torch overhead in the env's step loop, so envs +gain in proportion to how much of their step time was previously dominated by per-kernel CPU +overhead. Reading the table above: + +- **Manager-based classic RL** (Cartpole, Ant) — biggest gains (-52% to -54%). Many small + reward / observation terms with low compute per term, so per-launch CPU overhead dominated + the stable baseline. +- **Manager-based locomotion** (Anymal, G1, H1, Cassie, Unitree) — consistent -21% to -38% + range. The MDP has more terms but the underlying physics step is heavier, so the relative + Python savings shrink. 
+- **Direct workflow** — gains scale with how much the env's step body was Python (Ant -51%, + Cartpole -18%, Allegro hand -10%). Direct envs that already wrote most of their work as + GPU kernels see modest gains; ones with substantial Python state machinery see large ones. +- **Compute-heavy / scene-write-heavy envs** (Allegro hand, large humanoids) — see smaller + relative gains because the warp-side savings are amortised over a heavier step. Components + that still go through torch (scene write, actuator models) currently bound the floor; this + is expected to improve as remaining components migrate to warp. + +If your env's step time is dominated by physics or scene I/O, expect modest gains. If it has +many small MDP terms or a lot of Python in the step loop, expect large ones. Use the +benchmarking workflow below to measure on your task before committing to a migration. + + +Limitations +~~~~~~~~~~~ + +The warp env path is experimental and has the following known constraints. These are +specific to warp envs; for Newton physics limitations see :doc:`limitations-and-known-bugs`. + +**Physics backend** + +- **Newton only.** PhysX is not supported under the warp env path. Asset and sensor + ``class_type`` fields resolve to ``isaaclab_physx.*`` classes that depend on + ``omni.physics.tensors`` (a Kit module the warp runtime does not initialise), and several + warp APIs (env-mask reset, CUDA graph capture) require the Newton articulation. Configure + the cfg with a Newton physics block (or ``presets=newton``). + +**MDP coverage** + +- Only the terms listed under :ref:`Available Warp MDP Terms ` are + implemented. Stable envs that depend on un-migrated terms cannot be run on the warp path + until those terms are ported. +- Some scene-side operations (asset write, actuator models, certain sensor types) still go + through torch. They participate in the step but are not yet captured into the graph; they + set the lower bound on observed step time. 
+- Sensors that depend on the Kit RTX renderer (camera-based observations) cannot be combined + with the warp env path — they need Kit, which the warp runtime does not initialise. + +**API differences vs stable** + +- Reset events use a boolean ``env_mask`` (``wp.array(dtype=wp.bool)``) instead of an + ``env_ids`` list. This is required for capture safety: variable-length indexing changes + graph topology and breaks replay. +- All buffers must be pre-allocated in ``__init__``. There is no dynamic allocation inside + the captured step loop, so observation / reward / termination output dimensions must be + known at env init. +- Term functions write into a pre-allocated ``out`` buffer rather than returning a tensor. + See :doc:`warp-env-migration` for the kernel + launch pattern. +- Code inside the captured step loop must follow capture-safety rules (no + ``wp.to_torch``, no torch arithmetic, no lazy-evaluated properties, no Python branching + on GPU data). See the *Capture Safety* section in :doc:`warp-env-migration` for the + full set of rules. + + +Benchmarking Your Environment +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +The performance table above was produced with ``scripts/benchmarks/benchmark_rsl_rl.py``, +which runs a fixed iteration count and reports step-time statistics. Use the same script +to estimate the gain for your own task before committing to a migration. + +**Single-task A/B** + +.. code-block:: bash + + # Stable variant + ./isaaclab.sh -p scripts/benchmarks/benchmark_rsl_rl.py \ + --task -v0 \ + --num_envs 4096 \ + --max_iterations 500 \ + --headless \ + --benchmark_backend summary \ + --output_path benchmarks/stable + + # Warp variant — same task with -Warp- suffix + ./isaaclab.sh -p scripts/benchmarks/benchmark_rsl_rl.py \ + --task -Warp-v0 \ + --num_envs 4096 \ + --max_iterations 500 \ + --headless \ + --benchmark_backend summary \ + --output_path benchmarks/warp + +The ``summary`` backend prints step time (mean / p50 / p99) and total throughput. 
Compare +"step time" between the two runs to estimate the gain per env step. + +**Sweep across all available tasks** + +``scripts/benchmarks/run_training_benchmarks.sh`` runs the full set of stable tasks listed +in the script (cartpole, ant, humanoid, locomotion, manipulation). Pair it with a +warp-tasks variant (substitute the ``-Warp-`` suffixed task ids) and diff the two outputs. + +**What to look at in the output** + +- *Step time (mean / p99)*: the headline number — what each env step costs. +- *Iteration time*: includes policy update; useful for end-to-end training throughput. +- *Capture overhead*: for warp runs, the first few iterations include CUDA graph capture + cost; exclude those when comparing steady-state numbers. + +**Estimating before you migrate** + +If you can't run the warp variant yet (e.g. the task isn't ported), measure the stable +step time and look at where it's spent: + +- ``num_envs * step_time`` dominated by physics → expect modest warp gains. +- ``step_time`` dominated by ``manager.compute_*`` calls → expect large gains, since those + are exactly what the warp managers replace with captured kernel launches. + +Use ``--num_frames`` on ``benchmark_non_rl.py`` for a no-policy step-time microbenchmark +when you want to isolate env overhead from policy compute. + + +Migrating Existing Environments +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +For step-by-step instructions on porting an existing stable env (or writing a new warp +env from scratch) — covering project layout, the kernel + launch pattern shared by +observations / rewards / events / terminations / actions, capture-safety rules, and +parity testing — see :doc:`warp-env-migration` below. + + +.. 
toctree:: + :maxdepth: 2 + :hidden: + + warp-env-migration From 7283d24388157a63931512a9fcd804cddc70f870 Mon Sep 17 00:00:00 2001 From: jichuanh Date: Mon, 18 May 2026 03:22:56 +0000 Subject: [PATCH 2/4] [Docs] Fix newton toctree and clarify warp-env-migration wording - Drop unwritten 'training-environments' and 'visualization' entries from the newton-physics-integration toctree; add 'warp-env-migration' which is included in this PR. - In warp-env-migration.rst, refer to the non-experimental implementation as 'torch' rather than 'stable' so the warp/torch contrast is explicit. --- .../newton-physics-integration/index.rst | 3 +-- .../warp-env-migration.rst | 26 +++++++++---------- 2 files changed, 14 insertions(+), 15 deletions(-) diff --git a/docs/source/experimental-features/newton-physics-integration/index.rst b/docs/source/experimental-features/newton-physics-integration/index.rst index 52df7a68f512..b93c5a3fd2c0 100644 --- a/docs/source/experimental-features/newton-physics-integration/index.rst +++ b/docs/source/experimental-features/newton-physics-integration/index.rst @@ -39,8 +39,7 @@ For an overview of how the multi-backend architecture works, including how to ad installation warp-environments - training-environments - visualization + warp-env-migration limitations-and-known-bugs solver-transitioning using-kamino diff --git a/docs/source/experimental-features/newton-physics-integration/warp-env-migration.rst b/docs/source/experimental-features/newton-physics-integration/warp-env-migration.rst index b47c2c50282d..a88b0fae5e5c 100644 --- a/docs/source/experimental-features/newton-physics-integration/warp-env-migration.rst +++ b/docs/source/experimental-features/newton-physics-integration/warp-env-migration.rst @@ -4,7 +4,7 @@ Warp Environment Migration Guide ================================ This guide covers the key conventions and patterns used by the warp-first environment -infrastructure, useful for migrating existing stable environments or creating new 
ones +infrastructure, useful for migrating existing torch environments or creating new ones natively. For an overview of the warp env path itself (workflows, available envs, performance, limitations, benchmarking), see :doc:`warp-environments`. @@ -39,12 +39,12 @@ Key consequences: Project Structure ~~~~~~~~~~~~~~~~~ -Warp-specific implementations that deviate from stable live in the ``_experimental`` packages: +Warp-specific implementations that diverge from the torch API live in the ``_experimental`` packages: - ``isaaclab_experimental`` — warp managers, base env classes, warp MDP terms - ``isaaclab_tasks_experimental`` — warp task configs and task-specific MDP terms -Any new warp implementation that differs from the stable API belongs in these packages. +Any new warp implementation that differs from the torch API belongs in these packages. Warp task configs reference Newton physics directly (no ``PresetCfg``) since the warp path is Newton-only. @@ -70,12 +70,12 @@ Common Pattern ^^^^^^^^^^^^^^ All warp MDP terms (observations, rewards, terminations, events, actions) follow the same -**kernel + launch** pattern. Stable terms use torch tensors and return results; warp terms +**kernel + launch** pattern. Torch terms use torch tensors and return results; warp terms write into pre-allocated ``wp.array`` output buffers via ``@wp.kernel`` functions: .. code-block:: python - # Stable — returns a tensor + # Torch — returns a tensor def lin_vel_z_l2(env, asset_cfg) -> torch.Tensor: return torch.square(asset.data.root_lin_vel_b[:, 2]) @@ -152,7 +152,7 @@ checks the mask to skip non-selected environments: # ... 
modify state for selected envs only - RNG uses per-env ``env.rng_state_wp`` (``wp.uint32``) instead of ``torch.rand`` -- **Startup/prestartup** events use the stable convention ``(env, env_ids, **params)`` +- **Startup/prestartup** events use the torch convention ``(env, env_ids, **params)`` - **Reset/interval** events use the warp convention ``(env, env_mask, **params)`` @@ -181,17 +181,17 @@ Parity Testing Two levels of parity testing are used to validate warp terms: -**1. Implementation parity (stable vs warp)** — verifies that the warp kernel produces the -same result as the stable torch implementation. This is optional for terms that have no stable +**1. Implementation parity (torch vs warp)** — verifies that the warp kernel produces the +same result as the torch implementation. This is optional for terms that have no torch counterpart (e.g. new terms written directly in warp). .. code-block:: python - import isaaclab.envs.mdp.observations as stable_obs + import isaaclab.envs.mdp.observations as torch_obs import isaaclab_experimental.envs.mdp.observations as warp_obs - # Stable baseline - expected = stable_obs.joint_pos(stable_env, asset_cfg=cfg) + # Torch baseline + expected = torch_obs.joint_pos(torch_env, asset_cfg=cfg) # Warp (uncaptured) out = wp.zeros((num_envs, num_joints), dtype=wp.float32, device=device) @@ -203,7 +203,7 @@ counterpart (e.g. new terms written directly in warp). **2. Capture parity (warp vs warp-captured)** — verifies that the term produces identical results when replayed from a CUDA graph vs launched directly. A mismatch here indicates capture-unsafe code (e.g. stale pointers, dynamic allocation, or lazy property access that doesn't replay). -This test should always be run, even for terms without a stable counterpart. +This test should always be run, even for terms without a torch counterpart. .. 
code-block:: python @@ -276,5 +276,5 @@ Available Warp MDP Terms - | ``JointPositionAction`` | ``JointEffortAction`` -Terms not listed here remain in stable only. When using an env that requires unlisted terms, +Terms not listed here remain in torch only. When using an env that requires unlisted terms, those terms must be implemented in warp first. From 8424278a9db3e65f663f42a6326a6db32ae9f10f Mon Sep 17 00:00:00 2001 From: jichuanh Date: Mon, 18 May 2026 03:29:51 +0000 Subject: [PATCH 3/4] [Docs] Fix Warp concurrency doc URL Use https://nvidia.github.io/warp/stable/deep_dive/concurrency.html (404 on the old non-stable path). --- .../newton-physics-integration/warp-env-migration.rst | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/source/experimental-features/newton-physics-integration/warp-env-migration.rst b/docs/source/experimental-features/newton-physics-integration/warp-env-migration.rst index a88b0fae5e5c..14097581d373 100644 --- a/docs/source/experimental-features/newton-physics-integration/warp-env-migration.rst +++ b/docs/source/experimental-features/newton-physics-integration/warp-env-migration.rst @@ -20,7 +20,7 @@ CPU overhead — the parameter validation, kernel selection, and buffer setup th 20–200 μs per operation is performed once during graph instantiation and reused on every replay (~10 μs total). All CPU-side code (Python logic, torch dispatching) executed during capture is completely bypassed during replay. See the `Warp concurrency documentation -`_ for Warp's graph capture API +<https://nvidia.github.io/warp/stable/deep_dive/concurrency.html>`_ for Warp's graph capture API (``wp.ScopedCapture``). 
All design decisions in the warp infrastructure follow from this constraint: every operation in the From 977808724cc31349f86c0a6b8129810de49e565d Mon Sep 17 00:00:00 2001 From: jichuanh Date: Mon, 18 May 2026 03:31:33 +0000 Subject: [PATCH 4/4] [Docs] Reword torch-API contrast to torch-based managers / env classes --- .../newton-physics-integration/warp-env-migration.rst | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/docs/source/experimental-features/newton-physics-integration/warp-env-migration.rst b/docs/source/experimental-features/newton-physics-integration/warp-env-migration.rst index 14097581d373..468ced739b4a 100644 --- a/docs/source/experimental-features/newton-physics-integration/warp-env-migration.rst +++ b/docs/source/experimental-features/newton-physics-integration/warp-env-migration.rst @@ -39,12 +39,12 @@ Key consequences: Project Structure ~~~~~~~~~~~~~~~~~ -Warp-specific implementations that diverge from the torch API live in the ``_experimental`` packages: +Warp-specific implementations that diverge from the torch-based managers and env classes live in the ``_experimental`` packages: - ``isaaclab_experimental`` — warp managers, base env classes, warp MDP terms - ``isaaclab_tasks_experimental`` — warp task configs and task-specific MDP terms -Any new warp implementation that differs from the torch API belongs in these packages. +Any new warp implementation that differs from the torch-based managers or env classes belongs in these packages. Warp task configs reference Newton physics directly (no ``PresetCfg``) since the warp path is Newton-only.