Skip to content

Update skrl integration for version 2.0.0#5311

Merged
kellyguo11 merged 37 commits into
isaac-sim:mainfrom
Toni-SM:toni/skrl_2.0.0_part1
Apr 29, 2026
Merged

Update skrl integration for version 2.0.0#5311
kellyguo11 merged 37 commits into
isaac-sim:mainfrom
Toni-SM:toni/skrl_2.0.0_part1

Conversation

@Toni-SM
Copy link
Copy Markdown
Contributor

@Toni-SM Toni-SM commented Apr 19, 2026

Description

Updates the integration of skrl for the version 2.0.0.
This also solves #5300.

Checklist

  • I have read and understood the contribution guidelines
  • I have run the pre-commit checks with ./isaaclab.sh --format
  • I have made corresponding changes to the documentation
  • My changes generate no new warnings
  • I have added tests that prove my fix is effective or that my feature works
  • I have updated the changelog and the corresponding version in the extension's config/extension.toml file
  • I have added my name to the CONTRIBUTORS.md or my name already exists there

@github-actions github-actions Bot added the isaac-lab Related to Isaac Lab team label Apr 19, 2026
@Toni-SM Toni-SM linked an issue Apr 19, 2026 that may be closed by this pull request
@greptile-apps
Copy link
Copy Markdown
Contributor

greptile-apps Bot commented Apr 19, 2026

Greptile Summary

This PR updates the skrl integration to support version 2.0.0, bumping the minimum version requirement and adapting all API call sites, agent configs, and wrapper logic to the new interface (renamed preprocessor fields, new enable_training_mode / agent.act signatures, warp backend support, and AMP config field renames).

  • P1 — skrl.py line 83: ValueError(...) is constructed but not raised in the else branch; invalid ml_framework values silently fall through, then crash with NameError: name 'wrap_env' is not defined instead of a useful message.
  • P2 — play.py / train.py: The "warp" backend is now handled by SkrlVecEnvWrapper but is not listed in the --ml_framework argparse choices, so it is inaccessible via the CLI.

Confidence Score: 4/5

Safe to merge after fixing the missing raise on ValueError in skrl.py.

One P1 bug (missing raise) causes a misleading runtime crash on any unsupported ml_framework value passed to SkrlVecEnvWrapper. All other changes are straightforward API-rename updates with no logic errors.

source/isaaclab_rl/isaaclab_rl/skrl.py (missing raise on ValueError, line 83)

Important Files Changed

Filename Overview
source/isaaclab_rl/isaaclab_rl/skrl.py Updated ml_framework type hint from 'jax-numpy' to 'warp' and added warp import branch; contains a P1 bug where ValueError is instantiated but not raised in the else branch.
scripts/reinforcement_learning/skrl/play.py Updated to skrl 2.0.0 API: bumped SKRL_VERSION, switched to enable_training_mode, added states tracking via env.state(), and updated agent.act signature; warp is missing from argparse choices despite being supported by the wrapper.
scripts/reinforcement_learning/skrl/train.py Updated SKRL_VERSION to 2.0.0 and --algorithm/--ml_framework arg definitions to match new open-ended algorithm support; functionally correct.
source/isaaclab_rl/setup.py Updated skrl dependency pin from >=1.4.3 to >=2.0.0.
source/isaaclab_tasks/isaaclab_tasks/direct/shadow_hand_over/agents/skrl_mappo_cfg.yaml Renamed shared_state_preprocessor to observation_preprocessor, corrected value model input from OBSERVATIONS to STATES for centralized critic in MAPPO.
source/isaaclab_tasks/isaaclab_tasks/direct/cart_double_pendulum/agents/skrl_ippo_cfg.yaml Renamed state_preprocessor to observation_preprocessor (set state_preprocessor: null), aligned with skrl 2.0.0 API naming.
source/isaaclab_tasks/isaaclab_tasks/direct/humanoid_amp/agents/skrl_dance_amp_cfg.yaml Renamed AMP preprocessor fields (amp_state_preprocessor → amp_observation_preprocessor, task_reward_weight → task_reward_scale, style_reward_weight → style_reward_scale), removed discriminator_reward_scale; style_reward_scale changed from 1.0 to 2.0.
source/isaaclab_tasks/isaaclab_tasks/direct/shadow_hand_over/agents/skrl_ippo_cfg.yaml Same state_preprocessor → observation_preprocessor rename as other IPPO configs.

Sequence Diagram

sequenceDiagram
    participant CLI as play.py / train.py
    participant Wrapper as SkrlVecEnvWrapper
    participant skrl as skrl library
    participant Runner as skrl Runner

    CLI->>Wrapper: SkrlVecEnvWrapper(env, ml_framework)
    alt ml_framework == torch
        Wrapper->>skrl: from skrl.envs.wrappers.torch import wrap_env
    else ml_framework starts with jax
        Wrapper->>skrl: from skrl.envs.wrappers.jax import wrap_env
    else ml_framework == warp
        Wrapper->>skrl: from skrl.envs.wrappers.warp import wrap_env
    else invalid
        Wrapper-->>CLI: ValueError not raised — P1 bug
    end
    Wrapper->>skrl: wrap_env(env, wrapper)
    skrl-->>CLI: wrapped env
    CLI->>Runner: Runner(env, cfg)
    Runner-->>CLI: runner
    Note over CLI: play.py only
    CLI->>CLI: env.reset() → obs
    CLI->>CLI: env.state() → states
    loop simulation
        CLI->>Runner: runner.agent.act(obs, states, ...)
        Runner-->>CLI: outputs
        CLI->>CLI: env.step(actions) → obs
        CLI->>CLI: env.state() → states
    end
Loading

Comments Outside Diff (1)

  1. scripts/reinforcement_learning/skrl/play.py, line 141-142 (link)

    P2 Dead else branch for JAX numpy backend

    The condition "jax" if args_cli.ml_framework == "jax" else "numpy" was originally designed to handle both "jax" and "jax-numpy". Since "jax-numpy" was removed from the CLI choices in this PR, the "numpy" branch is now unreachable. The jax backend will always be set to "jax". Consider simplifying this line, or documenting that the numpy backend is no longer user-accessible. The same pattern is present in train.py.

Reviews (1): Last reviewed commit: "Relax the --algorithm argument" | Re-trigger Greptile

Comment thread source/isaaclab_rl/isaaclab_rl/skrl.py Outdated
Comment thread scripts/reinforcement_learning/skrl/play.py
Copy link
Copy Markdown

@isaaclab-review-bot isaaclab-review-bot Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🤖 Isaac Lab Review Bot

Summary

This PR updates Isaac Lab's skrl integration from 1.4.3 to 2.0.0, covering the breaking API changes. The changes are well-targeted and correct for the files touched: state_preprocessorobservation_preprocessor rename for single-agent configs, set_running_modeenable_training_mode, new states parameter in agent.act(), and AMP reward parameter renames.

Verdict: Changes are correct. One concern about migration scope.


Findings

1. ⚠️ Incomplete state_preprocessorobservation_preprocessor migration across YAML configs

The PR updates 7 out of ~77 skrl YAML configs that use state_preprocessor. In skrl 2.0.0, state_preprocessor now specifically preprocesses MARL global states — not observations. For single-agent envs where states is None, the preprocessor becomes a no-op, meaning observations go unprocessed (no RunningStandardScaler normalization).

This won't crash, but it silently degrades training quality for every single-agent skrl task in Isaac Lab (ant, anymal_c, cartpole, humanoid, G1, H1, go2, franka reach/cabinet, shadow_hand, etc.).

Is this being handled in a follow-up PR?

2. ✅ Pre-existing raise ValueError bug fixed

Good catch — the raise keyword was missing on the framework validation error.

3. 💡 Missing CHANGELOG and version bump

Since the minimum skrl version jumps from 1.4.3 → 2.0.0 (breaking), this warrants:

  • source/isaaclab_rl/docs/CHANGELOG.rst entry
  • Version bump in source/isaaclab_rl/config/extension.toml

4. 💡 jax-numpy removal from CLI choices

The --ml_framework choices drop jax-numpy. Users with scripts using --ml_framework jax-numpy will get argparse errors. Worth a note in the CHANGELOG.

Comment thread source/isaaclab_rl/isaaclab_rl/skrl.py
Copy link
Copy Markdown

@isaaclab-review-bot isaaclab-review-bot Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🤖 Isaac Lab Review Bot

Summary

This PR correctly updates Isaac Lab's skrl integration from 1.4.3 to 2.0.0, addressing the breaking API changes. The core changes are well-implemented:

  • set_running_mode("eval")enable_training_mode(False, apply_to_models=True)
  • ✅ New states parameter in agent.act() and env.state() calls
  • ✅ AMP config renames: state_preprocessorobservation_preprocessor, amp_state_preprocessoramp_observation_preprocessor, task_reward_weight/style_reward_weighttask_reward_scale/style_reward_scale
  • ✅ MAPPO critic input OBSERVATIONSSTATES
  • ✅ Missing raise keyword on ValueError fixed
  • ✅ Minimum skrl version updated to 2.0.0

Verdict: Approve with minor suggestions

The implementation is correct. No bugs or logic errors. Two process items to consider:


Findings

1. 💡 Partial config migration (by design?)

The PR updates 7 YAML configs (3 AMP, 2 IPPO, 2 MAPPO), but leaves ~67 other skrl PPO configs unchanged. For example, skrl_ppo_cfg.yaml in the same directories (cart_double_pendulum, shadow_hand_over) still have only state_preprocessor: RunningStandardScaler without observation_preprocessor.

Impact: In skrl 2.0.0, state_preprocessor processes MARL global states (privileged observations), not regular observations. For single-agent envs where states=None, this becomes a no-op, meaning observations go un-normalized. This silently degrades training quality.

Question: Is there a follow-up PR planned (hence the branch name toni/skrl_2.0.0_part1)? If so, this is fine as-is. If not, the remaining PPO configs should be migrated.

2. 💡 Missing CHANGELOG.rst and version bump

Since the minimum skrl version changes from 1.4.3 → 2.0.0 (a breaking change for users), this warrants:

  • An entry in source/isaaclab_rl/docs/CHANGELOG.rst
  • A version bump in source/isaaclab_rl/config/extension.toml

The PR checklist shows these items are unchecked.

3. 📝 jax-numpy removal

The --ml_framework CLI choices drop jax-numpy. Users with existing scripts using this option will get argparse errors. Consider noting this in the CHANGELOG.


Code Quality

The implementation follows established patterns and is consistent with the codebase. No additional inline comments needed - the changes are straightforward and correct.

@github-actions github-actions Bot added the documentation Improvements or additions to documentation label Apr 20, 2026
@kellyguo11 kellyguo11 moved this to In review in Isaac Lab Apr 20, 2026
Comment thread source/isaaclab_tasks/docs/CHANGELOG.rst Outdated
Signed-off-by: Kelly Guo <kellyg@nvidia.com>
@isaaclab-review-bot
Copy link
Copy Markdown

Doc Build Failure Analysis

The Build Latest Docs CI failure appears to be caused by this PR.

Root Cause

The PR bumps skrl>=1.4.3skrl>=2.0.0. When Sphinx imports isaaclab_rl.skrl during doc generation, it pulls in skrl 2.0.0, which uses Python X | Y type union syntax at runtime:

TypeError: unsupported operand type(s) for |: 'typing.Any' and 'types.GenericAlias'

This syntax requires Python 3.10+, but the doc build environment appears to use an earlier version. The import failure cascades to every module that transitively imports from isaaclab_rl:

  • isaaclab.envs
  • isaaclab_tasks
  • isaaclab_rl.skrl, isaaclab_rl.rl_games, isaaclab_rl.sb3
  • isaaclab_mimic.envs, isaaclab_mimic.datagen

Suggested Fixes

  1. Add skrl to autodoc_mock_imports in docs/conf.py — this prevents Sphinx from actually importing skrl during doc generation (similar to how gymnasium was recently added in Adds presets to environment docs and fix doc build issues #5360)
  2. Alternatively, check if skrl 2.0.0 is missing from __future__ import annotations in the offending module — if so, this may need to be fixed upstream in skrl
  3. Long-term: ensure the doc build CI uses Python 3.10+

@Toni-SM — could you check if skrl 2.0.0 has from __future__ import annotations in all modules? The | syntax works fine at type-check time but breaks at runtime on Python <3.10.

Copy link
Copy Markdown

@isaaclab-review-bot isaaclab-review-bot Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🤖 Isaac Lab Review Bot

Summary

All previous concerns have been addressed in this update. The PR now includes the CHANGELOG entries and version bumps for both isaaclab_rl (0.5.0 → 0.5.1) and isaaclab_tasks (0.11.14 → 0.11.16).

Implementation Verdict

Ship it ✅

Findings

No new issues. The follow-up commits correctly added:

  • source/isaaclab_rl/docs/CHANGELOG.rst entry for 0.5.1
  • source/isaaclab_tasks/docs/CHANGELOG.rst entry for 0.11.16
  • Version bumps in both extension.toml files

The partial config migration (only multi-agent configs updated) appears intentional given the branch name part1, which is acceptable for incremental rollout.

@kellyguo11 kellyguo11 merged commit d94504b into isaac-sim:main Apr 29, 2026
10 checks passed
@github-project-automation github-project-automation Bot moved this from In review to Done in Isaac Lab Apr 29, 2026
@kellyguo11 kellyguo11 mentioned this pull request Apr 29, 2026
7 tasks
kellyguo11 added a commit that referenced this pull request Apr 30, 2026
# Description

* #5311 - Update skrl
integration for version 2.0.0

## Type of change

<!-- As you go through the list, delete the ones that are not
applicable. -->

- Bug fix (non-breaking change which fixes an issue)
- New feature (non-breaking change which adds functionality)
- Breaking change (existing functionality will not work without user
modification)
- Documentation update


## Checklist

- [x] I have read and understood the [contribution
guidelines](https://isaac-sim.github.io/IsaacLab/main/source/refs/contributing.html)
- [x] I have run the [`pre-commit` checks](https://pre-commit.com/) with
`./isaaclab.sh --format`
- [x] I have made corresponding changes to the documentation
- [x] My changes generate no new warnings
- [ ] I have added tests that prove my fix is effective or that my
feature works
- [ ] I have updated the changelog and the corresponding version in the
extension's `config/extension.toml` file
- [ ] I have added my name to the `CONTRIBUTORS.md` or my name already
exists there

<!--
As you go through the checklist above, you can mark something as done by
putting an x character in it

For example,
- [x] I have done this task
- [ ] I have not done this task
-->

---------

Signed-off-by: Antoine RICHARD <antoiner@nvidia.com>
Signed-off-by: bxwang <bixiong.wang@x-humanoid.com>
Signed-off-by: bixiong wang <wangbx02@126.com>
Signed-off-by: Kelly Guo <kellyg@nvidia.com>
Co-authored-by: nv-caelan <cgarrett@nvidia.com>
Co-authored-by: Clemens Schwarke <96480707+ClemensSchwarke@users.noreply.github.com>
Co-authored-by: Kelly Guo <kellyguo11@users.noreply.github.com>
Co-authored-by: Antoine RICHARD <antoiner@nvidia.com>
Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
Co-authored-by: bixiong wang <wangbx02@126.com>
Co-authored-by: bxwang <bixiong.wang@x-humanoid.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: OpenClaw <openclaw@users.noreply.github.com>
Co-authored-by: Toni-SM <aserranomuno@nvidia.com>
AntoineRichard added a commit to AntoineRichard/IsaacLab that referenced this pull request Apr 30, 2026
DirectMARLEnv envs (cart_double_pendulum, shadow_hand_over) crashed
out of the gate on benchmark_startup:

    AttributeError: 'CartDoublePendulumEnv' object has no attribute
                    'single_action_space'

Multi-agent envs expose ``action_spaces`` (plural, dict keyed by
agent id) instead of ``single_action_space``. The first-step random
action sample needs the matching shape — a per-agent dict of stacked
tensors — and ``env.step`` accepts that shape directly.

Extract the action sampler to ``_action_sampling.py`` so it has unit
tests covering both single-agent and multi-agent shapes (gym wrappers,
heterogeneous per-agent action dims, in-bounds samples, per-env
independence). Discriminator is a duck-type on
``unwrapped.action_spaces`` so the helper doesn't need to import
isaaclab — keeps tests fast.

benchmark_skrl.py also gets a small auto-promotion: if the user
leaves the default ``--algorithm ppo`` AND the task is multi-agent,
silently switch to ``ippo``. Plain PPO can't train a DirectMARLEnv
(its observation preprocessor and policy assume Tensor obs, not
dict), and the per-task yaml entry points already advertise
``skrl_ippo_cfg_entry_point`` for these tasks. Explicit
``--algorithm mappo`` etc. is honoured as-is.

Together with the prior ``state_preprocessor → observation_preprocessor``
yaml rename (cherry-pick from PR isaac-sim#5311), this unblocks
cart_double_pendulum and shadow_hand_over end-to-end on the
benchmark suite.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

documentation Improvements or additions to documentation isaac-lab Related to Isaac Lab team

Projects

Status: Done

Development

Successfully merging this pull request may close these issues.

[Question] 'PPO' object has no attribute 'set_running_mode'

2 participants