Implemented an action manager to process action and replace RLEnv by EmbodiEnv #164
yangchen73 merged 13 commits into main
Pull request overview
This PR refactors the action handling in the EmbodiChain RL stack by introducing a modular ActionManager (similar to IsaacLab) to replace the deleted RLEnv base class. It also includes a bug fix that moves _elapsed_steps increment to occur before the episode truncation check.
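The ordering fix can be illustrated with a small, self-contained sketch. This is not the EmbodiChain implementation: `ToyEnv`, `increment_first`, and `steps_until_truncation` are hypothetical names, and the `>=` comparison is an assumption about how the truncation check is written. Only `_elapsed_steps` and `max_episode_steps` come from the PR.

```python
class ToyEnv:
    """Hypothetical stand-in for an env with a step counter (not EmbodiChain code)."""

    def __init__(self, max_episode_steps: int, increment_first: bool):
        self.max_episode_steps = max_episode_steps
        self.increment_first = increment_first
        self._elapsed_steps = 0

    def step(self) -> bool:
        """Advance one step and return whether the episode is truncated."""
        if self.increment_first:
            # Ordering described in this PR: count the step, then check.
            self._elapsed_steps += 1
            return self._elapsed_steps >= self.max_episode_steps
        # Assumed previous ordering: check against the stale count, then count.
        truncated = self._elapsed_steps >= self.max_episode_steps
        self._elapsed_steps += 1
        return truncated


def steps_until_truncation(env: ToyEnv) -> int:
    """Count how many step() calls are made until truncation is reported."""
    calls = 0
    while True:
        calls += 1
        if env.step():
            return calls


fixed = steps_until_truncation(ToyEnv(max_episode_steps=3, increment_first=True))
stale = steps_until_truncation(ToyEnv(max_episode_steps=3, increment_first=False))
print(fixed, stale)
```

Under these assumptions, checking against the stale counter lets the episode run one step past `max_episode_steps`, which is the off-by-one behavior the overview refers to.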
Changes:
- New `ActionManager` + concrete `ActionTerm` subclasses (`DeltaQposTerm`, `QposTerm`, `QposNormalizedTerm`, `EefPoseTerm`, `QvelTerm`, `QfTerm`) added to the manager system
- `RLEnv` deleted; its `compute_task_state`, `get_info`, `evaluate`, and `_preprocess_action` methods moved into `EmbodiedEnv`; all RL task environments updated to inherit from `EmbodiedEnv` directly
- `_elapsed_steps` increment moved before the `max_episode_steps` truncation check in `BaseEnv.step()` to fix off-by-one episode boundary behavior
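As a rough picture of the manager/term split in the first bullet, here is a minimal sketch in plain Python. It is not the API added by this PR: the `Toy*` classes, the `scale` parameter, and the list-based signatures are hypothetical (the real terms operate on `torch.Tensor`), and the semantics assumed here are that a "delta qpos" term adds a scaled action to the current joint positions while a "qpos" term treats the action as an absolute target.

```python
from typing import Dict, List


class ToyActionTerm:
    """Hypothetical base class: maps a raw policy output to a joint target."""

    def process_action(self, action: List[float], current_qpos: List[float]) -> List[float]:
        raise NotImplementedError


class ToyQposTerm(ToyActionTerm):
    """Assumed semantics: the action is already an absolute joint target."""

    def process_action(self, action, current_qpos):
        return list(action)


class ToyDeltaQposTerm(ToyActionTerm):
    """Assumed semantics: the action is an offset added to the current joints."""

    def __init__(self, scale: float = 1.0):
        self.scale = scale  # hypothetical scaling factor

    def process_action(self, action, current_qpos):
        return [q + self.scale * a for q, a in zip(current_qpos, action)]


class ToyActionManager:
    """Hypothetical manager: selects one term by name and delegates to it."""

    def __init__(self, term_name: str, terms: Dict[str, ToyActionTerm]):
        self.action_type = term_name
        self._term = terms[term_name]

    def process_action(self, action, current_qpos):
        return self._term.process_action(action, current_qpos)


terms = {"qpos": ToyQposTerm(), "delta_qpos": ToyDeltaQposTerm(scale=0.5)}
manager = ToyActionManager("delta_qpos", terms)
target = manager.process_action([1.0, -1.0], current_qpos=[0.0, 2.0])
print(manager.action_type, target)
```

The point of the sketch is only that the environment calls one `process_action` entry point while the choice of action space lives in a swappable term.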
Reviewed changes
Copilot reviewed 19 out of 19 changed files in this pull request and generated 8 comments.
| File | Description |
|---|---|
| `embodichain/lab/gym/envs/managers/action_manager.py` | New file: `ActionManager` and `ActionTerm` base + concrete implementations |
| `embodichain/lab/gym/envs/managers/cfg.py` | Adds `ActionTermCfg` dataclass |
| `embodichain/lab/gym/envs/managers/__init__.py` | Exports new action manager classes |
| `embodichain/lab/gym/envs/embodied_env.py` | Integrates `ActionManager`; moves RL utility methods from deleted `RLEnv` |
| `embodichain/lab/gym/envs/base_env.py` | Fixes `_elapsed_steps` increment ordering |
| `embodichain/lab/gym/envs/rl_env.py` | Deleted: entire file removed |
| `embodichain/lab/gym/envs/__init__.py` | Removes `RLEnv` imports |
| `embodichain/lab/gym/envs/tasks/rl/push_cube.py` | Updates to inherit from `EmbodiedEnv` |
| `embodichain/lab/gym/envs/tasks/rl/basic/cart_pole.py` | Updates to inherit from `EmbodiedEnv` |
| `embodichain/lab/gym/utils/gym_utils.py` | Adds `actions` config parsing in `config_to_cfg` |
| `embodichain/agents/rl/algo/ppo.py` | Uses `action_manager.action_type` when available |
| `embodichain/agents/rl/algo/grpo.py` | Uses `action_manager.action_type` when available |
| `embodichain/agents/rl/utils/trainer.py` | Uses `action_manager.action_type` when available |
| `configs/agents/rl/push_cube/gym_config.json` | Migrates action config to `actions` section |
| `configs/agents/rl/basic/cart_pole/gym_config.json` | Migrates action config to `actions` section |
| `docs/source/tutorial/rl.rst` | Updates docs for new Action Manager API |
| `docs/source/overview/gym/env.md` | Updates docs for new Action Manager API |
| `CONTRIBUTING.md` / `CLAUDE.md` | Updates project structure descriptions |
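The "uses `action_manager.action_type` when available" entries for `ppo.py`, `grpo.py`, and `trainer.py` suggest an optional-attribute lookup. One way this could look is sketched below; `resolve_action_type` and its `default` value are hypothetical, and the actual fallback used in the PR is not shown in this summary.

```python
from types import SimpleNamespace


def resolve_action_type(env, default: str = "delta_qpos") -> str:
    """Prefer env.action_manager.action_type; fall back to a default otherwise.

    Hypothetical helper: the fallback value is an assumption, not the PR's code.
    """
    manager = getattr(env, "action_manager", None)
    if manager is not None:
        return manager.action_type
    return default


# Stand-in objects for an env with and without an action manager.
env_with_manager = SimpleNamespace(action_manager=SimpleNamespace(action_type="qpos"))
env_without_manager = SimpleNamespace()

print(resolve_action_type(env_with_manager))
print(resolve_action_type(env_without_manager))
```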
Pull request overview
Copilot reviewed 21 out of 21 changed files in this pull request and generated 5 comments.
Pull request overview
Copilot reviewed 21 out of 21 changed files in this pull request and generated no new comments.
        """Dimension of the action term (policy output dimension)."""
        raise NotImplementedError

    def process_action(self, action: torch.Tensor) -> EnvAction:
Add `abstractmethod` to this method so that inheriting classes are forced to implement it.
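A minimal sketch of what this suggestion means in practice, using only the standard-library `abc` module. The class names here are hypothetical and the signatures are simplified (plain lists instead of `torch.Tensor`); only `action_dim`-style properties and `process_action` mirror the snippet under review.

```python
from abc import ABC, abstractmethod


class ToyActionTerm(ABC):
    """Hypothetical base class; subclasses must implement both members."""

    @property
    @abstractmethod
    def action_dim(self) -> int:
        """Dimension of the action term (policy output dimension)."""

    @abstractmethod
    def process_action(self, action):
        """Convert a raw policy output into an environment action."""


class IncompleteTerm(ToyActionTerm):
    """Forgets process_action, so it cannot be instantiated."""

    @property
    def action_dim(self) -> int:
        return 2


class CompleteTerm(IncompleteTerm):
    def process_action(self, action):
        return list(action)


try:
    IncompleteTerm()
    instantiation_failed = False
except TypeError:
    # ABCs refuse to instantiate classes with unimplemented abstract methods.
    instantiation_failed = True

print(instantiation_failed, CompleteTerm().process_action([0.1, 0.2]))
```

Compared with `raise NotImplementedError`, the missing override is caught when the object is constructed rather than when the method is first called.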
            action_to_store.to(buffer_device), non_blocking=True
        )
    else:
        logger.log_error(
`log_error` raises an exception, so the program will stop. If you want the program to keep running, use `log_warning` instead; otherwise, keep using `log_error`.
Okay, I've changed it to log_error instead of stopping the program.
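The distinction discussed in this thread can be sketched with the standard `logging` module. The two helpers below are hypothetical stand-ins that mimic the behavior the reviewer describes (an error logger that raises, a warning logger that does not); they are not the EmbodiChain logger, and `store_action` with its `None` check is an invented example condition.

```python
import logging

_logger = logging.getLogger("toy")


def log_warning(message: str) -> None:
    """Hypothetical helper: record the problem and let the caller continue."""
    _logger.warning(message)


def log_error(message: str, error_type=RuntimeError) -> None:
    """Hypothetical helper: record the problem, then raise to stop the caller."""
    _logger.error(message)
    raise error_type(message)


def store_action(buffer: list, action, strict: bool) -> bool:
    """Append an action to the buffer; handle a missing action per `strict`."""
    if action is None:
        if strict:
            log_error("No action to store.")  # raises, never returns
        log_warning("No action to store; skipping.")
        return False
    buffer.append(action)
    return True


buffer = []
skipped = store_action(buffer, None, strict=False)
try:
    store_action(buffer, None, strict=True)
    raised = False
except RuntimeError:
    raised = True
print(skipped, raised, buffer)
```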
Description
Increment `_elapsed_steps` immediately after `_post_step` so that the `max_episode_steps` truncation check uses the newest step count and ends the episode at the correct step.
Type of change
Checklist
`black .` command to format the code base.