Enhance manager docs for action, reward and dataset #176
Merged
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,97 @@ | ||
| # Action Functors | ||
|
|
||
| ```{currentmodule} embodichain.lab.gym.envs.managers | ||
| ``` | ||
|
|
||
| This page lists all available action terms that can be used with the Action Manager. Action terms are configured using {class}`~cfg.ActionTermCfg` and are responsible for processing raw actions from the policy and converting them to the format expected by the robot (e.g., qpos, qvel, qf). | ||
|
|
||
| ## Joint Position Control | ||
|
|
||
| ```{list-table} Joint Position Action Terms | ||
| :header-rows: 1 | ||
| :widths: 30 70 | ||
|
|
||
| * - Action Term | ||
| - Description | ||
| * - ``DeltaQposTerm`` | ||
| - Delta joint position action: current_qpos + scale * action -> qpos. The policy outputs position deltas relative to the current joint positions. | ||
| * - ``QposTerm`` | ||
| - Absolute joint position action: scale * action -> qpos. The policy outputs direct target joint positions. | ||
| * - ``QposNormalizedTerm`` | ||
| - Normalized action in [-1, 1] -> denormalize to joint limits -> qpos. The policy outputs normalized values that are mapped to joint limits. With scale=1.0 (default), action in [-1, 1] maps to [low, high]. | ||
| ``` | ||
|
|
||
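The three joint position mappings in the table above can be sketched in plain Python. This is a minimal illustration, not the EmbodiChain implementation: the function names are hypothetical, and the way `scale` interacts with denormalization is an assumption (the description only states the behavior for `scale=1.0`).

```python
from typing import List, Sequence


def delta_qpos(current_qpos: Sequence[float], action: Sequence[float],
               scale: float = 1.0) -> List[float]:
    """Delta control: current_qpos + scale * action -> qpos."""
    return [q + scale * a for q, a in zip(current_qpos, action)]


def absolute_qpos(action: Sequence[float], scale: float = 1.0) -> List[float]:
    """Absolute control: scale * action -> qpos."""
    return [scale * a for a in action]


def normalized_qpos(action: Sequence[float], low: Sequence[float],
                    high: Sequence[float], scale: float = 1.0) -> List[float]:
    """Normalized control: map scale * action from [-1, 1] onto [low, high].

    Assumption: scale is applied to the action before denormalization.
    """
    return [
        lo + (scale * a + 1.0) * 0.5 * (hi - lo)
        for a, lo, hi in zip(action, low, high)
    ]
```

With `scale=1.0`, an action of -1 lands on the lower joint limit and an action of 1 on the upper limit, matching the table's description of ``QposNormalizedTerm``.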
| ## End-Effector Control | ||
|
|
||
| ```{list-table} End-Effector Action Terms | ||
| :header-rows: 1 | ||
| :widths: 30 70 | ||
|
|
||
| * - Action Term | ||
| - Description | ||
| * - ``EefPoseTerm`` | ||
| - End-effector pose (6D or 7D) -> IK -> qpos. The policy outputs target end-effector poses, which are converted to joint positions via inverse kinematics. Returns ``ik_success`` in the output so that rewards and observations can penalize or condition on IK failures. Supports both 6D (Euler angles) and 7D (quaternion) pose representations. | ||
| ``` | ||
|
|
||
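To make the relationship between the 6D and 7D pose representations concrete, here is a small conversion sketch. The conventions are assumptions, since the page does not state which ones ``EefPoseTerm`` uses: Euler angles are taken as roll, pitch, yaw (rotations about x, y, z, composed as yaw-pitch-roll), and the quaternion is ordered `(w, x, y, z)`. The function name is hypothetical.

```python
import math
from typing import List, Sequence


def pose6d_to_pose7d(pose6d: Sequence[float]) -> List[float]:
    """Convert [x, y, z, roll, pitch, yaw] to [x, y, z, qw, qx, qy, qz].

    Assumed conventions: angles in radians, yaw-pitch-roll composition,
    quaternion ordered (w, x, y, z).
    """
    if len(pose6d) != 6:
        raise ValueError("expected 6 values: position plus Euler angles")
    x, y, z, roll, pitch, yaw = pose6d
    cr, sr = math.cos(roll / 2.0), math.sin(roll / 2.0)
    cp, sp = math.cos(pitch / 2.0), math.sin(pitch / 2.0)
    cy, sy = math.cos(yaw / 2.0), math.sin(yaw / 2.0)
    qw = cr * cp * cy + sr * sp * sy
    qx = sr * cp * cy - cr * sp * sy
    qy = cr * sp * cy + sr * cp * sy
    qz = cr * cp * sy - sr * sp * cy
    return [x, y, z, qw, qx, qy, qz]
```

A zero rotation yields the identity quaternion, and the result always has unit norm, which is a quick sanity check when switching a policy between the two pose dimensions.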
| ## Velocity and Force Control | ||
|
|
||
| ```{list-table} Velocity and Force Action Terms | ||
| :header-rows: 1 | ||
| :widths: 30 70 | ||
|
|
||
| * - Action Term | ||
| - Description | ||
| * - ``QvelTerm`` | ||
| - Joint velocity action: scale * action -> qvel. The policy outputs target joint velocities. | ||
| * - ``QfTerm`` | ||
| - Joint force/torque action: scale * action -> qf. The policy outputs target joint torques/forces. | ||
| ``` | ||
|
|
||
| ## Usage Example | ||
|
|
||
| ```python | ||
| from embodichain.lab.gym.envs.managers.cfg import ActionTermCfg | ||
|
|
||
| # Example: Delta joint position control | ||
| actions = { | ||
|     "joint_position": ActionTermCfg( | ||
|         func="embodichain.lab.gym.envs.managers.action_manager.DeltaQposTerm", | ||
|         params={ | ||
|             "scale": 0.1,  # Scale factor for action deltas | ||
|         }, | ||
|     ), | ||
| } | ||
|
|
||
| # Example: Normalized joint position control | ||
| actions = { | ||
|     "normalized_joint_position": ActionTermCfg( | ||
|         func="embodichain.lab.gym.envs.managers.action_manager.QposNormalizedTerm", | ||
|         params={ | ||
|             "scale": 1.0,  # Full joint range utilization | ||
|         }, | ||
|     ), | ||
| } | ||
|
|
||
| # Example: End-effector pose control | ||
| actions = { | ||
|     "eef_pose": ActionTermCfg( | ||
|         func="embodichain.lab.gym.envs.managers.action_manager.EefPoseTerm", | ||
|         params={ | ||
|             "scale": 0.1, | ||
|             "pose_dim": 7,  # 7D (position + quaternion) | ||
|         }, | ||
|     ), | ||
| } | ||
| ``` | ||
|
|
||
| ## Action Term Properties | ||
|
|
||
| All action terms expose the following property and method: | ||
|
|
||
| - ``action_dim``: The dimension of the action space (number of values the policy should output) | ||
| - ``process_action(action)``: Method to convert raw policy output to robot control format | ||
|
|
||
| The Action Manager also provides: | ||
|
|
||
| - ``total_action_dim``: Total dimension of all action terms combined | ||
| - ``action_type``: The active action type (term name) for backward compatibility | ||
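
The relationship between per-term ``action_dim`` and the manager's ``total_action_dim`` can be illustrated with toy classes. These are hypothetical stand-ins written for this example, not the EmbodiChain classes; the `output_key` attribute and the constructor signatures are assumptions.

```python
from typing import Dict, List, Sequence


class ToyScaledTerm:
    """Hypothetical stand-in for an action term (not an EmbodiChain class)."""

    def __init__(self, output_key: str, action_dim: int, scale: float = 1.0) -> None:
        self.output_key = output_key  # e.g. "qpos", "qvel", or "qf"
        self._action_dim = action_dim
        self.scale = scale

    @property
    def action_dim(self) -> int:
        """Number of values the policy should output for this term."""
        return self._action_dim

    def process_action(self, action: Sequence[float]) -> Dict[str, List[float]]:
        """Convert raw policy output to a control dict: scale * action."""
        if len(action) != self.action_dim:
            raise ValueError(f"expected {self.action_dim} values, got {len(action)}")
        return {self.output_key: [self.scale * a for a in action]}


class ToyActionManager:
    """Hypothetical stand-in that only aggregates term dimensions."""

    def __init__(self, terms: Dict[str, ToyScaledTerm]) -> None:
        self.terms = terms

    @property
    def total_action_dim(self) -> int:
        """Sum of the action dimensions of all configured terms."""
        return sum(term.action_dim for term in self.terms.values())
```

In this sketch a policy head would be sized from `total_action_dim`, while each term validates and converts only its own slice of the action.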
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,123 @@ | ||
| # Dataset Functors | ||
|
|
||
| ```{currentmodule} embodichain.lab.gym.envs.managers | ||
| ``` | ||
|
|
||
| This page lists all available dataset functors that can be used with the Dataset Manager. Dataset functors are configured using {class}`~cfg.DatasetFunctorCfg` and are responsible for collecting and saving episode data during environment interaction. | ||
|
|
||
| ## Recording Functors | ||
|
|
||
| ```{list-table} Dataset Recording Functors | ||
| :header-rows: 1 | ||
| :widths: 30 70 | ||
|
|
||
| * - Functor Name | ||
| - Description | ||
| * - ``LeRobotRecorder`` | ||
| - Records episodes in the LeRobot dataset format. Handles observation-action pair recording, format conversion, and episode saving. Requires the LeRobot package to be installed. | ||
| ``` | ||
|
|
||
| ## LeRobotRecorder | ||
|
|
||
| The ``LeRobotRecorder`` functor enables recording robot learning episodes in the LeRobot dataset format, which can be used for training with LeRobot's imitation learning algorithms. | ||
|
|
||
| ### Features | ||
|
|
||
| - Records observation-action pairs during episodes | ||
| - Converts data to LeRobot format automatically | ||
| - Saves episodes when they complete | ||
| - Supports vision sensors (camera images) | ||
| - Supports robot state (qpos, qvel, qf) | ||
| - Supports custom observation features | ||
| - Auto-incrementing dataset naming | ||
|
|
||
| ### Parameters | ||
|
|
||
| ```{list-table} LeRobotRecorder Parameters | ||
| :header-rows: 1 | ||
| :widths: 30 70 | ||
|
|
||
| * - Parameter | ||
| - Description | ||
| * - ``save_path`` | ||
| - Root directory for saving datasets. Defaults to EmbodiChain's default dataset root. | ||
| * - ``robot_meta`` | ||
| - Robot metadata for dataset (robot_type, control_freq, etc.) | ||
| * - ``instruction`` | ||
| - Optional task instruction (e.g., {"lang": "pick the cube"}) | ||
| * - ``extra`` | ||
| - Optional extra metadata (scene_type, task_description, episode_info) | ||
| * - ``use_videos`` | ||
| - Whether to save videos (True) or images (False). Default: False. | ||
| * - ``image_writer_threads`` | ||
| - Number of threads for image writing | ||
| * - ``image_writer_processes`` | ||
| - Number of processes for image writing | ||
| ``` | ||
|
|
||
| ### Recorded Data | ||
|
|
||
| The LeRobotRecorder saves the following data for each frame: | ||
|
|
||
| - ``observation.qpos``: Joint positions | ||
| - ``observation.qvel``: Joint velocities | ||
| - ``observation.qf``: Joint forces/torques | ||
| - ``action``: Applied action | ||
| - ``{sensor_name}.color``: Camera images (if sensors present) | ||
| - ``{sensor_name}.color_right``: Right camera images (for stereo cameras) | ||
|
|
||
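As a quick way to reason about what a recorded frame contains, the key naming above can be sketched as a helper that lists the expected keys for a given sensor setup. The function and the sensor name `head_cam` are hypothetical and only mirror the naming pattern listed above; the recorder's actual feature schema may differ.

```python
from typing import Iterable, List


def expected_frame_keys(sensor_names: Iterable[str],
                        stereo_sensors: Iterable[str] = ()) -> List[str]:
    """List the per-frame keys described above for the given sensors.

    Assumption: every sensor contributes `<name>.color`, and stereo
    sensors additionally contribute `<name>.color_right`.
    """
    keys = ["observation.qpos", "observation.qvel", "observation.qf", "action"]
    stereo = set(stereo_sensors)
    for name in sensor_names:
        keys.append(f"{name}.color")
        if name in stereo:
            keys.append(f"{name}.color_right")
    return keys
```

For example, `expected_frame_keys(["head_cam"], stereo_sensors=["head_cam"])` adds both `head_cam.color` and `head_cam.color_right` to the four robot state and action keys.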
| ## Usage Example | ||
|
|
||
| ```python | ||
| from embodichain.lab.gym.envs.managers.cfg import DatasetFunctorCfg | ||
|
|
||
| # Example: Record episodes in LeRobot format | ||
| dataset = { | ||
|     "lerobot_recorder": DatasetFunctorCfg( | ||
|         func="embodichain.lab.gym.envs.managers.datasets.LeRobotRecorder", | ||
|         params={ | ||
|             "save_path": "/path/to/dataset/root", | ||
|             "robot_meta": { | ||
|                 "robot_type": "dexforce_w1", | ||
|                 "control_freq": 30, | ||
|             }, | ||
|             "instruction": { | ||
|                 "lang": "pick the cube and place it on the target", | ||
|             }, | ||
|             "extra": { | ||
|                 "scene_type": "table", | ||
|                 "task_description": "pick_and_place", | ||
|                 "episode_info": { | ||
|                     "rigid_object_physics_attributes": ["mass"], | ||
|                 }, | ||
|             }, | ||
|             "use_videos": False, | ||
|         }, | ||
|     ), | ||
| } | ||
| ``` | ||
|
|
||
| ### Recording Workflow | ||
|
|
||
| 1. **Initialization**: The Dataset Manager initializes the functor with the configured parameters | ||
| 2. **Data Collection**: During episode rollout, the functor receives observations and actions | ||
| 3. **Save Trigger**: When an episode completes, apply the Dataset Manager with `mode="save"` for the finished environment IDs | ||
| 4. **Finalization**: After all episodes, apply the Dataset Manager with `mode="finalize"` to save any remaining data | ||
|
|
||
| ```python | ||
| # Inside environment loop | ||
| if episode_done: | ||
|     dataset_manager.apply(mode="save", env_ids=completed_env_ids) | ||
|
|
||
| # After all episodes complete | ||
| dataset_manager.apply(mode="finalize") | ||
| ``` | ||
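
The save-then-finalize pattern can be exercised end to end with a stub. The sketch below is only an illustration under stated assumptions: `StubDatasetManager` and `run_recording` are hypothetical names, the stub merely logs calls, and the `apply(mode=..., env_ids=...)` signature is inferred from the snippet above rather than from the real `DatasetManager`.

```python
from typing import List, Optional, Sequence, Tuple


class StubDatasetManager:
    """Hypothetical stand-in that only logs calls; not the real DatasetManager."""

    def __init__(self) -> None:
        self.calls: List[Tuple[str, Optional[Tuple[int, ...]]]] = []

    def apply(self, mode: str, env_ids: Optional[Sequence[int]] = None) -> None:
        if mode not in ("save", "finalize"):
            raise ValueError(f"unsupported mode: {mode}")
        self.calls.append((mode, None if env_ids is None else tuple(env_ids)))


def run_recording(manager: StubDatasetManager,
                  done_per_step: Sequence[Sequence[int]]) -> None:
    """done_per_step[t] lists the env ids whose episode ended at step t."""
    for completed_env_ids in done_per_step:
        if completed_env_ids:  # at least one episode finished this step
            manager.apply(mode="save", env_ids=completed_env_ids)
    # Flush whatever is left once all episodes have been processed.
    manager.apply(mode="finalize")
```

Saving only the environments that just finished keeps partially completed episodes in other parallel environments untouched until they end.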
|
|
||
| ## Dataset Manager Modes | ||
|
|
||
| The Dataset Manager supports the following modes: | ||
|
|
||
| - ``save``: Save completed episodes for specified environment IDs | ||
| - ``finalize``: Finalize the dataset and save any remaining data | ||
|
|
||
| See {class}`~managers.dataset_manager.DatasetManager` for more details. | ||
|
|
||