**docs/source/overview/gym/action_functors.md** (new file, 97 additions)
# Action Functors

```{currentmodule} embodichain.lab.gym.envs.managers
```

This page lists all available action terms that can be used with the Action Manager. Action terms are configured using {class}`~cfg.ActionTermCfg` and are responsible for processing raw actions from the policy and converting them to the format expected by the robot (e.g., qpos, qvel, qf).

## Joint Position Control

```{list-table} Joint Position Action Terms
:header-rows: 1
:widths: 30 70

* - Action Term
  - Description
* - ``DeltaQposTerm``
  - Delta joint position action: ``current_qpos + scale * action -> qpos``. The policy outputs position deltas relative to the current joint positions.
* - ``QposTerm``
  - Absolute joint position action: ``scale * action -> qpos``. The policy outputs direct target joint positions.
* - ``QposNormalizedTerm``
  - Normalized action in ``[-1, 1]`` -> denormalize to joint limits -> qpos. The policy outputs normalized values that are mapped to the joint limits. With ``scale=1.0`` (default), an action in ``[-1, 1]`` maps to ``[low, high]``.
```
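
The three mappings above can be sketched in plain NumPy. This is an illustrative sketch, not the library's implementation; ``scale``, ``low``, and ``high`` stand in for the configured scale factor and per-joint limits, and the clipping step is an assumption of the sketch:

```python
import numpy as np

def delta_qpos(current_qpos, action, scale):
    # DeltaQposTerm: offset the current joint positions by a scaled delta.
    return current_qpos + scale * action

def qpos(action, scale):
    # QposTerm: the scaled action is used directly as the target position.
    return scale * action

def qpos_normalized(action, low, high, scale=1.0):
    # QposNormalizedTerm: map [-1, 1] to the joint limits [low, high].
    # Clipping out-of-range policy outputs is an assumption of this sketch.
    a = np.clip(scale * action, -1.0, 1.0)
    return low + 0.5 * (a + 1.0) * (high - low)

low, high = np.array([-1.0, 0.0]), np.array([1.0, 2.0])
print(qpos_normalized(np.array([0.0, 1.0]), low, high))  # [0. 2.]
```

With the default ``scale=1.0``, an action of ``0.0`` lands at the midpoint of each joint's range and ``1.0`` at its upper limit.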

## End-Effector Control

```{list-table} End-Effector Action Terms
:header-rows: 1
:widths: 30 70

* - Action Term
  - Description
* - ``EefPoseTerm``
  - End-effector pose (6D or 7D) -> IK -> qpos. The policy outputs target end-effector poses, which are converted to joint positions via inverse kinematics. Returns ``ik_success`` in the output so reward/observation terms can penalize or condition on IK failures. Supports both 6D (euler angles) and 7D (quaternion) pose representations.
```
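
Before the IK step, the flat pose action has to be split into translation and orientation. A minimal sketch of that split follows; the function name and the exact layout are assumptions here, not the library's API:

```python
import numpy as np

def split_pose_action(action, pose_dim=7):
    # Assumed 7D layout: (x, y, z) + quaternion; 6D: (x, y, z) + euler angles.
    position = action[:3]
    orientation = action[3:pose_dim]
    if pose_dim == 7:
        # Re-normalize so downstream IK receives a valid unit quaternion.
        orientation = orientation / np.linalg.norm(orientation)
    return position, orientation

pos, quat = split_pose_action(np.array([0.3, 0.0, 0.5, 2.0, 0.0, 0.0, 0.0]))
print(quat)  # [1. 0. 0. 0.]
```

Because IK may fail for unreachable poses, checking the returned ``ik_success`` flag in reward or observation terms avoids silently applying stale joint targets.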

## Velocity and Force Control

```{list-table} Velocity and Force Action Terms
:header-rows: 1
:widths: 30 70

* - Action Term
  - Description
* - ``QvelTerm``
  - Joint velocity action: ``scale * action -> qvel``. The policy outputs target joint velocities.
* - ``QfTerm``
  - Joint force/torque action: ``scale * action -> qf``. The policy outputs target joint torques/forces.
```

## Usage Example

```python
from embodichain.lab.gym.envs.managers.cfg import ActionTermCfg

# Example: Delta joint position control
actions = {
    "joint_position": ActionTermCfg(
        func="embodichain.lab.gym.envs.managers.action_manager.DeltaQposTerm",
        params={
            "scale": 0.1,  # Scale factor for action deltas
        },
    ),
}

# Example: Normalized joint position control
actions = {
    "normalized_joint_position": ActionTermCfg(
        func="embodichain.lab.gym.envs.managers.action_manager.QposNormalizedTerm",
        params={
            "scale": 1.0,  # Full joint range utilization
        },
    ),
}

# Example: End-effector pose control
actions = {
    "eef_pose": ActionTermCfg(
        func="embodichain.lab.gym.envs.managers.action_manager.EefPoseTerm",
        params={
            "scale": 0.1,
            "pose_dim": 7,  # 7D (position + quaternion)
        },
    ),
}
```

## Action Term Properties

All action terms expose the following interface:

- ``action_dim``: the dimension of the action space, i.e. the number of values the policy must output for this term
- ``process_action(action)``: converts the raw policy output into the robot control format

The Action Manager also provides:

- ``total_action_dim``: Total dimension of all action terms combined
- ``action_type``: The active action type (term name) for backward compatibility
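
How these pieces fit together can be sketched with toy classes; ``ToyTerm`` and ``ToyActionManager`` below are hypothetical stand-ins for illustration, not EmbodiChain classes:

```python
class ToyTerm:
    # Stand-in for an action term exposing action_dim / process_action.
    def __init__(self, dim, scale):
        self.action_dim = dim
        self.scale = scale

    def process_action(self, action):
        return [self.scale * a for a in action]


class ToyActionManager:
    # Slices a flat policy vector into per-term chunks by action_dim.
    def __init__(self, terms):
        self.terms = terms  # dict: term name -> term

    @property
    def total_action_dim(self):
        return sum(t.action_dim for t in self.terms.values())

    def process_action(self, flat_action):
        out, offset = {}, 0
        for name, term in self.terms.items():
            chunk = flat_action[offset:offset + term.action_dim]
            out[name] = term.process_action(chunk)
            offset += term.action_dim
        return out


mgr = ToyActionManager({"arm": ToyTerm(2, 0.1), "gripper": ToyTerm(1, 1.0)})
print(mgr.total_action_dim)  # 3
```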
**docs/source/overview/gym/dataset_functors.md** (new file, 123 additions)
# Dataset Functors

```{currentmodule} embodichain.lab.gym.envs.managers
```

This page lists all available dataset functors that can be used with the Dataset Manager. Dataset functors are configured using {class}`~cfg.DatasetFunctorCfg` and are responsible for collecting and saving episode data during environment interaction.

## Recording Functors

```{list-table} Dataset Recording Functors
:header-rows: 1
:widths: 30 70

* - Functor Name
  - Description
* - ``LeRobotRecorder``
  - Records episodes in the LeRobot dataset format. Handles observation-action pair recording, format conversion, and episode saving. Requires the LeRobot package to be installed.
```

## LeRobotRecorder

The ``LeRobotRecorder`` functor enables recording robot learning episodes in the LeRobot dataset format, which can be used for training with LeRobot's imitation learning algorithms.

### Features

- Records observation-action pairs during episodes
- Converts data to LeRobot format automatically
- Saves episodes when they complete
- Supports vision sensors (camera images)
- Supports robot state (qpos, qvel, qf)
- Supports custom observation features
- Auto-incrementing dataset naming

### Parameters

```{list-table} LeRobotRecorder Parameters
:header-rows: 1
:widths: 30 70

* - Parameter
  - Description
* - ``save_path``
  - Root directory for saving datasets. Defaults to EmbodiChain's default dataset root.
* - ``robot_meta``
  - Robot metadata for the dataset (``robot_type``, ``control_freq``, etc.).
* - ``instruction``
  - Optional task instruction, e.g. ``{"lang": "pick the cube"}``.
* - ``extra``
  - Optional extra metadata (``scene_type``, ``task_description``, ``episode_info``).
* - ``use_videos``
  - Whether to save videos (``True``) or images (``False``). Default: ``False``.
* - ``image_writer_threads``
  - Number of threads for image writing.
* - ``image_writer_processes``
  - Number of processes for image writing.
```

### Recorded Data

The LeRobotRecorder saves the following data for each frame:

- ``observation.qpos``: Joint positions
- ``observation.qvel``: Joint velocities
- ``observation.qf``: Joint forces/torques
- ``action``: Applied action
- ``{sensor_name}.color``: Camera images (if sensors present)
- ``{sensor_name}.color_right``: Right camera images (for stereo cameras)
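
A single recorded frame might then be assembled as the dictionary below; the seven-DoF shapes, the image resolution, and the sensor name ``camera_0`` are illustrative assumptions, not values fixed by the recorder:

```python
import numpy as np

# Hypothetical single frame with the keys listed above.
frame = {
    "observation.qpos": np.zeros(7, dtype=np.float32),  # joint positions
    "observation.qvel": np.zeros(7, dtype=np.float32),  # joint velocities
    "observation.qf": np.zeros(7, dtype=np.float32),    # joint forces/torques
    "action": np.zeros(7, dtype=np.float32),            # applied action
    "camera_0.color": np.zeros((480, 640, 3), dtype=np.uint8),  # RGB image
}
```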

## Usage Example

```python
from embodichain.lab.gym.envs.managers.cfg import DatasetFunctorCfg

# Example: Record episodes in LeRobot format
dataset = {
    "lerobot_recorder": DatasetFunctorCfg(
        func="embodichain.lab.gym.envs.managers.datasets.LeRobotRecorder",
        params={
            "save_path": "/path/to/dataset/root",
            "robot_meta": {
                "robot_type": "dexforce_w1",
                "control_freq": 30,
            },
            "instruction": {
                "lang": "pick the cube and place it on the target",
            },
            "extra": {
                "scene_type": "table",
                "task_description": "pick_and_place",
                "episode_info": {
                    "rigid_object_physics_attributes": ["mass"],
                },
            },
            "use_videos": False,
        },
    ),
}
```

### Recording Workflow

1. **Initialization**: The Dataset Manager initializes the functor with the configured parameters
2. **Data Collection**: During episode rollout, the functor receives observations and actions
3. **Save Trigger**: When an episode completes, call the functor with `mode="save"`
4. **Finalization**: After all episodes, call `finalize()` to save any remaining data

```python
# Inside environment loop
if episode_done:
    dataset_manager.apply(mode="save", env_ids=completed_env_ids)

# After training completes
dataset_manager.apply(mode="finalize")
```

## Dataset Manager Modes

The Dataset Manager supports the following modes:

- ``save``: Save completed episodes for specified environment IDs
- ``finalize``: Finalize the dataset and save any remaining data

See {class}`~managers.dataset_manager.DatasetManager` for more details.
**docs/source/overview/gym/env.md** (6 additions, 3 deletions)

### Dataset Manager

For Imitation Learning (IL) tasks, the Dataset Manager automates data collection through dataset functors. For a complete list of available dataset functors and their parameters, please refer to {doc}`dataset_functors`. It currently supports:

* **LeRobot Format** (via {class}`~envs.managers.datasets.LeRobotRecorder`):
Standard format for LeRobot training pipelines. Includes support for task instructions, robot metadata, success flags, and optional video recording.

For RL tasks, EmbodiChain uses the **Action Manager** integrated into {class}`~envs.EmbodiedEnv`:

* **Action Preprocessing**: Configurable via ``actions`` in {class}`~envs.EmbodiedEnvCfg`. Supports DeltaQposTerm, QposTerm, QposNormalizedTerm, EefPoseTerm, QvelTerm, QfTerm. For a complete list of available action terms, please refer to {doc}`action_functors`.
* **Standardized Info Structure**: {class}`~envs.EmbodiedEnv` provides ``compute_task_state``, ``get_info``, and ``evaluate`` for task-specific success/failure and metrics.
* **Episode Management**: Configurable episode length and truncation logic.


Configure rewards through the {class}`~envs.managers.RewardManager` in your environment config rather than overriding ``get_reward``. For a complete list of available reward functors, please refer to {doc}`reward_functors`.

### For Imitation Learning Tasks


```{toctree}
event_functors.md
observation_functors.md
reward_functors.md
action_functors.md
dataset_functors.md
```