This repository contains the configuration requirements and the code for fundamental skill training, RSG construction, skill inference and skill composition. It is based on Nikita Rudin's legged_gym and the AMP repo, and enables us to train policies using Isaac Gym.
- Each env is defined by an env file (legged_gym/envs/base/legged_robot.py) and a config file (such as legged_gym/envs/a1/a1_amp_forward_walking_config.py). The config file contains two classes: one containing all the environment parameters (LeggedRobotCfg) and one for the training parameters (LeggedRobotCfgPPO).
- Both env and config classes use inheritance.
- Each non-zero reward scale specified in cfg will add a function with a corresponding name to the list of elements which will be summed to get the total reward. The AMP reward parameters are defined in LeggedRobotCfgPPO, as well as the path to the reference data.
- Tasks must be registered using task_registry.register(name, EnvClass, EnvConfig, TrainConfig). This is done in legged_gym/envs/__init__.py.
- Skill construction code can be found in the rsg_construction folder.
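The reward mechanism described above can be illustrated with a minimal, self-contained sketch. It is not the repository's implementation: `ToyCfg`, `ToyEnv`, the `_reward_` method prefix as used here, and all numeric values are assumptions chosen for illustration. It only shows the idea that each non-zero scale selects a function with the matching name, and that the scaled terms are summed.

```python
class ToyCfg:
    """Hypothetical config class; names and values are illustrative only."""

    class rewards:
        class scales:
            tracking_lin_vel = 1.0
            torques = -0.0001
            feet_air_time = 0.0  # zero scale: this term is skipped entirely


class ToyEnv:
    """Hypothetical env that builds its reward from the non-zero scales."""

    def __init__(self, cfg):
        # Collect the public attributes of the nested `scales` class.
        scales = {
            name: value
            for name, value in vars(cfg.rewards.scales).items()
            if not name.startswith("_")
        }
        # Keep only non-zero scales.
        self.reward_scales = {n: s for n, s in scales.items() if s != 0.0}
        # Look up the method whose name corresponds to each remaining scale.
        self.reward_functions = {
            n: getattr(self, "_reward_" + n) for n in self.reward_scales
        }

    # Placeholder reward terms returning constants (assumed values).
    def _reward_tracking_lin_vel(self):
        return 0.5

    def _reward_torques(self):
        return 100.0

    def _reward_feet_air_time(self):
        return 1.0

    def compute_reward(self):
        # Total reward is the sum of scale * term over the active terms.
        return sum(
            self.reward_scales[n] * fn() for n, fn in self.reward_functions.items()
        )


env = ToyEnv(ToyCfg)
print(sorted(env.reward_functions))  # active reward terms
print(env.compute_reward())          # weighted sum of the active terms
```

Setting a scale to zero in the config removes the term without touching the env code, which is why new reward terms only need a method and a scale entry.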
- Create a new Python virtual env with Python 3.6, 3.7 or 3.8 (3.8 recommended), e.g. with conda:
conda create -n sg python==3.8
conda activate sg
- Install pytorch 1.10 with cuda-11.3:
pip3 install torch==1.10.0+cu113 torchvision==0.11.1+cu113 tensorboard==2.8.0 pybullet==3.2.1 opencv-python==4.5.5.64 torchaudio==0.10.0+cu113 -f https://download.pytorch.org/whl/cu113/torch_stable.html
- Install Isaac Gym
- Download and install Isaac Gym Preview 3 (Preview 2 will not work!) from https://developer.nvidia.com/isaac-gym
cd isaacgym/python && pip install -e .
- Try running an example
cd examples && python 1080_balls_of_solitude.py
- For troubleshooting check docs (isaacgym/docs/index.html)
- Install rsl_rl (PPO implementation)
- Clone this repository
cd AMP_for_hardware/rsl_rl && pip install -e .
- Install legged_gym
cd ../ && pip install -e .
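After the installation steps, a quick check that the packages can be located may save a failed training launch. The snippet below is a minimal sketch; the import names in `REQUIRED` are assumed to match the packages installed above, and `check_packages` is a hypothetical helper, not part of the repository.

```python
import importlib.util

# Assumed top-level import names for the packages installed above.
REQUIRED = ("torch", "isaacgym", "rsl_rl", "legged_gym")


def check_packages(names=REQUIRED):
    """Return {name: bool} telling whether each package can be located.

    find_spec only locates a top-level package; it does not import it.
    """
    return {name: importlib.util.find_spec(name) is not None for name in names}


if __name__ == "__main__":
    for name, found in check_packages().items():
        print(f"{name:12s} {'found' if found else 'MISSING'}")
```

A package reported as MISSING usually means the corresponding `pip install -e .` step was run in a different environment than the active one.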
CUDA_VISIBLE_DEVICES=0 python legged_gym/scripts/train.py --task=a1_amp_forward_walking --actor_critic_class=ActorCritic --terrain_id=16 --num_envs=3000 --max_iterations=5000 --isObservationEstimation --isEnvBaseline --headless
CUDA_VISIBLE_DEVICES: selects the GPU device on which the program runs.
--task: name of the task to train.
--actor_critic_class: the actor-critic class to use (PPO).
--terrain_id: selects the terrain.
--num_envs: the number of environments simulated in parallel.
--max_iterations: the number of PPO iterations.
--isObservationEstimation: uses the context-aided estimator network (CENet) architecture.
--isEnvBaseline: uses the goal-conditional policy.
--headless: runs without displaying the graphical interface.
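To make the flag types concrete, here is a minimal stand-in parser built with the standard `argparse` module. It is only a sketch: the repository's actual argument parsing may differ, and the defaults below are assumptions rather than the values used by train.py.

```python
import argparse


def build_parser():
    """Illustrative stand-in for the train.py flags (defaults are assumed)."""
    p = argparse.ArgumentParser(description="Sketch of the training flags")
    p.add_argument("--task", type=str, required=True)
    p.add_argument("--actor_critic_class", type=str, default="ActorCritic")
    p.add_argument("--terrain_id", type=int, default=0)
    p.add_argument("--num_envs", type=int, default=None)
    p.add_argument("--max_iterations", type=int, default=None)
    # Boolean switches: present means True, absent means False.
    p.add_argument("--isObservationEstimation", action="store_true")
    p.add_argument("--isEnvBaseline", action="store_true")
    p.add_argument("--headless", action="store_true")
    return p


# Parse the same flags as the training command above.
args = build_parser().parse_args([
    "--task=a1_amp_forward_walking",
    "--actor_critic_class=ActorCritic",
    "--terrain_id=16",
    "--num_envs=3000",
    "--max_iterations=5000",
    "--isObservationEstimation",
    "--isEnvBaseline",
    "--headless",
])
print(args.task, args.terrain_id, args.num_envs, args.headless)
```

Note that CUDA_VISIBLE_DEVICES is an environment variable set before the command, not a flag, so it does not appear in the parser.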
The realistic demonstration data for AMP are available on Google Drive.
CUDA_VISIBLE_DEVICES=0 python legged_gym/scripts/train.py --task=a1_amp_forward_walking --actor_critic_class=ActorCritic --skills_descriptor_id=5 --terrain_id=0 --headless
--skills_descriptor_id: selects the differential weighting of intrinsic and extrinsic rewards.
Set up a Python virtualenv and install the packages as follows:
cd rsg_construction/
python -m venv ./venv
source ./venv/bin/activate
pip install -r requirements.txt
The trained skills and task descriptions are available on Google Drive.
Train and evaluate the built RSG:
cd rsg_construction/
python train.py
and
cd rsg_construction/
python test.py
python legged_gym/scripts/train.py --task=a1_amp_ct_b_sequential_2 --terrain_id=18 --num_envs=1 --max_iterations=200 --isObservationEstimation --case_id=2
python legged_gym/scripts/train.py --task=a1_amp_ct_b_sequential_3 --terrain_id=18 --num_envs=1 --max_iterations=200 --isObservationEstimation --case_id=0
The skill inference is implemented in class BOOnPolicyRunnerSequentialCase1 in file rsl_rl/rsl_rl/runners/sg_on_policy_runner.py.
The skill composition is implemented in class NewCompositeActor in file rsl_rl/rsl_rl/modules/composite_actor_bo.py.
The BO method is implemented in class CompositeBO in file rsl_rl/rsl_rl/algorithms/ppo.py.