Skip to content

yuchen-x/ROLA

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

5 Commits
 
 
 
 
 
 
 
 
 
 

Repository files navigation

ROLA: Robust Local Advantage Actor-Critic

This is the code for implementing the ROLA algorithm presented in the paper: Local Advantage Actor-Critic for Robust Multi-Agent Deep Reinforcement Learning, IEEE The 3rd International Symposium on Multi-Robot and Multi-Agent Systems (MRS), 2021. Best Paper Award Finalist.

Minimum Requirements

  • Python 3.7.5
  • Pytorch 1.4.0
  • OpenAI Gym 0.15.4
  • Sacred 0.8.1

Installation

  • To reproduce the results presented in the paper, all dependecies can be installed by running
cd ROLA/Anaconda_Env/
conda env create -f mrs21.yml
  • To install all modules of the algorithm and domains
cd ROLA
pip install -e .

Core Hyper-Parameter

  • a_lr, learning rate for actor updates (default: 1e-3)
  • c_lr, learning rate for critic updates (default: 1e-3)
  • local_c_train_iteration, the number of training iterations for updating local critic (default: 4)
  • eps_start, the initial value for epsilon decay (default: 1.0)
  • eps_end, the ending value for epsilon decay (default: 0.01)
  • eps_decay_epis, the period of epslion decay
  • train_freq, perform trainning very # of episodes (default: 2)
  • n_envs, the number of parallel envs (default: equal to the train_freq)
  • c_target_update_freq, update the target-net of the critics every # of episodes (default: 16)
  • n_step_bootstrap, n-step TD (default: 3)
  • grad_clip_norm, gradient clipping (default: 1.0)

How to Run

  • Capture Target Domain (Grid World 6x6)
rola.py with env_name='CT' n_agent=2 grid_dim=[6,6] n_envs=2 max_epi_steps=60 a_lr=0.0005 c_lr=0.0005 local_c_train_iteration=1 train_freq=2 c_target_update_freq=16 n_step_bootstrap=3 eps_decay_epis=15000 eps_end=0.05 total_epies=80000 eval_num_epi=10 eval_freq=100 eval_policy save_ckpt save_dir='ROLA_CT_6' run_idx=0 
  • Small Box Pushing Domain (Grid World 6x6)
rola.py with env_name='SBP' n_agent=2 grid_dim=[6,6] n_envs=2 max_epi_steps=100 small_box_reward=100 gamma=0.98 small_box_only terminal_reward_only a_lr=0.001 c_lr=0.003 local_c_train_iteration=4 train_freq=2 c_target_update_freq=32 n_step_bootstrap=3 eps_decay_epis=2000 eps_end=0.01 grad_clip_norm='None' total_epies=4000 eval_num_epi=10 eval_freq=100 eval_policy save_ckpt save_dir='ROLA_BP_6' run_idx=0
  • Cooperative Navigation (observation radius 1.4)
rola.py with env_name='pomdp_simple_spread' n_agent=3 n_envs=2 max_epi_steps=25 discrete_mul=2 obs_r=1.4 a_lr=0.001 c_lr=0.001 local_c_train_iteration=4 train_freq=2 c_target_update_freq=16 n_step_bootstrap=5 eps_decay_epis=50000 eps_end=0.05 total_epies=100000 eval_num_epi=10 eval_freq=100 eval_policy save_ckpt save_dir='ROLA_CN_6' run_idx=0
  • Antipodal Navigation (observation radius 1.4)
rola.py with env_name='pomdp_advanced_spread' config_name='antipodal' n_agent=4 n_envs=2 max_epi_steps=50 discrete_mul=1 obs_r=1.4 a_lr=0.001 c_lr=0.001 local_c_train_iteration=4 train_freq=2 c_target_update_freq=16 n_step_bootstrap=5 eps_decay_epis=20000 eps_end=0.05 total_epies=30000 eval_num_epi=10 eval_freq=100 eval_policy save_ckpt save_dir='ROLA_AN_6' run_idx=0

NOTE: both training and testing results are saved in the specified directory under the ./performance/ folder.

Code Structure

  • ./scripts/rola.py the main training loop of ROLA
  • ./src/pg_marl/rola/ all source code of ROLA
  • ./src/pg_marl/rola/learner.py the core code for ROLA algorithm
  • ./src/marl_envs/ all source code of the domains considered in the paper

Citation

If you used this code for your research or found it helpful, please consider citing this paper:

@InProceedings{xiao_mrs_2021,
  author = "Xiao, Yuchen and Lyu, Xueguang and Amato, Christopher",
  title = "Local Advantage Actor-Critic for Robust Multi-Agent Deep Reinforcement Learning",
  booktitle = "IEEE The 3rd International Symposium on Multi-Robot and Multi-Agent Systems",
  year = "2021"
}

About

No description, website, or topics provided.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages