Official Implementation of UPDeT: Universal Multi-agent Reinforcement Learning via Policy Decoupling with Transformers (ICLR 2021 spotlight)
The framework is inherited from PyMARL. UPDeT is written in PyTorch and uses SMAC as its environment.
```shell
pip install -r requirements.txt
bash install_sc2.sh
```

Before training your own transformer-based multi-agent model, there are a few things to note.
- Currently, this repository supports marine-based battle scenarios, e.g. `3m`, `8m`, `5m_vs_6m`.
- If you are interested in training on a different unit type, carefully modify the `Transformer Parameters` block in `src/config/default.yaml` and revise the `_build_input_transformer` function in `basic_controller.py`.
- Before running an experiment, check the agent type in the `Agent Parameters` block in `src/config/default.yaml`.
- This repository contains the two new transformer-based agents from the UPDeT paper:
  - Standard UPDeT
  - Aggregation Transformer
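The exact keys in `src/config/default.yaml` vary by release, so the following is only a hypothetical sketch of what the `Transformer Parameters` and `Agent Parameters` blocks might look like; every name and value below is illustrative, not copied from the repository:

```yaml
# --- Transformer Parameters (illustrative, not the shipped defaults) ---
transformer:
  emb_dim: 32      # per-entity token embedding size
  heads: 3         # number of attention heads
  depth: 2         # number of transformer blocks
  ally_num: 5      # entities observed per agent; unit-type dependent
  enemy_num: 6     # must match the chosen SMAC map

# --- Agent Parameters (illustrative) ---
agent: updet       # hypothetical value selecting Standard UPDeT vs. the Aggregation Transformer
```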
```shell
python3 src/main.py --config=vdn --env-config=sc2 with env_args.map_name=5m_vs_6m
```

All results will be stored in the `Results/` folder.
UPDeT surpasses the GRU baseline on the hard `5m_vs_6m` map when combined with the following value-decomposition methods:
- QMIX: QMIX: Monotonic Value Function Factorisation for Deep Multi-Agent Reinforcement Learning
- VDN: Value-Decomposition Networks For Cooperative Multi-Agent Learning
- QTRAN: QTRAN: Learning to Factorize with Transformation for Cooperative Multi-Agent Reinforcement Learning
Zero-shot generalization to different tasks:

- Result on `7m-5m-3m` transfer learning.
Note: Only UPDeT can be deployed to other scenarios without changing the model's architecture.
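This zero-shot property follows from the architecture rather than from any SMAC-specific trick: UPDeT encodes each observed entity as a token and lets self-attention mix them, so the same weights accept any number of entities. The following is a minimal pure-Python sketch of that size-invariance (illustrative only, with identity Q/K/V projections instead of learned ones; it is not the repository's implementation):

```python
import math

def self_attention(tokens):
    """Scaled dot-product self-attention over a variable-length set of
    entity tokens (lists of floats). Identity Q/K/V projections are used
    for brevity; the point is that nothing depends on len(tokens)."""
    d = len(tokens[0])
    out = []
    for q in tokens:
        # attention scores of this token against every token in the set
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d)
                  for k in tokens]
        m = max(scores)                       # stabilized softmax
        w = [math.exp(s - m) for s in scores]
        z = sum(w)
        w = [x / z for x in w]
        # weighted sum of value vectors (one output token per input token)
        out.append([sum(wi * v[j] for wi, v in zip(w, tokens))
                    for j in range(d)])
    return out

# The same function handles a 3m-style and a 7m-style observation:
obs_3m = [[0.1 * (i + j) for j in range(4)] for i in range(3)]  # 3 entities
obs_7m = [[0.1 * (i + j) for j in range(4)] for i in range(7)]  # 7 entities
print(len(self_attention(obs_3m)), len(self_attention(obs_7m)))  # 3 7
```

Because the output is one token per input entity, the policy head can be decoupled per action group without any fixed-size bottleneck, which is why the model transfers across `3m`, `5m`, and `7m` unchanged.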
For more details, please refer to the UPDeT paper.
```bibtex
@article{hu2021updet,
  title={UPDeT: Universal Multi-agent Reinforcement Learning via Policy Decoupling with Transformers},
  author={Hu, Siyi and Zhu, Fengda and Chang, Xiaojun and Liang, Xiaodan},
  journal={arXiv preprint arXiv:2101.08001},
  year={2021}
}
```

This project is licensed under the MIT License.

