Federated Averaging Deep Q-Network
A distributed Reinforcement Learning framework
- Install Anaconda
- Create environment
conda create --name py36 python=3.6
- Activate environment (Windows)
conda activate py36
- Activate environment (Ubuntu)
source activate py36
- Install Python packages
pip install tensorflow
pip install gym
pip install dill
(conda install -y scipy)
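If the environment was set up correctly, a quick import check (just a sanity test, not part of the project) should run without errors. The CartPole environment id below is an assumption about the installed Gym version:

# Sanity check: verify that the installed packages import and that a
# CartPole environment (the task used by the experiments) can be created.
import tensorflow as tf
import gym
import dill

print("TensorFlow:", tf.__version__)
print("Gym:", gym.__version__)

env = gym.make("CartPole-v0")  # id may differ in newer Gym releases
print("Observation space:", env.observation_space)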
The goal is to implement a distributed DQN algorithm according to the following specification.
cd baselines/deepq/experiments/
python async_fed_avg.py [--config_file config.ini] [--config DEFAULT] --job_name "worker" --task_index 0 [--seed 1]
- `config_file` is the name of the config file (or its path if it is in a different directory).
  - Defaults to `config.ini`
- `config` is the section in the config file to override the default values with.
  - Defaults to `DEFAULT`. Use `async` for the `[async]` section and `sync` for the `[sync]` section
- `job_name` is the type of job the current client should perform.
  - Defaults to `worker`. Use only `worker` or `ps` for this value
- `task_index` is the index of the current server's IP in the list for its job (ps or worker).
  - Defaults to `0`. Worker 0 will become the chief with extra responsibilities
- `seed` is the seed for all the randomness in the server.
  - Defaults to `1`. The server's task index will be added to this to make sure every server has a unique seed
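As a rough sketch (not the repository's actual code) of how these flags fit together with TensorFlow's 1.x distributed API: the script needs to read the chosen config section, build a cluster spec, start a server for its job, and derive a unique seed from the base seed plus the task index. The `ps_hosts`/`worker_hosts` keys are assumed names used here only for illustration:

# Illustrative sketch only: shows how --config_file, --config, --job_name,
# --task_index and --seed could be combined. Key names such as
# "ps_hosts"/"worker_hosts" are assumptions, not the repo's actual schema.
import argparse
import configparser
import random

import numpy as np
import tensorflow as tf

parser = argparse.ArgumentParser()
parser.add_argument("--config_file", default="config.ini")
parser.add_argument("--config", default="DEFAULT")        # DEFAULT, async or sync
parser.add_argument("--job_name", default="worker")       # "worker" or "ps"
parser.add_argument("--task_index", type=int, default=0)  # index into the host list
parser.add_argument("--seed", type=int, default=1)
args = parser.parse_args()

config = configparser.ConfigParser()
config.read(args.config_file)
section = config[args.config]

# Hypothetical keys: comma-separated host:port lists for each job type.
ps_hosts = section.get("ps_hosts", "localhost:2222").split(",")
worker_hosts = section.get("worker_hosts", "localhost:2223,localhost:2224").split(",")

cluster = tf.train.ClusterSpec({"ps": ps_hosts, "worker": worker_hosts})
server = tf.train.Server(cluster, job_name=args.job_name, task_index=args.task_index)

# Each server gets a unique seed: the base seed plus its task index.
seed = args.seed + args.task_index
random.seed(seed)
np.random.seed(seed)
tf.set_random_seed(seed)

if args.job_name == "ps":
    server.join()  # parameter servers only host the shared variables
else:
    is_chief = (args.task_index == 0)  # worker 0 takes on the extra chief duties
    # ... build the DQN graph and training loop here ...

In such a layout, worker 0 (`is_chief`) is the natural place for the chief-only responsibilities mentioned above.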
The asynchronous Cart Pole script is located at
baselines/deepq/experiments/async_fed_avg.py
Most edits will be done there and in the build graph file at
baselines/deepq/build_graph.py
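For example, assuming `config.ini` lists one parameter server and two workers on the local machine, a run could be started from three separate terminals using the flags above (the number of processes must match the host lists in your config file):

python async_fed_avg.py --job_name "ps" --task_index 0
python async_fed_avg.py --job_name "worker" --task_index 0
python async_fed_avg.py --job_name "worker" --task_index 1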