FoConrad/rl_exercise

Exercise on Reinforcement Learning

In this exercise you will be using reinforcement learning to solve the CartPole problem. This exercise will test you on how quickly you can pick up new concepts through self-study.

For a thorough introduction to reinforcement learning, see the UCL Course on RL, taught by David Silver (a name you will likely see in other RL materials). Accompanying this course is the book Reinforcement Learning: An Introduction. Additionally, as the exercise here deals with deep learning, another resource that may be useful is the Deep RL course at Berkeley. Convolutional Neural Networks for Visual Recognition from Stanford is a good introduction to deep learning.

Important note

There may be a lot of new information for you here. Don't get discouraged; this is supposed to be a learning experience as well, and it may not all come instantly.

Description

The problem you will be solving is getting a cart to balance a pole without it tipping over. See this link for a depiction of the problem and information on the environment you will be using.

The algorithm we wish you to use to solve this problem is Deep Q-Learning (note, you will find additional resources below). You will be solving this problem using an OpenAI Gym environment, which is already set up.
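To make two of the core ingredients of Deep Q-Learning concrete, here is a minimal sketch of an experience replay buffer and epsilon-greedy action selection in plain Python. The class and function names, capacity, and sizes here are illustrative assumptions, not part of the exercise files:

```python
import random
from collections import deque

# Sketch of two Deep Q-Learning ingredients: experience replay and
# epsilon-greedy exploration. Names and sizes are illustrative only.

class ReplayBuffer:
    """Fixed-size store of (state, action, reward, next_state, done) tuples."""
    def __init__(self, capacity=10000):
        self.buffer = deque(maxlen=capacity)  # old transitions fall off the left

    def push(self, transition):
        self.buffer.append(transition)

    def sample(self, batch_size):
        # Training on random minibatches breaks temporal correlations
        # between consecutive transitions.
        return random.sample(self.buffer, batch_size)

    def __len__(self):
        return len(self.buffer)

def epsilon_greedy(q_values, epsilon):
    """With probability epsilon pick a random action; otherwise act greedily."""
    if random.random() < epsilon:
        return random.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda a: q_values[a])
```

In a full agent, the buffer is filled during interaction with the environment and sampled from on every training step, while epsilon is typically annealed from near 1 toward a small value over the course of training.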

The machine learning framework you will be working with to implement the neural networks you will likely need is MXNet. This framework makes writing neural networks much simpler, especially using its new high-level subpackage, Gluon.
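To see what such a network actually computes, independent of any framework, here is a minimal two-layer forward pass in plain Python. The layer sizes are arbitrary assumptions; in practice Gluon manages the arrays, initialization, and gradients for you, so this is only meant to demystify the computation:

```python
import random

# Hand-rolled forward pass of a small two-layer network
# (dense -> relu -> dense), as a framework-free illustration.
# Sizes are arbitrary; a real agent would use MXNet/Gluon instead.

def make_layer(n_in, n_out):
    # Small random weights and zero biases.
    weights = [[random.uniform(-0.1, 0.1) for _ in range(n_in)]
               for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases

def forward(layer, x, relu=True):
    weights, biases = layer
    out = [sum(w * xi for w, xi in zip(row, x)) + b
           for row, b in zip(weights, biases)]
    return [max(0.0, o) for o in out] if relu else out

hidden = make_layer(4, 16)   # CartPole observations have 4 dimensions
output = make_layer(16, 2)   # one Q-value per action (push left / push right)

state = [0.0, 0.1, -0.02, 0.3]   # example observation vector
q_values = forward(output, forward(hidden, state), relu=False)
```

The last layer is left linear (no ReLU) because Q-values can be negative; the agent then picks an action from `q_values`, e.g. greedily or epsilon-greedily.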

You can find the short list of Python dependencies you will need to install in the file named "dependencies" above.

In summary

  • Implement an RL agent in the file agent.py, which learns via Deep Q-Learning
  • This agent should implement the methods already defined in it
  • Be sure to have sufficient print statements to show the agent's learning progress
  • Test your implementation using main.py; aim to maximize the 'pass' percentage it outputs, with at most 1000 episodes run. Keep in mind that results may vary wildly from one run to another. Also, even with a correct implementation, you may need to spend some time finding good hyperparameters, such as the learning rate, network size, and epsilon-greedy exploration rate
  • Add a one-page write-up to your GitHub repo explaining the progress you made, what you learned, and how you learned it
  • Send an email with a GitHub invitation, or link, to your cloned repo to both Chien-Chin and Conrad
  • (Optional) If you finish this, see additional resources below to try to improve your agent and/or use it in harder environments
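The interaction pattern between main.py and your agent can be sketched as below. The actual method names defined in agent.py and the real Gym environment are not shown here; `StubEnv`, `Agent.act`, and `Agent.learn` are hypothetical stand-ins, and only the overall reset-step-learn loop per episode is the point:

```python
import random

# Hypothetical sketch of an episode-driving loop. StubEnv mimics a
# Gym-style interface (reset/step); Agent.act and Agent.learn are
# placeholder names, not the methods actually defined in agent.py.

class StubEnv:
    """Stand-in environment: 4-dimensional states, 2 actions, fixed length."""
    def __init__(self, episode_length=10):
        self.episode_length = episode_length
        self.t = 0

    def reset(self):
        self.t = 0
        return [0.0, 0.0, 0.0, 0.0]

    def step(self, action):
        self.t += 1
        done = self.t >= self.episode_length
        next_state = [random.uniform(-1, 1) for _ in range(4)]
        return next_state, 1.0, done   # reward of 1 per step, as in CartPole

class Agent:
    def act(self, state):
        return random.randrange(2)     # placeholder policy

    def learn(self, state, action, reward, next_state, done):
        pass                           # Q-learning update would go here

def run(env, agent, episodes=5):
    returns = []
    for _ in range(episodes):
        state, done, total = env.reset(), False, 0.0
        while not done:
            action = agent.act(state)
            next_state, reward, done = env.step(action)
            agent.learn(state, action, reward, next_state, done)
            state, total = next_state, total + reward
        returns.append(total)
    return returns
```

A real run would replace `StubEnv` with the provided Gym CartPole environment and `Agent` with your Deep Q-Learning implementation; the per-episode returns are what the 'pass' percentage in main.py is ultimately measuring.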

Resources

Here is an accumulation of the resources listed above, as well as some additional ones.

Main resources

Optional

If you have successfully implemented the Deep Q-Learning agent and solved the CartPole environment, consider some of the following resources. If you have time, you may wish to extend this exercise by implementing one of the extensions/methods below and trying it on another environment as well (feel free to turn that code in too!). Looking at the works cited in the papers below is a good way to find additional resources on topics you are not familiar with.

Additional Resources (extensions to Deep Q-Learning and other algorithms)
