FoConrad/rl_exercise

Exercise on Reinforcement Learning

In this exercise you will be using reinforcement learning to solve the CartPole problem. This exercise will test you on how quickly you can pick up new concepts through self-study.

For a thorough introduction to reinforcement learning, see the UCL Course on RL, taught by David Silver (a name you will likely see in other RL materials). Accompanying this course is the book Reinforcement Learning: An Introduction. Additionally, as the exercise here deals with deep learning, another resource that may be useful is the Deep RL course at Berkeley. Convolutional Neural Networks for Visual Recognition from Stanford is a good introduction to deep learning.

Important note

There may be a lot of new information for you here. Don't get discouraged; this is supposed to be a learning experience as well, and it may not all come instantly.

Description

The problem you will be solving is getting a cart to balance a pole without it tipping over. See this link for a depiction of the problem and information on the environment you will be using.

The algorithm we wish you to use to solve this problem is Deep Q-Learning (note, you will find additional resources below). You will be solving this problem using an OpenAI Gym environment, which is already set up.
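To make two of the core ingredients of Deep Q-Learning concrete, here is a minimal sketch of an experience replay buffer and epsilon-greedy action selection in plain Python. The class and function names, capacity, and sizes here are illustrative assumptions, not part of the exercise files:

```python
import random
from collections import deque

# Sketch of two Deep Q-Learning ingredients: experience replay and
# epsilon-greedy exploration. Names and sizes are illustrative only.

class ReplayBuffer:
    """Fixed-size store of (state, action, reward, next_state, done) tuples."""
    def __init__(self, capacity=10000):
        self.buffer = deque(maxlen=capacity)  # old transitions fall off the left

    def push(self, transition):
        self.buffer.append(transition)

    def sample(self, batch_size):
        # Training on random minibatches breaks temporal correlations
        # between consecutive transitions.
        return random.sample(self.buffer, batch_size)

    def __len__(self):
        return len(self.buffer)

def epsilon_greedy(q_values, epsilon):
    """With probability epsilon pick a random action; otherwise act greedily."""
    if random.random() < epsilon:
        return random.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda a: q_values[a])
```

In a full agent, the buffer is filled during interaction with the environment and sampled from on every training step, while epsilon is typically annealed from near 1 toward a small value over the course of training.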

The machine learning framework you will be working with to implement the neural networks you will likely need is MXNet. This framework makes writing neural networks much simpler, especially using its new high-level subpackage, Gluon.
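To see what such a network actually computes, independent of any framework, here is a minimal two-layer forward pass in plain Python. The layer sizes are arbitrary assumptions; in practice Gluon manages the arrays, initialization, and gradients for you, so this is only meant to demystify the computation:

```python
import random

# Hand-rolled forward pass of a small two-layer network
# (dense -> relu -> dense), as a framework-free illustration.
# Sizes are arbitrary; a real agent would use MXNet/Gluon instead.

def make_layer(n_in, n_out):
    # Small random weights and zero biases.
    weights = [[random.uniform(-0.1, 0.1) for _ in range(n_in)]
               for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases

def forward(layer, x, relu=True):
    weights, biases = layer
    out = [sum(w * xi for w, xi in zip(row, x)) + b
           for row, b in zip(weights, biases)]
    return [max(0.0, o) for o in out] if relu else out

hidden = make_layer(4, 16)   # CartPole observations have 4 dimensions
output = make_layer(16, 2)   # one Q-value per action (push left / push right)

state = [0.0, 0.1, -0.02, 0.3]   # example observation vector
q_values = forward(output, forward(hidden, state), relu=False)
```

The last layer is left linear (no ReLU) because Q-values can be negative; the agent then picks an action from `q_values`, e.g. greedily or epsilon-greedily.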

You can find the short list of Python dependencies you will need to install in the file named "dependencies" above.

In summary

  • Implement an RL agent in the file agent.py, which learns via Deep Q-Learning
  • This agent should implement the methods already defined in it
  • Be sure to have sufficient print statements to show the agent's learning progress
  • Test your implementation using main.py; aim to maximize the 'pass' percentage it outputs, with at most 1000 episodes run. Keep in mind that results may vary wildly from one run to another. Also, even with a correct implementation, you may need to spend some time finding good hyperparameters, such as the learning rate, network size, and epsilon-greedy exploration rate
  • Add a one-page write-up to your GitHub repo explaining the progress you made, what you learned, and how you learned it
  • Send an email with a GitHub invitation, or link, to your cloned repo to both Chien-Chin and Conrad
  • (Optional) If you finish this, see additional resources below to try to improve your agent and/or use it in harder environments
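The interaction pattern between main.py and your agent can be sketched as below. The actual method names defined in agent.py and the real Gym environment are not shown here; `StubEnv`, `Agent.act`, and `Agent.learn` are hypothetical stand-ins, and only the overall reset-step-learn loop per episode is the point:

```python
import random

# Hypothetical sketch of an episode-driving loop. StubEnv mimics a
# Gym-style interface (reset/step); Agent.act and Agent.learn are
# placeholder names, not the methods actually defined in agent.py.

class StubEnv:
    """Stand-in environment: 4-dimensional states, 2 actions, fixed length."""
    def __init__(self, episode_length=10):
        self.episode_length = episode_length
        self.t = 0

    def reset(self):
        self.t = 0
        return [0.0, 0.0, 0.0, 0.0]

    def step(self, action):
        self.t += 1
        done = self.t >= self.episode_length
        next_state = [random.uniform(-1, 1) for _ in range(4)]
        return next_state, 1.0, done   # reward of 1 per step, as in CartPole

class Agent:
    def act(self, state):
        return random.randrange(2)     # placeholder policy

    def learn(self, state, action, reward, next_state, done):
        pass                           # Q-learning update would go here

def run(env, agent, episodes=5):
    returns = []
    for _ in range(episodes):
        state, done, total = env.reset(), False, 0.0
        while not done:
            action = agent.act(state)
            next_state, reward, done = env.step(action)
            agent.learn(state, action, reward, next_state, done)
            state, total = next_state, total + reward
        returns.append(total)
    return returns
```

A real run would replace `StubEnv` with the provided Gym CartPole environment and `Agent` with your Deep Q-Learning implementation; the per-episode returns are what the 'pass' percentage in main.py is ultimately measuring.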

Resources

Here is an accumulation of the resources listed above, as well as some additional ones.

Main resources

Optional

If you have successfully implemented the Deep Q-Learning agent and solved the CartPole environment, consider some of the following resources. If you have time, you may wish to extend this exercise by implementing one of the extensions/methods below and trying it on another environment as well (feel free to turn that code in too!). Looking at the works cited in the papers below is a good way to find additional resources on topics you are not familiar with.

Additional Resources (extensions to Deep Q-Learning and other algorithms)
