Last updated: 2020-04-25
We’ve collected some resources for getting started in reinforcement learning (RL). Comments or suggestions are welcome!
Unlike natural language and vision, where the building blocks for modeling are neural networks, and advances in the field have gone hand-in-hand with advances in neural network architecture, a significant part of modeling in RL takes place outside of neural networks, with nets being just one component. We therefore recommend a bottoms up approach to learning RL, starting with classic RL theory before proceeding to deep RL.
Roughly in order of recommended progression:
Quick surveys of RL
Before embarking on a full course of study, get some high-level orientation from these lively blog posts:
- Deep reinforcement learning doesn’t work yet, by Alex Irpan - good look at current problems in deep RL, many entertaining examples (doesn’t go into details of algorithms)
- Pong from pixels, by Andrej Karpathy - goes into details of implementing policy gradient algorithm on the pong Atari game to illustrate RL in action, accompanied by sweet, simple code
Do in parallel:
- David Silver’s UCL course on RL video lectures - condenses important concepts from Sutton and Barto while maintaining continuity, clear and insightful explanations
- Introduction to Reinforcement Learning, Sutton and Barto’s (S&B) classic text, free online copy available
- David Silver’s course closely follows S&B up through about lecture 5. We recommend watching each video lecture first, then reviewing the corresponding material in the text afterwards.
- Function approximation, where neural networks (deep RL) become relevant, is not until lecture 6. Silver’s course doesn’t include links to exercises, but we’ve provided code for an example model from his first few lectures (discussed in our blog post introducing RL and solving the problem with dynamic programming).
Additional video lectures
Sergey Levine’s Berkeley Deep RL course, CS285
- Levine’s course is more advanced than Silver’s, with a brisker approach to deep RL (less time spent on classical RL basics). Like Silver, Levine is a fantastic lecturer and provides a valuable complementary perspective, e.g. more discussion about convergence properties of algorithms, additional intuition on why policy gradients are high variance. We enjoyed interleaving the Silver and Levine lectures to get multiple takes on the same topics.
- The content from the first half of Levine’s course overlaps with Silver’s, while the second half of the course moves beyond core concepts and brings you to the cutting edge in RL.
Exercises and implementations
- Denny Britz’s reinforcement learning repo - instructional exercises and self-contained implementations. A great accompaniment to David Silver’s course and Sutton and Barto.
- OpenAI’s Spinning up in RL - self-contained, lightweight implementations of different RL algorithms
- OpenAI gym - a toolkit for creating custom environments for running your RL algorithms, including a good number of ready-to-go implementations. The API is simple and intuitive.
- Openreview.net - it’s hard to place the impact/context of the latest research when you’re just coming to a field, so the publicly available feedback from experts reviewing the papers is invaluable
- Benchmarks and baselines - not sure if your implementation of an RL algorithm is performing as expected? Check out the rl-baselines-zoo benchmarks, achieved from standardized implementations of RL algorithms using Stable Baselines.