Intro and MDP chapters
dennybritz committed Aug 1, 2016
1 parent 48594d9 commit 98382c5
Showing 3 changed files with 66 additions and 9 deletions.
25 changes: 25 additions & 0 deletions Introduction/README.md
@@ -0,0 +1,25 @@
## Introduction

### Learning Goals

- Understand the Reinforcement Learning problem and how it differs from Supervised Learning
- Understand what MDPs (Markov Decision Processes) are and how to interpret transition diagrams
- Understand Value Functions, Action-Value Functions, and Policy Functions
- Understand the Bellman Equations and Bellman Optimality Equations for value functions and action-value functions

### Lectures & Readings

**Required:**

- David Silver's RL Course Lecture 1 - Introduction to Reinforcement Learning ([video](https://www.youtube.com/watch?v=2pWv7GOvuf0), [slides](http://www0.cs.ucl.ac.uk/staff/d.silver/web/Teaching_files/intro_RL.pdf))
- David Silver's RL Course Lecture 2 - Markov Decision Processes ([video](https://www.youtube.com/watch?v=lfHX2hHRMVQ), [slides](http://www0.cs.ucl.ac.uk/staff/d.silver/web/Teaching_files/MDP.pdf))

**Optional:**

- [Reinforcement Learning: An Introduction](https://www.dropbox.com/s/b3psxv2r0ccmf80/book2015oct.pdf) - Chapter 1: The Reinforcement Learning Problem
- [Reinforcement Learning: An Introduction](https://www.dropbox.com/s/b3psxv2r0ccmf80/book2015oct.pdf) - Chapter 3: Finite Markov Decision Processes


### Exercises

TODO
23 changes: 23 additions & 0 deletions MDP/README.md
@@ -0,0 +1,23 @@
## MDPs and Bellman Equations

### Learning Goals

- Understand what MDPs (Markov Decision Processes) are and how to interpret transition diagrams
- Understand Value Functions, Action-Value Functions, and Policy Functions
- Understand the Bellman Equations and Bellman Optimality Equations for value functions and action-value functions
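
To make the Bellman expectation equation in the goals above concrete, here is a minimal sketch of iterative policy evaluation in Python 3. It is not part of the course material: the two-state MDP, the names `P`, `GAMMA`, `q_from_v`, and `policy_evaluation`, and all rewards and probabilities are invented purely for illustration.

```python
# Hypothetical two-state MDP, invented for illustration only.
# P[s][a] is a list of (probability, next_state, reward) tuples.
P = {
    0: {"stay": [(1.0, 0, 0.0)], "go": [(0.8, 1, 1.0), (0.2, 0, 0.0)]},
    1: {"stay": [(1.0, 1, 2.0)], "go": [(1.0, 0, 0.0)]},
}
GAMMA = 0.9  # discount factor (assumed value)


def q_from_v(v, s, a):
    """Action value: expected reward plus discounted value of the next state."""
    return sum(p * (r + GAMMA * v[s2]) for p, s2, r in P[s][a])


def policy_evaluation(policy, tol=1e-10):
    """Repeat the Bellman expectation backup until the values stop changing.

    policy[s][a] is the probability of taking action a in state s.
    """
    v = {s: 0.0 for s in P}
    while True:
        delta = 0.0
        for s in P:
            # v(s) = sum_a pi(a|s) * q(s, a)
            new_v = sum(policy[s][a] * q_from_v(v, s, a) for a in P[s])
            delta = max(delta, abs(new_v - v[s]))
            v[s] = new_v
        if delta < tol:
            return v


# Evaluate the uniform random policy on the toy MDP.
uniform = {s: {a: 0.5 for a in P[s]} for s in P}
v_uniform = policy_evaluation(uniform)
```

At convergence, `v_uniform` satisfies the Bellman expectation equation for the uniform policy up to the chosen tolerance. Replacing the policy-weighted sum with a `max` over actions would give the corresponding Bellman optimality backup.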


### Lectures & Readings

**Required:**

- David Silver's RL Course Lecture 2 - Markov Decision Processes ([video](https://www.youtube.com/watch?v=lfHX2hHRMVQ), [slides](http://www0.cs.ucl.ac.uk/staff/d.silver/web/Teaching_files/MDP.pdf))

**Optional:**

- [Reinforcement Learning: An Introduction](https://www.dropbox.com/s/b3psxv2r0ccmf80/book2015oct.pdf) - Chapter 3: Finite Markov Decision Processes


### Exercises

TODO
27 changes: 18 additions & 9 deletions README.md
@@ -1,31 +1,40 @@
#### Overview
#### Overview and Structure

The goal for this repository is to become a comprehensive tutorial of Reinforcement Learning techniques. The focus is on practical applications and code examples. This does not mean that theory will be ignored completely, just that there will be fewer formal proofs than you may find a typical university course.
The goal for this repository is to become a comprehensive tutorial of Reinforcement Learning techniques. The focus is on practical applications and code examples. This does not mean that theory is completely ignored, just that there will be fewer formal proofs and more code examples than you may find in a typical university course.

All code is written in Python 3 and the RL environments are taken from [OpenAI Gym](https://gym.openai.com/). Advanced techniques use [TensorFlow](https://www.tensorflow.org/) for neural network implementations.
Whenever possible, this tutorial references outside learning materials to introduce new concepts. These resources are usually from:

- [David Silver's Reinforcement Learning Course](http://www0.cs.ucl.ac.uk/staff/d.silver/web/Teaching.html)
- [Reinforcement Learning: An Introduction](https://www.dropbox.com/s/b3psxv2r0ccmf80/book2015oct.pdf)
- [Reinforcement Learning at Georgia Tech (CS 8803)](https://www.udacity.com/course/reinforcement-learning--ud600)
- Various research papers

All code is written in Python 3 and the RL environments are taken from [OpenAI Gym](https://gym.openai.com/). Advanced techniques use [TensorFlow](https://www.tensorflow.org/) for neural network implementations.


#### Contents

- Introduction to MDPs, RL problems and OpenAI Gym
- Model-Based Reinforcement Learning, Bellman Equation, and exact solutions
- [Introduction to RL problems, OpenAI Gym](Introduction/)
- [MDPs and Bellman Equations](MDP/)
- Model-Based RL: Policy and Value Iteration using Dynamic Programming
- Model-Free Prediction & Control (MC, TD)
- Model-Free Prediction & Control with Function Approximation
- Deep Q Learning
- Policy Gradient Methods
- Asynchronous RL Methods (A3C)
- Policy Gradient Methods with Function Approximation
- Asynchronous Policy Gradient Methods (A3C)
- Learning and Planning



#### References

Classes / Projects:
Classes:

- [David Silver's Reinforcement Learning Course](http://www0.cs.ucl.ac.uk/staff/d.silver/web/Teaching.html)
- [Reinforcement Learning: An Introduction](https://webdocs.cs.ualberta.ca/~sutton/book/the-book.html)
- [Reinforcement Learning at Georgia Tech (CS 8803)](https://www.udacity.com/course/reinforcement-learning--ud600)

Projects:

- [carpedm20/deep-rl-tensorflow](https://github.com/carpedm20/deep-rl-tensorflow)

Papers