Intro and MDP chapters

RoySRobinson · Aug 1, 2016 · 98382c5
1 parent 48594d9
commit 98382c5
Showing 3 changed files with 66 additions and 9 deletions.
diff --git a/Introduction/README.md b/Introduction/README.md
@@ -0,0 +1,25 @@
+## Introduction
+
+### Learning Goals
+
+- Understand the Reinforcement Learning problem and how it differs from Supervised Learning
+- Understand what MDPs (Markov Decision Processes) are and how to interpret transition diagrams
+- Understand Value Functions, Action-Value Functions, and Policy Functions
+- Understand the Bellman Equations and Bellman Optimality Equations for value functions and action-value functions
+
+### Lectures & Readings
+
+**Required:**
+
+- David Silver's RL Course Lecture 1 - Introduction to Reinforcement Learning ([video](https://www.youtube.com/watch?v=2pWv7GOvuf0), [slides](http://www0.cs.ucl.ac.uk/staff/d.silver/web/Teaching_files/intro_RL.pdf))
+- David Silver's RL Course Lecture 2 - Markov Decision Processes ([video](https://www.youtube.com/watch?v=lfHX2hHRMVQ), [slides](http://www0.cs.ucl.ac.uk/staff/d.silver/web/Teaching_files/MDP.pdf))
+
+**Optional:**
+
+- [Reinforcement Learning: An Introduction](https://www.dropbox.com/s/b3psxv2r0ccmf80/book2015oct.pdf) - Chapter 1: The Reinforcement Learning Problem
+- [Reinforcement Learning: An Introduction](https://www.dropbox.com/s/b3psxv2r0ccmf80/book2015oct.pdf) - Chapter 3: Finite Markov Decision Processes
+
+
+### Exercises
+
+TODO
diff --git a/MDP/README.md b/MDP/README.md
@@ -0,0 +1,23 @@
+## MDPs and Bellman Equations
+
+### Learning Goals
+
+- Understand what MDPs (Markov Decision Processes) are and how to interpret transition diagrams
+- Understand Value Functions, Action-Value Functions, and Policy Functions
+- Understand the Bellman Equations and Bellman Optimality Equations for value functions and action-value functions
+
+
+### Lectures & Readings
+
+**Required:**
+
+- David Silver's RL Course Lecture 2 - Markov Decision Processes ([video](https://www.youtube.com/watch?v=lfHX2hHRMVQ), [slides](http://www0.cs.ucl.ac.uk/staff/d.silver/web/Teaching_files/MDP.pdf))
+
+**Optional:**
+
+- [Reinforcement Learning: An Introduction](https://www.dropbox.com/s/b3psxv2r0ccmf80/book2015oct.pdf) - Chapter 3: Finite Markov Decision Processes
+
+
+### Exercises
+
+TODO
diff --git a/README.md b/README.md
@@ -1,31 +1,40 @@
-#### Overview
+#### Overview and Structure
 
-The goal for this repository is to become a comprehensive tutorial of Reinforcement Learning techniques. The focus is on practical applications and code examples. This does not mean that theory will be ignored completely, just that there will be fewer formal proofs than you may find a typical university course.
+The goal for this repository is to become a comprehensive tutorial of Reinforcement Learning techniques. The focus is on practical applications and code examples. This does not mean that theory is completely ignored, just that there will be fewer formal proofs and more code examples than you may find in a typical university course.
 
-All code is written in Python 3 and the RL environments are taken from [OpenAI Gym](https://gym.openai.com/). Advanced techniques use [Tensorflow](tensorflow.org/) for neural network implementations.
+Whenever possible, this tutorial references outside learning materials to introduce new concepts. These resources are usually from:
 
+- [David Silver's Reinforcement Learning Course](http://www0.cs.ucl.ac.uk/staff/d.silver/web/Teaching.html)
+- [Reinforcement Learning: An Introduction](https://www.dropbox.com/s/b3psxv2r0ccmf80/book2015oct.pdf)
+- [Reinforcement Learning at Georgia Tech (CS 8803)](https://www.udacity.com/course/reinforcement-learning--ud600)
+- Various research papers
+
+All code is written in Python 3 and the RL environments are taken from [OpenAI Gym](https://gym.openai.com/). Advanced techniques use [TensorFlow](https://www.tensorflow.org/) for neural network implementations.
 
 
 #### Contents
 
-- Introduction to MDPs, RL problems and OpenAI gym
-- Model-Based Reinforcement Learning, Bellman Equation, and exact solutions
+- [Introduction to RL problems and OpenAI Gym](Introduction/)
+- [MDPs and Bellman Equations](MDP/)
+- Model-Based RL: Policy and Value Iteration using Dynamic Programming
 - Model-Free Prediction & Control (MC, TD)
 - Model-Free Prediction & Control with Function Approximation
 - Deep Q Learning
 - Policy Gradient Methods
-- Asynchronous RL Methods (A3C)
+- Policy Gradient Methods with Function Approximation
+- Asynchronous Policy Gradient Methods (A3C)
 - Learning and Planning
 
-
-
 #### References
 
-Classes / Projects:
+Classes:
 
 - [David Silver's Reinforcement Learning Course](http://www0.cs.ucl.ac.uk/staff/d.silver/web/Teaching.html)
 - [Reinforcement Learning: An Introduction](https://webdocs.cs.ualberta.ca/~sutton/book/the-book.html)
 - [Reinforcement Learning at Georgia Tech (CS 8803)](https://www.udacity.com/course/reinforcement-learning--ud600)
+
+Projects:
+
 - [carpedm20/deep-rl-tensorflow](https://github.com/carpedm20/deep-rl-tensorflow)
 
 Papers