Update README.md
sudharsan13296 authored Nov 29, 2018
1 parent ac79f29 commit ce43f40
The book starts with an introduction to Reinforcement Learning, followed by OpenAI Gym and TensorFlow.

### [1. Introduction to Reinforcement Learning](https://github.com/sudharsan13296/Hands-On-Reinforcement-Learning-With-Python/tree/master/01.%20Introduction%20to%20Reinforcement%20Learning)

* [1.1. What is Reinforcement Learning?](https://github.com/sudharsan13296/Hands-On-Reinforcement-Learning-With-Python/blob/master/01.%20Introduction%20to%20Reinforcement%20Learning/1.1%20What%20is%20Reinforcement%20Learning.ipynb)
* 1.2. Reinforcement Learning Cycle
* 1.3. How RL Differs from Other ML Paradigms
* 1.4. Elements of Reinforcement Learning
* 1.5. Agent Environment Interface
* 1.6. Types of RL Environments
* 1.7. Reinforcement Learning Platforms
* 1.8. Applications of Reinforcement Learning



### [2. Getting Started with OpenAI and Tensorflow](https://github.com/sudharsan13296/Hands-On-Reinforcement-Learning-With-Python/tree/master/02.%20Getting%20Started%20with%20OpenAI%20and%20Tensorflow)

* 2.1. Setting Up Your Machine
* 2.2. Installing Anaconda
* 2.3. Installing Docker
* 2.4. Installing OpenAI Gym and Universe
* 2.5. Common Error Fixes
* 2.6. OpenAI Gym
* [2.7. Basic Simulations](https://github.com/sudharsan13296/Hands-On-Reinforcement-Learning-With-Python/blob/master/02.%20Getting%20Started%20with%20OpenAI%20and%20Tensorflow/2.07%20Basic%20Simulations.ipynb) (see the sketch below)
* [2.8. Training a Robot to Walk](https://github.com/sudharsan13296/Hands-On-Reinforcement-Learning-With-Python/blob/master/02.%20Getting%20Started%20with%20OpenAI%20and%20Tensorflow/2.08%20Training%20an%20Robot%20to%20Walk.ipynb)
* [2.9. Building a Video Game Bot](https://github.com/sudharsan13296/Hands-On-Reinforcement-Learning-With-Python/blob/master/02.%20Getting%20Started%20with%20OpenAI%20and%20Tensorflow/2.09%20Building%20a%20Video%20Game%20Bot%20.ipynb)
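
A minimal sketch of the simulation loop behind 2.7, assuming the classic `gym` API in which `step()` returns four values (observation, reward, done, info):

```python
import gym

# Run a few episodes of CartPole with a purely random agent.
env = gym.make('CartPole-v0')

for episode in range(10):
    state = env.reset()
    total_reward, done = 0.0, False
    while not done:
        action = env.action_space.sample()             # sample a random action
        state, reward, done, info = env.step(action)   # advance the simulation
        total_reward += reward
    print('Episode {}: reward = {}'.format(episode, total_reward))

env.close()
```
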
### [3. Markov Decision Process and Dynamic Programming](https://github.com/sudharsan13296/Hands-On-Reinforcement-Learning-With-Python/tree/master/03.%20Markov%20Decision%20Process%20and%20Dynamic%20Programming)


* 3.1. Markov Chain and Markov Process
* 3.2. Markov Decision Process
* 3.3. Rewards and Returns
* 3.4. Episodic and Continuous Tasks
* 3.5. Policy Function
* 3.6. State Value Function
* 3.7. State-Action Value Function (Q Function)
* 3.8. Bellman Equation and Optimality
* 3.9. Deriving Bellman Equation for Value and Q functions
* 3.10. Solving the Bellman Equation
* 3.11. Dynamic Programming
* [3.12. Solving Frozen Lake Problem using Value Iteration](https://github.com/sudharsan13296/Hands-On-Reinforcement-Learning-With-Python/blob/master/03.%20Markov%20Decision%20Process%20and%20Dynamic%20Programming/3.12%20Value%20Iteration%20-%20Frozen%20Lake%20Problem.ipynb) (see the sketch below)
* [3.13. Solving Frozen Lake Problem using Policy Iteration](https://github.com/sudharsan13296/Hands-On-Reinforcement-Learning-With-Python/blob/master/03.%20Markov%20Decision%20Process%20and%20Dynamic%20Programming/3.13%20Policy%20Iteration%20-%20Frozen%20Lake%20Problem.ipynb)
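
A compact sketch of the value iteration used in 3.12, assuming the classic `gym` FrozenLake environment where `env.P[s][a]` lists `(prob, next_state, reward, done)` transitions:

```python
import gym
import numpy as np

env = gym.make('FrozenLake-v0')
gamma = 0.99                                   # discount factor
n_states, n_actions = env.observation_space.n, env.action_space.n
V = np.zeros(n_states)

for _ in range(1000):                          # sweep until the values converge
    V_old = V.copy()
    for s in range(n_states):
        # Bellman optimality backup: value of the best action from s
        V[s] = max(sum(p * (r + gamma * V_old[s2])
                       for p, s2, r, _ in env.P[s][a])
                   for a in range(n_actions))
    if np.max(np.abs(V - V_old)) < 1e-8:
        break

# extract the greedy policy from the converged value function
policy = [int(np.argmax([sum(p * (r + gamma * V[s2])
                             for p, s2, r, _ in env.P[s][a])
                         for a in range(n_actions)]))
          for s in range(n_states)]
```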


### [4. Gaming with Monte Carlo Methods](https://github.com/sudharsan13296/Hands-On-Reinforcement-Learning-With-Python/tree/master/04.%20Gaming%20with%20Monte%20Carlo%20Methods)

* 4.1. Monte Carlo Methods
* [4.2. Estimating Value of Pi Using Monte Carlo](https://github.com/sudharsan13296/Hands-On-Reinforcement-Learning-With-Python/blob/master/04.%20Gaming%20with%20Monte%20Carlo%20Methods/4.2%20Estimating%20Value%20of%20Pi%20using%20Monte%20Carlo.ipynb)
* 4.3. Monte Carlo Prediction
* 4.4. First visit Monte Carlo (see the sketch below)
* 4.5. Every visit Monte Carlo
* [4.6. BlackJack with Monte Carlo](https://github.com/sudharsan13296/Hands-On-Reinforcement-Learning-With-Python/blob/master/04.%20Gaming%20with%20Monte%20Carlo%20Methods/4.6%20BlackJack%20with%20First%20visit%20MC.ipynb)
* 4.7. Monte Carlo Control
* 4.8. Monte Carlo Exploration Starts
* 4.9. On Policy Monte Carlo Control
* 4.10. Off Policy Monte Carlo Control
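
A sketch of first-visit Monte Carlo prediction (4.3-4.4); `sample_episode` is a hypothetical helper that plays one episode under the policy being evaluated and returns its `(state, reward)` pairs:

```python
from collections import defaultdict

def mc_prediction(sample_episode, gamma=1.0, n_episodes=10000):
    """Estimate V(s) by averaging first-visit returns."""
    returns = defaultdict(list)
    V = defaultdict(float)
    for _ in range(n_episodes):
        episode = sample_episode()              # list of (state, reward)
        G = 0.0
        for t in reversed(range(len(episode))):
            state, reward = episode[t]
            G = reward + gamma * G              # return from step t onwards
            if state not in (s for s, _ in episode[:t]):   # first visit only
                returns[state].append(G)
                V[state] = sum(returns[state]) / len(returns[state])
    return V
```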


### [5. Temporal Difference Learning](https://github.com/sudharsan13296/Hands-On-Reinforcement-Learning-With-Python/tree/master/05.%20Temporal%20Difference%20Learning)


* 5.1. Temporal Difference Learning
* 5.2. TD Prediction
* 5.3. TD Control
* 5.4. Q Learning (see the sketch below)
* [5.5. Solving the Taxi Problem using Q learning](https://github.com/sudharsan13296/Hands-On-Reinforcement-Learning-With-Python/blob/master/5.%20Temporal%20Difference%20Learning/05.5%20Taxi%20Problem%20-%20Q%20Learning.ipynb)
* 5.6. SARSA
* [5.7. Solving the Taxi Problem using SARSA](https://github.com/sudharsan13296/Hands-On-Reinforcement-Learning-With-Python/blob/master/5.%20Temporal%20Difference%20Learning/05.7%20Taxi%20Problem%20-%20SARSA.ipynb)
* 5.8. Difference Between Q learning and SARSA
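
A tabular Q-learning sketch in the spirit of 5.4-5.5; the environment id and hyperparameters here are illustrative rather than the book's exact settings:

```python
import gym
import numpy as np

env = gym.make('Taxi-v2')                      # illustrative environment id
Q = np.zeros((env.observation_space.n, env.action_space.n))
alpha, gamma, epsilon = 0.1, 0.99, 0.1

for episode in range(5000):
    state = env.reset()
    done = False
    while not done:
        # epsilon-greedy: explore occasionally, otherwise act greedily
        if np.random.rand() < epsilon:
            action = env.action_space.sample()
        else:
            action = int(np.argmax(Q[state]))
        next_state, reward, done, _ = env.step(action)
        # Q-learning update: bootstrap from the best action in the next state
        Q[state, action] += alpha * (reward + gamma * np.max(Q[next_state])
                                     - Q[state, action])
        state = next_state
```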


### [6. Multi-Armed Bandit Problem](https://github.com/sudharsan13296/Hands-On-Reinforcement-Learning-With-Python/tree/master/06.%20Multi-Armed%20Bandit%20Problem)


* [6.1. Multi-armed Bandit Problem](https://github.com/sudharsan13296/Hands-On-Reinforcement-Learning-With-Python/blob/master/06.%20Multi-Armed%20Bandit%20Problem/6.1%20MAB%20-%20Various%20Exploration%20Strategies.ipynb)
* 6.2. Epsilon-Greedy Algorithm (see the sketch below)
* 6.3. Softmax Exploration Algorithm
* 6.4. Upper Confidence Bound Algorithm
* 6.5. Thompson Sampling Algorithm
* 6.6. Applications of MAB
* [6.7. Identifying Right Advertisement Banner Using MAB](https://github.com/sudharsan13296/Hands-On-Reinforcement-Learning-With-Python/blob/master/06.%20Multi-Armed%20Bandit%20Problem/6.7%20Identifying%20Right%20AD%20Banner%20Using%20MAB.ipynb)
* 6.8. Contextual Bandits
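
The epsilon-greedy strategy of 6.2 in miniature, run against a simulated 10-armed Gaussian bandit (the reward distributions are invented for illustration):

```python
import numpy as np

n_arms, epsilon, n_rounds = 10, 0.1, 10000
true_means = np.random.randn(n_arms)           # hidden quality of each arm
counts = np.zeros(n_arms)
Q = np.zeros(n_arms)                           # running mean reward per arm

for _ in range(n_rounds):
    if np.random.rand() < epsilon:
        arm = np.random.randint(n_arms)        # explore
    else:
        arm = int(np.argmax(Q))                # exploit
    reward = true_means[arm] + np.random.randn()
    counts[arm] += 1
    Q[arm] += (reward - Q[arm]) / counts[arm]  # incremental mean update

print('best arm:', np.argmax(true_means), '| most pulled:', np.argmax(counts))
```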


### [7. Deep Learning Fundamentals](https://github.com/sudharsan13296/Hands-On-Reinforcement-Learning-With-Python/tree/master/07.%20Deep%20Learning%20Fundamentals)

* 7.1. Artificial Neurons
* 7.2. Artificial Neural Network
* 7.3. Activation Functions
* 7.4. Deep Dive into ANN
* 7.5. Gradient Descent (see the sketch below)
* [7.6. Neural Networks in Tensorflow](https://github.com/sudharsan13296/Hands-On-Reinforcement-Learning-With-Python/blob/master/07.%20Deep%20Learning%20Fundamentals/7.6%20Neural%20Network%20Using%20Tensorflow.ipynb)
* 7.7. Recurrent Neural Network
* 7.8. Backpropagation Through Time
* 7.9. Long Short Term Memory RNN
* [7.10. Generating Song Lyrics using LSTM RNN](https://github.com/sudharsan13296/Hands-On-Reinforcement-Learning-With-Python/blob/master/07.%20Deep%20Learning%20Fundamentals/7.10%20Generating%20Song%20Lyrics%20Using%20LSTM%20RNN.ipynb)
* 7.11. Convolutional Neural Networks
* 7.12. CNN Architecture
* [7.13. Classifying Fashion Products Using CNN](https://github.com/sudharsan13296/Hands-On-Reinforcement-Learning-With-Python/blob/master/07.%20Deep%20Learning%20Fundamentals/7.13%20Classifying%20Fashion%20Products%20Using%20CNN.ipynb)
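
Gradient descent (7.5) stripped to its core: fitting a line to synthetic data by repeatedly stepping against the gradient of the mean squared error:

```python
import numpy as np

X = np.random.rand(100)
y = 3.0 * X + 2.0 + 0.1 * np.random.randn(100)   # noisy line: slope 3, intercept 2

w, b, lr = 0.0, 0.0, 0.1
for _ in range(1000):
    y_hat = w * X + b
    # gradients of the mean squared error with respect to w and b
    dw = 2 * np.mean((y_hat - y) * X)
    db = 2 * np.mean(y_hat - y)
    w -= lr * dw
    b -= lr * db

print(w, b)   # should approach 3.0 and 2.0
```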


### [8. Atari Games With Deep Q Network](https://github.com/sudharsan13296/Hands-On-Reinforcement-Learning-With-Python/tree/master/08.%20Atari%20Games%20with%20DQN)

* 8.1. What is a Deep Q Network?
* 8.2. Architecture of DQN
* 8.3. Convolutional Network
* 8.4. Experience Replay (see the sketch below)
* 8.5. Target Network
* 8.6. Clipping Rewards
* 8.7. DQN Algorithm
* [8.8. Building an Agent to Play Atari Games](https://github.com/sudharsan13296/Hands-On-Reinforcement-Learning-With-Python/blob/master/08.%20Atari%20Games%20with%20DQN/8.8%20Building%20an%20Agent%20to%20Play%20Atari%20Games.ipynb)
* 8.9. Double DQN
* 8.10. Dueling Architecture
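
Experience replay (8.4) in miniature, as a sketch rather than the book's exact class: transitions accumulate in a bounded buffer, and training draws random minibatches from it, which breaks the correlation between consecutive frames.

```python
import random
from collections import deque

class ReplayBuffer:
    def __init__(self, capacity=100000):
        self.buffer = deque(maxlen=capacity)   # old transitions fall off the end

    def push(self, state, action, reward, next_state, done):
        self.buffer.append((state, action, reward, next_state, done))

    def sample(self, batch_size):
        batch = random.sample(self.buffer, batch_size)
        states, actions, rewards, next_states, dones = zip(*batch)
        return states, actions, rewards, next_states, dones

    def __len__(self):
        return len(self.buffer)
```

The bootstrap targets for those minibatches come from the separate target network of 8.5, which is synced with the online network only every few thousand steps.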


### [9. Playing Doom With Deep Recurrent Q Network ](https://github.com/sudharsan13296/Hands-On-Reinforcement-Learning-With-Python/tree/master/09.%20Playing%20Doom%20Game%20using%20DRQN)

* 9.1. Deep Recurrent Q Network
* 9.2. Partially Observable MDP
* 9.3. Architecture of DRQN (see the sketch below)
* [9.4. Basic Doom Game](https://github.com/sudharsan13296/Hands-On-Reinforcement-Learning-With-Python/blob/master/09.%20Playing%20Doom%20Game%20using%20DRQN/9.4%20Basic%20Doom%20Game.ipynb)
* [9.5. Build an Agent to Play Doom Game using DRQN](https://github.com/sudharsan13296/Hands-On-Reinforcement-Learning-With-Python/blob/master/09.%20Playing%20Doom%20Game%20using%20DRQN/9.5%20Doom%20Game%20Using%20DRQN.ipynb)
* 9.6. Deep Attention Recurrent Q Network
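
The architectural change of 9.3, sketched with `tf.keras` (the layer sizes and `n_actions` are illustrative): per-frame convolutional features feed an LSTM, so the Q-values can depend on history under partial observability.

```python
import tensorflow as tf

n_actions = 4   # illustrative; depends on the chosen Doom action set

# DRQN swaps DQN's first fully connected layer for an LSTM over time.
model = tf.keras.Sequential([
    tf.keras.layers.TimeDistributed(
        tf.keras.layers.Conv2D(32, 8, strides=4, activation='relu'),
        input_shape=(None, 84, 84, 1)),         # (time, height, width, channels)
    tf.keras.layers.TimeDistributed(tf.keras.layers.Flatten()),
    tf.keras.layers.LSTM(256),                  # integrates across the frame sequence
    tf.keras.layers.Dense(n_actions)            # one Q-value per action
])
```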


### [10. Asynchronous Advantage Actor Critic Network ](https://github.com/sudharsan13296/Hands-On-Reinforcement-Learning-With-Python/tree/master/10.%20Aysnchronous%20Advantage%20Actor%20Critic%20Network)

* 10.1. Asynchronous Actor Critic Algorithm
* 10.2. The three A's
* 10.3. Architecture of A3C
* 10.4. Working of A3C (see the sketch below)
* [10.5. Drive up the Mountain with A3C](https://github.com/sudharsan13296/Hands-On-Reinforcement-Learning-With-Python/blob/master/10.%20Aysnchronous%20Advantage%20Actor%20Critic%20Network/10.5%20Drive%20up%20the%20Mountain%20Using%20A3C.ipynb)
* 10.6. Visualization in TensorBoard
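
The quantity behind A3C's middle "A" (10.2), sketched: n-step discounted returns bootstrapped from the critic, minus the critic's own value estimates:

```python
import numpy as np

def advantages(rewards, values, bootstrap_value, gamma=0.99):
    """n-step advantage estimates for one rollout.

    rewards, values: per-step rewards and critic estimates V(s_t);
    bootstrap_value: critic's estimate for the state after the last step.
    """
    R = bootstrap_value
    adv = np.zeros(len(rewards))
    for t in reversed(range(len(rewards))):
        R = rewards[t] + gamma * R     # discounted return from step t
        adv[t] = R - values[t]         # how much better than the critic expected
    return adv
```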



### [11. Policy Gradients and Optimization](https://github.com/sudharsan13296/Hands-On-Reinforcement-Learning-With-Python/tree/master/11.%20Policy%20Gradients%20and%20Optimization)

* 11.1. Policy Gradient (see the sketch below)
* [11.2. Lunar Lander Using Policy Gradient](https://github.com/sudharsan13296/Hands-On-Reinforcement-Learning-With-Python/blob/master/11.%20Policy%20Gradients%20and%20Optimization/11.2%20Lunar%20Lander%20Using%20Policy%20Gradients.ipynb)
* 11.3. Deep Deterministic Policy Gradient
* [11.4. Swinging up the Pendulum using DDPG](https://github.com/sudharsan13296/Hands-On-Reinforcement-Learning-With-Python/blob/master/11.%20Policy%20Gradients%20and%20Optimization/11.3%20Swinging%20Up%20the%20Pendulum%20Using%20DDPG.ipynb)
* 11.5. Trust Region Policy Optimization
* 11.6. Proximal Policy Optimization
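
The vanilla policy gradient of 11.1 for a linear softmax policy, sketched in NumPy; `theta` maps state features to action preferences, and each step's log-probability gradient is weighted by the return that followed it:

```python
import numpy as np

def softmax(z):
    z = z - z.max()                    # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum()

def reinforce_update(theta, episode, lr=0.01, gamma=0.99):
    """theta: (n_actions, n_features) weights; episode: list of (s, a, r)."""
    G = 0.0
    for s, a, r in reversed(episode):
        G = r + gamma * G              # return that followed this step
        pi = softmax(theta @ s)
        grad_log = -np.outer(pi, s)    # d log pi(a|s) / d theta ...
        grad_log[a] += s               # ... equals (one_hot(a) - pi) outer s
        theta += lr * G * grad_log     # step in the direction of higher return
    return theta
```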

### [12. Capstone Project: Car Racing using DQN](https://github.com/sudharsan13296/Hands-On-Reinforcement-Learning-With-Python/tree/master/12.%20Capstone%20Project:%20Car%20Racing%20using%20DQN)


### [13. Recent Advancements and Next Steps](https://github.com/sudharsan13296/Hands-On-Reinforcement-Learning-With-Python/tree/master/13.%20Recent%20Advancements%20and%20Next%20Steps)

* 13.1. Imagination Augmented Agents
* 13.2. Learning From Human Preference
* [13.3. Deep Q Learning From Demonstrations](https://github.com/sudharsan13296/Hands-On-Reinforcement-Learning-With-Python/blob/master/13.%20Recent%20Advancements%20and%20Next%20Steps/13.3%20Deep%20Q%20Learning%20From%20Demonstrations.ipynb)
* [13.4. Hindsight Experience Replay](https://github.com/sudharsan13296/Hands-On-Reinforcement-Learning-With-Python/blob/master/13.%20Recent%20Advancements%20and%20Next%20Steps/13.4%20Hindsight%20Experience%20Replay.ipynb)
* 13.5. Hierarchical Reinforcement Learning
* 13.6. Inverse Reinforcement Learning
