...menustart
- CS294: Deep Reinforcement Learning, Spring 2017
- Week1 : Introduction
- Week2 : Supervised learning and decision making (Levine)
...menuend
http://rll.berkeley.edu/deeprlcourse
https://www.reddit.com/r/berkeleydeeprlcourse/
- Deep = can process complex sensory input
  - … and also compute really complex functions
- Reinforcement learning = can choose complex actions
- An agent interacts with a previously unknown environment, trying to maximize cumulative reward
- Formalized as a partially observable Markov decision process (POMDP); a minimal interaction loop is sketched below
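A minimal sketch of that agent-environment loop, assuming a hypothetical gym-style interface (`env`, `agent`, `reset`, `step`, and `act` are placeholders, not course code):

```python
# Roll out one episode in a POMDP-style environment and return the
# cumulative reward the agent is trying to maximize.
def run_episode(env, agent, max_steps=1000):
    obs = env.reset()          # the agent sees an observation, not the full state
    total_reward = 0.0
    for _ in range(max_steps):
        action = agent.act(obs)               # agent chooses a (possibly complex) action
        obs, reward, done = env.step(action)  # environment transitions stochastically
        total_reward += reward                # objective: maximize cumulative reward
        if done:
            break
    return total_reward
```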
Robotics:
- Observations: camera images, joint angles
- Actions: joint torques
- Rewards: stay balanced, navigate to target locations, serve and protect humans
Inventory Management (a toy version is sketched below):
- Observations: current inventory levels
- Actions: number of units of each item to purchase
- Rewards: profit
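As a concrete illustration of the inventory example, here is a toy environment in the same spirit; the Poisson demand model, price, and unit cost are all invented for the sketch:

```python
import numpy as np

# Toy inventory MDP: observation = current inventory levels, action = number
# of units of each item to purchase, reward = profit. All numbers are made up.
class InventoryEnv:
    def __init__(self, n_items=3, price=2.0, unit_cost=1.0, seed=0):
        self.rng = np.random.default_rng(seed)
        self.n_items = n_items
        self.price = price
        self.unit_cost = unit_cost
        self.inventory = np.zeros(n_items)

    def reset(self):
        self.inventory = np.zeros(self.n_items)
        return self.inventory.copy()

    def step(self, order):
        # order: units of each item to purchase (the action)
        self.inventory += order
        demand = self.rng.poisson(5, self.n_items)   # random customer demand
        sold = np.minimum(self.inventory, demand)
        self.inventory -= sold
        profit = self.price * sold.sum() - self.unit_cost * np.sum(order)
        return self.inventory.copy(), profit         # next observation, reward
```

For example, `InventoryEnv().step(np.array([5, 5, 5]))` purchases five units of each item and returns the new inventory levels together with that step's profit.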
- Reinforcement learning using neural networks to approximate functions (sketched below):
  - Policies (select the next action)
  - Value functions (measure goodness of states or state-action pairs)
  - Dynamics models (predict next states and rewards)
    - i.e., try to approximate how the system is going to evolve over time
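A sketch of those three function approximators as small neural networks (PyTorch is used here purely for illustration; the layer sizes and dimensions are arbitrary):

```python
import torch.nn as nn

obs_dim, act_dim = 8, 2   # arbitrary example dimensions

# Policy network: observation -> action (selects the next action).
policy = nn.Sequential(nn.Linear(obs_dim, 64), nn.Tanh(),
                       nn.Linear(64, act_dim))

# Value network: state -> scalar estimate of how good that state is.
value_fn = nn.Sequential(nn.Linear(obs_dim, 64), nn.Tanh(),
                         nn.Linear(64, 1))

# Dynamics model: (state, action) -> predicted next state and reward,
# i.e. an approximation of how the system evolves over time.
dynamics = nn.Sequential(nn.Linear(obs_dim + act_dim, 64), nn.Tanh(),
                         nn.Linear(64, obs_dim + 1))
```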
- Reinforcement learning:
  - Environment samples input x_t ~ P(x_t | x_{t-1}, y_{t-1})
    - Environment is stateful: the input depends on your previous actions!
  - Agent takes action ŷ_t = f(x_t)
  - Agent receives cost c_t ~ P(c_t | x_t, ŷ_t), where P is a probability distribution unknown to the agent (see the sketch below)
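The loop above in code, with stand-in dynamics and cost (the quadratic cost and the drifting dynamics are invented; the point is only that x_t and c_t are sampled from distributions the agent never sees):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=3)        # initial input x_1

def f(x):
    # The agent's policy: a fixed, arbitrary function of the current input.
    return -0.1 * x

total_cost = 0.0
for t in range(100):
    y = f(x)                                      # agent takes action y_t = f(x_t)
    c = float(x @ x) + rng.normal(scale=0.1)      # cost c_t ~ P(c_t | x_t, y_t)
    total_cost += c
    # Environment is stateful: the next input depends on x_t and y_t.
    x = x + y + rng.normal(scale=0.01, size=3)    # x_{t+1} ~ P(x_{t+1} | x_t, y_t)
```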
Notation (the course uses control-theory notation, so the action u_t plays the role of y_t above):
- x_t : state
- o_t : observation
- u_t : action
- π_θ(u_t | o_t) : policy
- c : cost function
- r : reward
- c = -r (a small sketch follows)
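A tiny sketch tying the notation together: a parameterized stochastic policy π_θ(u_t | o_t), where θ is just a weight matrix (a deliberately simple, illustrative choice), plus the cost/reward sign flip:

```python
import numpy as np

def policy(theta, o_t, rng):
    # Sample u_t ~ π_θ(u_t | o_t): a Gaussian around a linear function of o_t.
    mean = theta @ o_t
    return mean + 0.1 * rng.normal(size=mean.shape)

def cost(r):
    return -r   # c = -r: minimizing cost == maximizing reward
```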