Human-Level Control through Deep Reinforcement Learning

TensorFlow implementation of Human-Level Control through Deep Reinforcement Learning.

[Figure: model]

This implementation contains:

  1. Deep Q-network and Q-learning
  2. Experience replay memory
    • to reduce the correlations between consecutive updates
  3. The network used for Q-learning targets is held fixed for intervals
    • to reduce the correlations between target and predicted Q-values
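The three components above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the code used in this repository: the names `ReplayMemory`, `q_learning_target`, and `TargetSync` are hypothetical, the network parameters are stood in for by a plain dict, and `gamma=0.99` is only a placeholder default.

```python
import random
from collections import deque


class ReplayMemory:
    """Fixed-size buffer of (state, action, reward, next_state, done) tuples.

    Hypothetical helper for illustration; not taken from this repository.
    """

    def __init__(self, capacity):
        # A bounded deque drops the oldest transition once capacity is reached.
        self.buffer = deque(maxlen=capacity)

    def add(self, state, action, reward, next_state, done):
        self.buffer.append((state, action, reward, next_state, done))

    def sample(self, batch_size):
        # Uniform sampling without replacement breaks up the ordering of
        # consecutive transitions, reducing correlation between updates.
        return random.sample(list(self.buffer), batch_size)

    def __len__(self):
        return len(self.buffer)


def q_learning_target(reward, next_q_values, done, gamma=0.99):
    """One-step Q-learning target.

    Returns r for terminal transitions, otherwise
    r + gamma * max over a' of Q_target(s', a').
    `next_q_values` is assumed to come from the fixed target network.
    """
    if done:
        return reward
    return reward + gamma * max(next_q_values)


class TargetSync:
    """Copies online parameters into the target copy every `interval` steps.

    Parameters are modelled as a dict here purely for illustration.
    """

    def __init__(self, online_params, interval):
        self.online = online_params
        self.target = dict(online_params)  # frozen snapshot
        self.interval = interval
        self.step = 0

    def tick(self):
        # Between syncs the target stays fixed while the online copy changes.
        self.step += 1
        if self.step % self.interval == 0:
            self.target = dict(self.online)
```

In a training loop built on this sketch, each environment step would call `add`, each update would draw a batch with `sample` and compute targets with `q_learning_target`, and `tick` would be called once per update so the target snapshot only changes at the chosen interval.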

Requirements

Usage

First, install prerequisites with:

$ pip install tqdm gym[all]

To train a model for Breakout, with or without on-screen display:

$ python main.py --env_name=Breakout-v0 --is_train=True
$ python main.py --env_name=Breakout-v0 --is_train=True --display=True

To test and record the screen with gym, with or without on-screen display:

$ python main.py --is_train=False
$ python main.py --is_train=False --display=True

Results

Result of training for 24 hours on a GTX 980 Ti.

[Figure: best]

Training details

Details of Breakout with model m2 (red), trained for 18 hours on a GTX 980 Ti.

(The label "episode/min reward" is a typo; it should read "episode/average reward".)

  1. Statistics of loss, Q-values, rewards and # of games / episode (TensorBoard)
  2. Histogram of rewards / episode (TensorBoard)

Details of Breakout with models m1 (green), m2 (purple), m3 (blue) and m4 (red), trained for 15 hours on a GTX 980 Ti.

  1. Statistics of loss, Q-values, rewards and # of games / episode (TensorBoard)
  2. Histogram of rewards / episode (TensorBoard)

References

License

MIT License.