This repository has been archived by the owner on Nov 6, 2023. It is now read-only.
AntoineTheb/RNN-RL

Recurrent Reinforcement Learning in PyTorch

Experiments with reinforcement learning and recurrent neural networks

Disclaimer: My code is very much based on Scott Fujimoto's TD3 implementation. TODO: Cite properly.

Motivations

This repo serves as an exercise for myself to properly understand what goes into using RNNs with Deep Reinforcement Learning.

Kapturowski et al. 2019 [1] provide insight into how RL algorithms might use memory while training. For on-policy algorithms such as PPO, it makes sense to train on whole trajectories and discard the RNN's memory. However, could the hidden state at each timestep be kept, and each timestep used as an independent "batch" item?
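To make the question concrete, here is a minimal sketch in plain Python. The scalar cell `rnn_step`, its weights, and the helper `unroll` are hypothetical stand-ins for a real recurrent layer, not code from this repo. It compares unrolling a whole trajectory from a zero state with feeding each timestep on its own, in shuffled order, using the hidden state stored for it during the rollout.

```python
import math
import random

def rnn_step(x, h, w_x=0.5, w_h=0.9, b=0.0):
    """One step of a toy scalar Elman-style cell: h' = tanh(w_x*x + w_h*h + b)."""
    return math.tanh(w_x * x + w_h * h + b)

def unroll(xs, h0=0.0):
    """Run the cell over a whole trajectory.

    Returns the hidden state that entered each step and the state each step produced.
    """
    h_in, h_out, h = [], [], h0
    for x in xs:
        h_in.append(h)
        h = rnn_step(x, h)
        h_out.append(h)
    return h_in, h_out

random.seed(0)
trajectory = [random.uniform(-1.0, 1.0) for _ in range(8)]

# Option 1: process the whole trajectory in order, starting from a zero state.
stored_h_in, full_outputs = unroll(trajectory)

# Option 2: treat every timestep as an independent item, visited in any order,
# using the hidden state that was stored for it during the rollout.
order = list(range(len(trajectory)))
random.shuffle(order)
per_step_outputs = [None] * len(trajectory)
for t in order:
    per_step_outputs[t] = rnn_step(trajectory[t], stored_h_in[t])

# As long as the weights have not changed, both options give the same values.
matches = all(math.isclose(a, b) for a, b in zip(full_outputs, per_step_outputs))
```

Note that this only shows that the forward pass agrees. If the stored hidden states are treated as constants, gradients would not flow back through earlier timesteps, and the equality stops holding once the weights are updated.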

For off-policy algorithms, such as DDPG, things get a bit more complicated. The naive option of training on whole trajectories is not computationally desirable, especially if enforcing a specific trajectory length is not an option. Another option would be to train on timesteps without using the RNN's memory. However, this implies losing the advantages associated with using RNNs.
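The two sampling options can be sketched with a small replay memory. This is an illustration only: `Transition`, `EpisodicReplayBuffer`, and its method names are hypothetical and are not taken from this repo.

```python
import random
from collections import namedtuple

# Hypothetical transition record used only for this sketch.
Transition = namedtuple("Transition", "obs action reward next_obs done")

class EpisodicReplayBuffer:
    def __init__(self, seed=0):
        self.episodes = []   # completed episodes, each a list of transitions
        self._current = []   # episode currently being collected
        self._rng = random.Random(seed)

    def add(self, transition):
        self._current.append(transition)
        if transition.done:
            self.episodes.append(self._current)
            self._current = []

    def sample_trajectories(self, batch_size):
        """Option 1: whole trajectories; lengths may differ between items."""
        return [self._rng.choice(self.episodes) for _ in range(batch_size)]

    def sample_timesteps(self, batch_size):
        """Option 2: isolated transitions; the temporal context is lost."""
        flat = [t for episode in self.episodes for t in episode]
        return [self._rng.choice(flat) for _ in range(batch_size)]

buffer = EpisodicReplayBuffer()
for length in (3, 7, 5):  # three episodes of different lengths
    for t in range(length):
        buffer.add(Transition(obs=t, action=0, reward=1.0,
                              next_obs=t + 1, done=(t == length - 1)))

trajectories = buffer.sample_trajectories(batch_size=4)
timesteps = buffer.sample_timesteps(batch_size=4)
```

A batch from `sample_trajectories` is ragged, so it would need padding or packing before being fed to a recurrent layer, while a batch from `sample_timesteps` is uniform but gives the RNN nothing to remember.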

Another option would be to keep the hidden state of the RNN associated with each timestep. However, the hidden states will become "outdated" as the timesteps stay in the replay memory and the network learns a new internal representation. [1] also suggests allowing the network a "burn-in" period by saving n timesteps and letting the network form its own hidden state before training on the timestep.
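The staleness problem and the effect of a burn-in period can be illustrated with a toy example. Everything here is an assumption made for the sketch: the scalar cell, the two weight pairs `old_w` and `new_w`, the input range, and the helper `burned_in_state` are hypothetical and chosen only so the effect is easy to see.

```python
import math
import random

def rnn_step(x, h, w):
    """Toy scalar Elman-style cell; `w` is a (w_x, w_h) pair."""
    return math.tanh(w[0] * x + w[1] * h)

def unroll(xs, h, w):
    """Return the hidden state entering each step, plus the final state."""
    h_in = []
    for x in xs:
        h_in.append(h)
        h = rnn_step(x, h, w)
    return h_in, h

random.seed(0)
xs = [random.uniform(0.5, 1.0) for _ in range(40)]

old_w = (1.5, 0.5)  # weights when the data was collected (arbitrary toy values)
new_w = (0.2, 0.5)  # weights after some training updates (arbitrary toy values)

stored_h, _ = unroll(xs, 0.0, old_w)  # hidden states kept in the replay memory
fresh_h, _ = unroll(xs, 0.0, new_w)   # what the current network would compute

def burned_in_state(t, burn_in):
    """Start from the stored state `burn_in` steps before `t` and replay those
    steps with the current weights, without training on them."""
    start = max(0, t - burn_in)
    _, h = unroll(xs[start:t], stored_h[start], new_w)
    return h

t = 30
stale_error = abs(stored_h[t] - fresh_h[t])
burn_in_error = abs(burned_in_state(t, burn_in=10) - fresh_h[t])
```

In this toy setting the burned-in state ends up much closer to the state the current weights would produce than the stale stored state does. How many burn-in steps a real network needs is an empirical question that this sketch does not answer.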

Implementations

  • TD3
  • DDPG
  • PPO (WIP)

Requirements
