Scalable partially observable and/or non-Markovian gridworld for planning or reinforcement learning
Updated May 18, 2022 - Python
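A minimal sketch of what partial observability in such a gridworld can look like: the agent observes only a small window around its own position rather than the full grid. The class and parameter names below are illustrative assumptions, not this repository's API.

```python
import numpy as np

class PartialGridWorld:
    """Toy gridworld where the agent sees only a local window (hypothetical sketch)."""

    def __init__(self, size=8, view_radius=1, seed=0):
        rng = np.random.default_rng(seed)
        self.view_radius = view_radius
        # Random occupancy grid: 1 = wall, 0 = free cell.
        self.grid = (rng.random((size, size)) < 0.2).astype(np.int8)
        self.pos = (0, 0)

    def observe(self):
        # Pad the grid with walls so windows at the boundary stay square,
        # then cut out the (2r+1) x (2r+1) neighbourhood around the agent.
        # This local window, not the full grid, is all the agent ever sees.
        r = self.view_radius
        padded = np.pad(self.grid, r, constant_values=1)
        x, y = self.pos
        return padded[x:x + 2 * r + 1, y:y + 2 * r + 1]
```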
Codebase of the ηψ-Learning algorithm, which learns a non-Markovian maximum state entropy exploration policy by combining predecessor and successor representations to estimate the state-visitation distribution of a finite-length trajectory.
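The combination described above might be sketched as follows: exact counts over the already-visited prefix of the trajectory (the predecessor part) plus a successor-representation prediction of the remaining steps, which together estimate the full trajectory's visitation distribution whose entropy the policy maximizes. All names and the normalization scheme are assumptions for illustration, not the paper's code.

```python
import numpy as np

def visitation_estimate(past_states, successor_rep, current_state, horizon, t):
    """Estimate the state-visitation distribution of a length-`horizon`
    trajectory at step `t` (hypothetical sketch).

    past_states: state indices visited so far (the "predecessor" part)
    successor_rep: (n_states, n_states) matrix; row s approximates expected
        future occupancy starting from s (the "successor" part)
    """
    n_states = successor_rep.shape[0]
    # Predecessor term: exact counts over the realized prefix.
    counts = np.bincount(past_states, minlength=n_states).astype(float)
    # Successor term: predicted occupancy, rescaled to the remaining steps.
    future = successor_rep[current_state]
    future = future / future.sum() * (horizon - t)
    return (counts + future) / horizon

def entropy(p, eps=1e-12):
    # Objective the exploration policy maximizes over the estimate above.
    return -np.sum(p * np.log(p + eps))
```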
Solving the problem of non-Markovian reward functions by giving agents access to a finite amount of memory.
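One common way to realize this idea is to stack the last k observations, so that a reward depending on recent history becomes Markovian in the augmented state. The wrapper below is a hypothetical sketch assuming a gym-style environment with `reset()` and `step()`; it is not this repository's API.

```python
from collections import deque
import numpy as np

class FiniteMemoryWrapper:
    """Augment observations with the last k frames (hypothetical sketch)."""

    def __init__(self, env, k=4):
        self.env = env
        self.k = k
        self.buffer = deque(maxlen=k)  # finite memory: oldest frame falls off

    def reset(self):
        obs = self.env.reset()
        self.buffer.clear()
        for _ in range(self.k):        # pad the history with the first frame
            self.buffer.append(obs)
        return np.stack(self.buffer)

    def step(self, action):
        obs, reward, done, info = self.env.step(action)
        self.buffer.append(obs)        # remember only the last k observations
        return np.stack(self.buffer), reward, done, info
```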