A clean implementation of MuZero and AlphaZero following the AlphaZero General framework. Train and pit both algorithms against each other, and investigate the reliability of learned MuZero MDP models.

Jupyter Notebook 156 25 Updated Mar 28, 2021

artemyk / ibsgd

Jupyter Notebook 147 50 Updated Apr 20, 2020

BY571 / DQN-Atari-Agents

DQN-Atari-Agents: Modularized & parallel PyTorch implementation of several DQN agents, including DDQN, Dueling DQN, Noisy DQN, C51, Rainbow, and DRQN

Jupyter Notebook 120 14 Updated Dec 18, 2020

alexalemi / vib_demo

Jupyter Notebook 87 22 Updated Jan 25, 2022

Shmuma / rl

RL experiments

Jupyter Notebook 69 34 Updated Nov 21, 2022

younggyoseo / RE3

RE3: State Entropy Maximization with Random Encoders for Efficient Exploration

Jupyter Notebook 67 8 Updated Jul 29, 2021

james-simon / eigenlearning

Codebase for "A Theory of the Inductive Bias and Generalization of Kernel Regression and Wide Neural Networks"

Jupyter Notebook 49 8 Updated May 2, 2023

openai / ppo-ewma

Code for the paper "Batch size invariance for policy optimization"

Jupyter Notebook 46 16 Updated Apr 2, 2023

jetnew / SlimeRL

Code repository for the research project "You Play Ball, I Play Ball: Bayesian Multi-Agent Reinforcement Learning for Slime Volleyball", which won 1st Prize at the 17th STePS.

Jupyter Notebook 16 4 Updated Nov 15, 2020

marcbrittain / Prioritized-Sequence-Experience-Replay

Forked from google/dopamine

Prioritized Sequence Experience Replay

Jupyter Notebook 10 3 Updated Aug 16, 2021

AdrienCourtois / OptimalRepresentationRL

A PyTorch implementation of the paper "A Geometric Perspective on Optimal Representations for Reinforcement Learning" by Bellemare et al.

Jupyter Notebook 8 1 Updated Jan 16, 2020

DRL-CASIA / Collected-Reinforcement-Learning

Forked from dennybritz/reinforcement-learning

Implementation of Reinforcement Learning Algorithms. Python, OpenAI Gym, Tensorflow. Exercises and Solutions to accompany Sutton's Book and David Silver's course.

Jupyter Notebook 5 3 Updated Sep 9, 2020

LARS12llt

Stars

google-research / google-research

yandexdataschool / Practical_RL

suragnair / alpha-zero-general

google / brax

jmtomczak / intro_dgm

rll / deepul

ikostrikov / jaxrl

qiguming / MLAPP_CN_CODE

denisyarats / pytorch_sac

MishaLaskin / rad

rajatvd / NTK

kaesve / muzero

artemyk / ibsgd

BY571 / DQN-Atari-Agents

alexalemi / vib_demo

Shmuma / rl

younggyoseo / RE3

james-simon / eigenlearning

openai / ppo-ewma

jetnew / SlimeRL

marcbrittain / Prioritized-Sequence-Experience-Replay

AdrienCourtois / OptimalRepresentationRL

DRL-CASIA / Collected-Reinforcement-Learning

rlgammazero / mvarl_hands_on

davidbrandfonbrener / geometric_insights_into_TD_learning

kradongit / cs182_final_project

dongsubkim / CS182_HW4

ADMoreau / curl-att

joonleesky / tensorflow-tutorial