Stars
Time-series data repository for the 2019 novel coronavirus (COVID-19) outbreak
Author's PyTorch implementation of BCQ for continuous and discrete actions
PyTorch implementation for OpenAI Gym's CartPole environment, trained on filtered batches that keep only episodes whose overall reward is above a specified percentile.
Policy gradient reinforcement learning algorithm with importance sampling
Code for experiments regarding importance sampling for training neural networks
Exploring algorithms in the domain of offline reinforcement learning (REM, Ensemble-DQN, DQN, ...)
Monte Carlo Tree Search for board games and deterministic OpenAI Gym environments