University College London
London, UK
Stars
verl: Volcano Engine Reinforcement Learning for LLMs
Official implementation of "Self-Improving Video Generation"
Aligning pretrained language models with instruction data generated by themselves.
An Easy-to-use, Scalable and High-performance RLHF Framework (70B+ PPO Full Tuning & Iterative DPO & LoRA & RingAttention & RFT)
A curated list of reinforcement learning from human feedback (RLHF) resources (continually updated)
Experiments with reinforcement learning and recurrent neural networks
Open-source codebase for EfficientZero, from "Mastering Atari Games with Limited Data" at NeurIPS 2021.
Single-player AlphaZero implementation
AlphaZero for single-player environments, implemented efficiently using Ray
Implementation of DreamerV3 in PyTorch.
A PyTorch implementation of DeepMind's AlphaZero agent to play Go and Gomoku board games
DouZero with ResNet and GPU support for Windows
Code for "Unsupervised Zero-Shot RL via Functional Reward Representations"
Flappy Bird as a Farama Gymnasium environment.
Awesome Game AI materials of Multi-Agent Reinforcement Learning
Plug-and-play hydra sweepers for the EA-based multifidelity method DEHB and several population-based training variations, all proven to efficiently tune RL hyperparameters.
Open-source simulator for autonomous driving research.
This code implements Prioritized Level Replay, a method for sampling training levels for reinforcement learning agents that exploits the fact that not all levels are equally useful for agents to le…
Code for our NeurIPS 2020 paper "Improving Generalization in Reinforcement Learning with Mixture Regularization"
allRank is a framework for training learning-to-rank neural models based on PyTorch.
Applying "Stabilizing Transformers for Reinforcement Learning" in Minecraft pig chase (Nov 2021)
Scale-Out Computing on AWS is a solution that helps customers deploy and operate a multiuser environment for computationally intensive workflows.