yang0110
  • University College London
  • London, UK

verl: Volcano Engine Reinforcement Learning for LLMs

Python 4,057 369 Updated Mar 2, 2025
Python 380 9 Updated Dec 5, 2024

Official implementation of "Self-Improving Video Generation"

Python 60 2 Updated Dec 26, 2024

Aligning pretrained language models with instruction data generated by themselves.

Python 4,291 503 Updated Mar 27, 2023

An Easy-to-use, Scalable and High-performance RLHF Framework (70B+ PPO Full Tuning & Iterative DPO & LoRA & RingAttention & RFT)

Python 5,188 518 Updated Mar 2, 2025
Python 50 5 Updated Jul 28, 2024

A curated list of reinforcement learning with human feedback resources (continually updated)

3,765 232 Updated Feb 19, 2025

Experiments with reinforcement learning and recurrent neural networks

Python 113 16 Updated Oct 27, 2023

Open-source codebase for EfficientZero, from "Mastering Atari Games with Limited Data" at NeurIPS 2021.

Python 884 139 Updated Dec 20, 2023

Single-player AlphaZero implementation

Python 42 20 Updated Mar 7, 2022

AlphaZero for singleplayer environments implemented efficiently using Ray

Python 9 2 Updated Apr 4, 2023

Implementation of DreamerV3 in PyTorch.

Python 496 111 Updated Sep 27, 2024
Jupyter Notebook 2 Updated Jul 14, 2024

A PyTorch implementation of DeepMind's AlphaZero agent to play Go and Gomoku board games

Python 112 23 Updated Oct 26, 2024
Python 40 12 Updated Oct 21, 2022

DouZero with ResNet and GPU support for Windows

Python 38 17 Updated Dec 23, 2021
TypeScript 3 1 Updated May 11, 2024

Code for "Unsupervised Zero-Shot RL via Functional Reward Representations"

Python 54 2 Updated Mar 26, 2024

Flappy Bird as a Farama Gymnasium environment.

Python 29 6 Updated Aug 1, 2023

Awesome Game AI materials of Multi-Agent Reinforcement Learning

832 103 Updated Jun 26, 2024

Use AI to play some games.

Python 595 189 Updated Mar 24, 2023

Plug-and-play hydra sweepers for the EA-based multifidelity method DEHB and several population-based training variations, all proven to efficiently tune RL hyperparameters.

Python 74 13 Updated Nov 27, 2023

Open-source simulator for autonomous driving research.

C++ 12,086 3,887 Updated Feb 28, 2025

This code implements Prioritized Level Replay, a method for sampling training levels for reinforcement learning agents that exploits the fact that not all levels are equally useful for agents to le…

Python 85 16 Updated Jun 11, 2021

Code for our NeurIPS 2020 paper Improving Generalization in Reinforcement Learning with Mixture Regularization

Shell 32 9 Updated Oct 22, 2020

Centralized place holding any AWS tools

Jupyter Notebook 5 1 Updated Feb 10, 2025

allRank is a framework for training learning-to-rank neural models based on PyTorch.

Python 913 121 Updated Aug 6, 2024

Applying "Stabilizing Transformers for Reinforcement Learning" in Minecraft pig chase (Nov 2021)

Python 5 2 Updated Nov 8, 2023

Scale-Out Computing on AWS is a solution that helps customers deploy and operate a multiuser environment for computationally intensive workflows.

Python 124 58 Updated Feb 27, 2025