A visualization tool for deeper understanding and easier debugging of RLHF training.

Python 152 6 Updated Feb 20, 2025

The Arcade Learning Environment (ALE) -- a platform for AI research.

C++ 2,234 436 Updated Feb 15, 2025

A Flexible Framework for Experiencing Cutting-edge LLM Inference Optimizations

Python 11,251 718 Updated Feb 23, 2025

Clean, minimal, accessible reproduction of DeepSeek R1-Zero

Python 10,559 1,361 Updated Feb 1, 2025

Sky-T1: Train your own O1 preview model within $450

Python 2,936 304 Updated Feb 21, 2025

🚀 Efficient implementations of state-of-the-art linear attention models in Torch and Triton

Python 1,952 117 Updated Feb 23, 2025

Super-Efficient RLHF Training of LLMs with Parameter Reallocation

Python 221 13 Updated Jan 13, 2025

A Massively Parallel Large Scale Self-Play Framework

Python 330 33 Updated Jan 9, 2023

USP: Unified (a.k.a. Hybrid, 2D) Sequence Parallel Attention for Long Context Transformers Model Training and Inference

Python 430 32 Updated Feb 19, 2025
Python 1 Updated Nov 15, 2022

A generative world for general-purpose robotics & embodied AI learning.

Python 24,015 2,066 Updated Feb 23, 2025

A curated list of awesome self-hosted GitHub Action runners in a large comparison matrix

SCSS 763 43 Updated Feb 7, 2025

(JAIR'2022) A mini-scale reproduction code of the AlphaStar program. Note: the original AlphaStar is the AI proposed by DeepMind to play StarCraft II. JAIR = Journal of Artificial Intelligence Rese…

Python 328 58 Updated Nov 9, 2022

A flexible and efficient training framework for large-scale alignment tasks

Python 306 23 Updated Feb 14, 2025

Ray is an AI compute engine. Ray consists of a core distributed runtime and a set of AI Libraries for accelerating ML workloads.

Python 35,588 6,027 Updated Feb 23, 2025

AI demo for playing ARPG/Soul-like games with an RL framework

Python 298 56 Updated Sep 24, 2024

A novel parallel UCT algorithm with linear speedup and negligible performance loss.

Python 115 24 Updated Apr 26, 2021
Python 893 104 Updated Jan 23, 2025

verl: Volcano Engine Reinforcement Learning for LLMs

Python 3,648 324 Updated Feb 23, 2025
Python 126 10 Updated Feb 7, 2025

An elegant PyTorch deep reinforcement learning library.

Python 8,208 1,128 Updated Feb 22, 2025

Awesome-LLM-KV-Cache: A curated list of 📙Awesome LLM KV Cache Papers with Codes.

215 11 Updated Dec 7, 2024

Mooncake is the serving platform for Kimi, a leading LLM service provided by Moonshot AI.

C++ 2,615 161 Updated Feb 23, 2025

A collection of LLM papers, blogs, and projects, with a focus on OpenAI o1 🍓 and reasoning techniques.

6,487 361 Updated Feb 22, 2025

Efficient Triton Kernels for LLM Training

Python 4,469 271 Updated Feb 22, 2025

arXiv LaTeX Cleaner: Easily clean the LaTeX code of your paper to submit to arXiv

Python 5,891 346 Updated Jul 21, 2024

Code for the paper "Training Diffusion Models with Reinforcement Learning"

Python 393 26 Updated Jul 5, 2023
Python 2,521 310 Updated May 19, 2024

SGLang is a fast serving framework for large language models and vision language models.

Python 10,572 1,036 Updated Feb 23, 2025

A lightweight library for portable low-level GPU computation using WebGPU.

C++ 3,822 185 Updated Feb 21, 2025