DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.

Python 36,905 4,254 Updated Feb 21, 2025

Interpretation, extension, and reproduction of the DeepSeek series of work.

Python 519 41 Updated Feb 15, 2025

https://hrl.boyuai.com/

Jupyter Notebook 2,949 593 Updated Nov 22, 2022

🚀 A simple way to launch, train, and use PyTorch models on almost any device and distributed configuration, automatic mixed precision (including fp8), and easy-to-configure FSDP and DeepSpeed support

Python 8,369 1,038 Updated Feb 20, 2025

🤗 PEFT: State-of-the-art Parameter-Efficient Fine-Tuning.

Python 17,429 1,743 Updated Feb 19, 2025

Lighteval is your all-in-one toolkit for evaluating LLMs across multiple backends

Python 1,188 165 Updated Feb 21, 2025

SGLang is a fast serving framework for large language models and vision language models.

Python 10,498 1,025 Updated Feb 22, 2025

A lightweight reproduction of DeepSeek-R1-Zero with in-depth analysis of self-reflection behavior.

Python 163 10 Updated Feb 6, 2025

A high-throughput and memory-efficient inference and serving engine for LLMs

Python 38,890 5,821 Updated Feb 22, 2025

LLM inference in C/C++

C++ 74,990 10,834 Updated Feb 22, 2025

A Chinese-language reinforcement learning tutorial (the "Mushroom Book" 🍄). Read online at: https://datawhalechina.github.io/easy-rl/

Jupyter Notebook 10,334 1,946 Updated Feb 20, 2025

A training framework for Stable Baselines3 reinforcement learning agents, with hyperparameter optimization and pre-trained agents included.

Python 2,253 533 Updated Feb 14, 2025

A collection of every awesome work about R1!

Python 193 5 Updated Feb 20, 2025

Exploring Applications of GRPO

Python 95 9 Updated Feb 16, 2025

Reinforcement learning resources curated

8,975 1,828 Updated May 25, 2023

PyTorch implementation of DQN, AC, ACER, A2C, A3C, PG, DDPG, TRPO, PPO, SAC, TD3 and ....

Python 4,118 870 Updated Mar 24, 2023

Finetune Llama 3.3, DeepSeek-R1 & Reasoning LLMs 2x faster with 70% less memory! 🦥

Python 31,475 2,099 Updated Feb 22, 2025

procedural reasoning datasets

Python 441 43 Updated Feb 22, 2025

Minimal hackable GRPO implementation

Python 139 15 Updated Jan 31, 2025

Code for "Self-Calibrating Gaussian Splatting for Large Field of View Reconstruction"

Python 68 2 Updated Feb 14, 2025

The official repo of Qwen (通义千问) chat & pretrained large language model proposed by Alibaba Cloud.

Python 16,974 1,404 Updated Feb 1, 2025

A tiny-scale reproduction of DeepSeek-R1-Zero on two A100s.

Python 38 16 Updated Feb 1, 2025

Video Generation Foundation Models: https://saiyan-world.github.io/goku/

Python 2,387 240 Updated Feb 19, 2025

A Flexible Framework for Experiencing Cutting-edge LLM Inference Optimizations

Python 11,114 706 Updated Feb 22, 2025

Qwen2.5-VL is the multimodal large language model series developed by Qwen team, Alibaba Cloud.

Jupyter Notebook 7,818 554 Updated Feb 20, 2025

Train transformer language models with reinforcement learning.

Python 11,890 1,599 Updated Feb 21, 2025

This repo contains the official authors' implementation associated with the MeshSplats paper

Python 101 6 Updated Feb 20, 2025

📚 Essential fundamentals for technical interviews: LeetCode, operating systems, computer networks, and system design

178,991 51,194 Updated Aug 21, 2024

A journey to a real multimodal R1! We are running large-scale experiments.

Python 212 2 Updated Feb 12, 2025