Stars
DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.
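For orientation, a minimal sketch of wrapping a PyTorch model with DeepSpeed; the config values below are illustrative assumptions, not recommendations, and a real run is launched under the `deepspeed` launcher.

```python
# Minimal DeepSpeed sketch (assumed config values; run via the deepspeed launcher).
import torch
import deepspeed

model = torch.nn.Linear(1024, 1024)  # stand-in for a real network

ds_config = {
    "train_batch_size": 8,
    "fp16": {"enabled": True},
    "zero_optimization": {"stage": 2},  # ZeRO stage 2: partition optimizer state + gradients
    "optimizer": {"type": "Adam", "params": {"lr": 1e-4}},
}

# deepspeed.initialize returns an engine that handles mixed precision,
# the optimizer, and ZeRO partitioning behind the usual training loop.
model_engine, optimizer, _, _ = deepspeed.initialize(
    model=model,
    model_parameters=model.parameters(),
    config=ds_config,
)
```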
🚀 A simple way to launch, train, and use PyTorch models on almost any device and distributed configuration, with automatic mixed precision (including fp8) and easy-to-configure FSDP and DeepSpeed support.
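A minimal sketch of the typical Accelerate pattern, with a toy model and dataset standing in for real ones:

```python
# Minimal Accelerate training-loop sketch; model and data are placeholders.
import torch
from accelerate import Accelerator

accelerator = Accelerator()  # picks up device / distributed config automatically

model = torch.nn.Linear(10, 1)
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3)
dataloader = torch.utils.data.DataLoader(
    torch.utils.data.TensorDataset(torch.randn(64, 10), torch.randn(64, 1)),
    batch_size=8,
)

# prepare() moves everything to the right device(s) and wraps them for DDP/FSDP.
model, optimizer, dataloader = accelerator.prepare(model, optimizer, dataloader)

for inputs, targets in dataloader:
    optimizer.zero_grad()
    loss = torch.nn.functional.mse_loss(model(inputs), targets)
    accelerator.backward(loss)  # replaces loss.backward() so mixed precision works
    optimizer.step()
```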
🤗 PEFT: State-of-the-art Parameter-Efficient Fine-Tuning.
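For a sense of the API, a minimal LoRA sketch with PEFT; the GPT-2 base model and hyperparameters are just examples:

```python
# Minimal LoRA sketch with PEFT; base model and hyperparameters are examples.
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

base = AutoModelForCausalLM.from_pretrained("gpt2")  # example base model

config = LoraConfig(
    r=8,                        # LoRA rank
    lora_alpha=16,              # scaling factor
    target_modules=["c_attn"],  # fused attention projection in GPT-2
    task_type="CAUSAL_LM",
)

model = get_peft_model(base, config)
model.print_trainable_parameters()  # only the small adapter weights are trained
```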
Lighteval is your all-in-one toolkit for evaluating LLMs across multiple backends
SGLang is a fast serving framework for large language models and vision language models.
A lightweight reproduction of DeepSeek-R1-Zero with an in-depth analysis of self-reflection behavior.
A high-throughput and memory-efficient inference and serving engine for LLMs
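A minimal offline-inference sketch with vLLM; the model id is an arbitrary example:

```python
# Minimal vLLM offline-inference sketch; the model id is only an example.
from vllm import LLM, SamplingParams

llm = LLM(model="facebook/opt-125m")  # any Hugging Face causal-LM repo id works

params = SamplingParams(temperature=0.8, top_p=0.95, max_tokens=64)
outputs = llm.generate(["The capital of France is"], params)

for out in outputs:
    print(out.outputs[0].text)
```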
A reinforcement learning tutorial in Chinese (the "Mushroom Book" 🍄); read it online at https://datawhalechina.github.io/easy-rl/
A training framework for Stable Baselines3 reinforcement learning agents, with hyperparameter optimization and pre-trained agents included.
A collection of awesome work about R1!
PyTorch implementations of DQN, AC, ACER, A2C, A3C, PG, DDPG, TRPO, PPO, SAC, TD3, and more.
Finetune Llama 3.3, DeepSeek-R1 & Reasoning LLMs 2x faster with 70% less memory! 🦥
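A minimal QLoRA-style loading sketch with Unsloth; the checkpoint name and hyperparameters are examples:

```python
# Minimal Unsloth sketch for QLoRA fine-tuning; model name and
# hyperparameters are examples, not the library's defaults.
from unsloth import FastLanguageModel

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/llama-3-8b-bnb-4bit",  # example 4-bit checkpoint
    max_seq_length=2048,
    load_in_4bit=True,
)

# Attach LoRA adapters; only these small matrices are trained.
model = FastLanguageModel.get_peft_model(
    model,
    r=16,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    lora_alpha=16,
)
```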
Code for "Self-Calibrating Gaussian Splatting for Large Field of View Reconstruction"
The official repo of Qwen (通义千问), the chat and pretrained large language models proposed by Alibaba Cloud.
JerryWu-code / TinyZero
Forked from Jiayi-Pan/TinyZero. Our own tiny reproduction of DeepSeek-R1-Zero on two A100s.
Video Generation Foundation Models: https://saiyan-world.github.io/goku/
A Flexible Framework for Experiencing Cutting-edge LLM Inference Optimizations
Qwen2.5-VL is the multimodal large language model series developed by Qwen team, Alibaba Cloud.
Train transformer language models with reinforcement learning.
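A minimal supervised fine-tuning sketch with TRL's SFTTrainer, its simplest entry point (the library also ships RL trainers such as PPO, DPO, and GRPO); the dataset and model ids are examples, and exact arguments vary across TRL versions:

```python
# Minimal TRL SFTTrainer sketch; dataset/model ids are examples, and
# exact arguments vary by TRL version.
from datasets import load_dataset
from trl import SFTConfig, SFTTrainer

dataset = load_dataset("trl-lib/Capybara", split="train")  # example chat dataset

trainer = SFTTrainer(
    model="Qwen/Qwen2.5-0.5B",  # recent TRL accepts a HF repo id directly
    train_dataset=dataset,
    args=SFTConfig(output_dir="sft-out"),
)
trainer.train()
```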
This repo contains the official authors' implementation of the MeshSplats paper.
A journey to a real multimodal R1! We are running large-scale experiments.