Starred repositories
Open Thoughts: Fully Open Data Curation for Thinking Models
The source code for the blog post "The 37 Implementation Details of Proximal Policy Optimization"
Streamline the fine-tuning process for multimodal models: PaliGemma 2, Florence-2, and Qwen2.5-VL
DeepSeek-VL2: Mixture-of-Experts Vision-Language Models for Advanced Multimodal Understanding
The official repo for "AceCoder: Acing Coder RL via Automated Test-Case Synthesis"
Witness the aha moment of VLM with less than $3.
Make any LLM think like OpenAI o1 and DeepSeek R1
A library for advanced large language model reasoning
🐭 A tiny single-file implementation of Group Relative Policy Optimization (GRPO) as introduced by the DeepSeekMath paper
RAG Web UI is an intelligent dialogue system based on RAG (Retrieval-Augmented Generation) technology.
Everything you need to build state-of-the-art foundation models, end-to-end.
Open replication of DeepSeek R1 for text-to-graph extraction.
🌾 OAT: A research-friendly framework for LLM online alignment, including preference learning, reinforcement learning, etc.
Code for "Critique Fine-Tuning: Learning to Critique is More Effective than Learning to Imitate"
DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models
RAGEN is the first open-source reproduction of DeepSeek-R1 on AGENT training.
Synthetic Data curation for post-training and structured data extraction
This repository offers a comprehensive collection of tutorials on state-of-the-art computer vision models and techniques. Explore everything from foundational architectures like ResNet to cutting-e…
This is a replication of DeepSeek-R1-Zero and DeepSeek-R1 training on small models with limited data
Fully open reproduction of DeepSeek-R1