Stars
Official repository of "Visual-RFT: Visual Reinforcement Fine-Tuning"
A high-performance distributed file system designed to address the challenges of AI training and inference workloads.
Analyze computation-communication overlap in V3/R1.
A bidirectional pipeline parallelism algorithm for computation-communication overlap in V3/R1 training.
DeepGEMM: clean and efficient FP8 GEMM kernels with fine-grained scaling
Wan: Open and Advanced Large-Scale Video Generative Models
DeepEP: an efficient expert-parallel communication library
A simple screen parsing tool toward a pure-vision-based GUI agent
Make websites accessible for AI agents
User-friendly AI Interface (Supports Ollama, OpenAI API, ...)
Stay on top of trending topics on social media and the web with AI
Finetune Llama 3.3, DeepSeek-R1 & Reasoning LLMs 2x faster with 70% less memory! 🦥
A replication of DeepSeek-R1-Zero and DeepSeek-R1 training on small models with limited data
Get up and running with Llama 3.3, DeepSeek-R1, Phi-4, Gemma 2, and other large language models.
Janus-Series: Unified Multimodal Understanding and Generation Models
Fully open reproduction of DeepSeek-R1
Tarsier -- a family of large-scale video-language models designed to generate high-quality video descriptions, together with good general video understanding capability.
A high-quality, one-stop open-source data extraction tool for converting PDF to Markdown and JSON.
Python tool for converting files and office documents to Markdown.
Qwen2.5-Coder is the code version of Qwen2.5, the large language model series developed by Qwen team, Alibaba Cloud.
Qwen2.5-VL is the multimodal large language model series developed by Qwen team, Alibaba Cloud.
Qwen2.5 is the large language model series developed by Qwen team, Alibaba Cloud.
DeepSeek-Coder-V2: Breaking the Barrier of Closed-Source Models in Code Intelligence