Stars
A comprehensive DeepSeek resource collection 🔥: DeepSeek usage, prompt guides, application development guides, and curated resource lists. Use DeepSeek better and boost your productivity 10x! 🚀
[COLING 2025] A curated paper list about LLMs for chemistry
Conformalized Credal Set Predictors (NeurIPS 2024)
A replication of DeepSeek-R1-Zero and DeepSeek-R1 training on small models with limited data
AIInfra (AI infrastructure) refers to the full AI system stack, from low-level hardware such as chips up to the software layers that support training and inference of large AI models.
A collection of awesome papers on RAG for AIGC. We propose a taxonomy of RAG foundations, enhancements, and applications in the paper "Retrieval-Augmented Generation for AI-Generated Content: A Survey".
Sky-T1: Train your own O1 preview model within $450
Finetune Llama 3.3, DeepSeek-R1 & Reasoning LLMs 2x faster with 70% less memory! 🦥
DRT-o1: Optimized Deep Reasoning Translation via Long Chain-of-Thought
Official PyTorch implementation of paper "CLEAR: Conv-Like Linearization Revs Pre-Trained Diffusion Transformers Up".
A curated list of awesome LLM Inference-Time Self-Improvement (ITSI, pronounced "itsy") papers from our recent survey: A Survey on Large Language Model Inference-Time Self-Improvement.
[CVPR 2025] TinyFusion: Diffusion Transformers Learned Shallow
[NeurIPS 2024 Best Paper][GPT beats diffusion🔥] [scaling laws in visual generation📈] Official impl. of "Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale Prediction". An *ult…
A Survey of Attributions for Large Language Models
The first Object-Oriented Programming (OOP) Evaluation Benchmark for LLMs
A high-throughput and memory-efficient inference and serving engine for LLMs
Official Implementation of ICLR 2024 paper: "Reasoning on Graphs: Faithful and Interpretable Large Language Model Reasoning"
[CVPR 2024 Oral] InternVL Family: A Pioneering Open-Source Alternative to GPT-4o. An open-source multimodal dialogue model approaching GPT-4o performance.
Qwen2.5-VL is the multimodal large language model series developed by the Qwen team, Alibaba Cloud.
List of papers on Self-Correction of LLMs.
A collection of LLM papers, blogs, and projects, with a focus on OpenAI o1 🍓 and reasoning techniques.
A bibliography and survey of the papers surrounding o1
[NeurIPS 2024] The official implementation of paper: Chain of Preference Optimization: Improving Chain-of-Thought Reasoning in LLMs.
A new dataset of difficult graduate-level applied mathematics problems; evaluations show that leading LLMs currently achieve low accuracy on them.
Code and dataset for the paper "Have LLMs Advanced Enough? Towards Harder Problem Solving Benchmarks For Large Language Models"
Qwen2.5 is the large language model series developed by the Qwen team, Alibaba Cloud.