Renmin University of China, Beijing
Starred repositories
Official Repository for SIGIR2024 Demo Paper "An Integrated Data Processing Framework for Pretraining Foundation Models"
Freeing data processing from scripting madness by providing a set of platform-agnostic customizable pipeline processing blocks.
Everything about the SmolLM & SmolLM2 family of models
DeepSeek-VL: Towards Real-World Vision-Language Understanding
RWKV is an RNN with transformer-level LLM performance. It can be directly trained like a GPT (parallelizable). So it's combining the best of RNN and transformer - great performance, fast inference,…
A collection of LLM papers, blogs, and projects, with a focus on OpenAI o1 🍓 and reasoning techniques.
A series of technical reports on Slow Thinking with LLMs
Theorem Proving in Lean 4
Transformer-related optimization, including BERT, GPT
📰 Must-read papers and blogs on LLM based Long Context Modeling 🔥
DeepSeek-Coder-V2: Breaking the Barrier of Closed-Source Models in Code Intelligence
A GPU-compatible PyTorch implementation of Incremental PCA for memory-efficient dimensionality reduction on large datasets.
🌟 Wiki of OI / ICPC for everyone. (An online guide to a certain massive game, full of dazzling arithmetic magic)
Collection of Reverse Engineering in Large Model
A Survey on Data Selection for Language Models
ScienceWorld is a text-based virtual environment centered around accomplishing tasks from the standardized elementary science curriculum.
DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models
A simple REPL for Lean 4, returning information about errors and sorries.
Lean theorem proving interface which feels like pen-and-paper proofs.
Meta Lingua: a lean, efficient, and easy-to-hack codebase to research LLMs.
Simple converter of Mathematica notebooks to markdown.
microsoft / Megatron-DeepSpeed
Forked from NVIDIA/Megatron-LM. Ongoing research training transformer language models at scale, including: BERT & GPT-2
TensorZero creates a feedback loop for optimizing LLM applications — turning production data into smarter, faster, and cheaper models.