Stars
A modular, primitive-first, Python-first PyTorch library for Reinforcement Learning.
🌎💪 BrowserGym, a Gym environment for web task automation
Open sourced predictions, execution logs, trajectories, and results from model inference + evaluation runs on the SWE-bench task.
A curated list of awesome UI agents resources, encompassing Web, App, OS, and beyond (continually updated)
A bug repository that keeps growing
aider is AI pair programming in your terminal
Next Token Prediction Towards Multimodal Intelligence: A Comprehensive Survey
DeepSeek-VL2: Mixture-of-Experts Vision-Language Models for Advanced Multimodal Understanding
Maya: An Instruction Finetuned Multilingual Multimodal Model using Aya
A PyTorch library for implementing flow matching algorithms, featuring continuous and discrete flow matching implementations. It includes practical examples for both text and image modalities.
HunyuanVideo: A Systematic Framework For Large Video Generation Model
The simplest implementation of recent Sparse Attention patterns for efficient LLM inference.
The Open Cookbook for Top-Tier Code Large Language Model
A collection of papers on autoregressive models in vision.
This is the repository that contains the source code for the Self-Evaluation Guided MCTS for online DPO.
Replicating O1 inference-time scaling laws
The official CLIP training codebase of Inf-CL: "Breaking the Memory Barrier: Near Infinite Batch Size Scaling for Contrastive Loss". A highly memory-efficient CLIP training scheme.
Meta Lingua: a lean, efficient, and easy-to-hack codebase to research LLMs.
Janus-Series: Unified Multimodal Understanding and Generation Models
Movie Gen Bench - two media generation evaluation benchmarks released with Meta Movie Gen
Scalable data preprocessing and curation toolkit for LLMs