Stars
Let your Claude think
A bibliography and survey of the papers surrounding o1
Open-source evaluation toolkit for large vision-language models (LVLMs), supporting 160+ VLMs and 50+ benchmarks
Official PyTorch implementation of Learning to (Learn at Test Time): RNNs with Expressive Hidden States
SEED-Story: Multimodal Long Story Generation with Large Language Model
A library for advanced large language model reasoning
Cambrian-1 is a family of multimodal LLMs with a vision-centric design.
Unified Efficient Fine-Tuning of 100+ LLMs (ACL 2024)
Gorilla: Training and Evaluating LLMs for Function Calls (Tool Calls)
SWE-agent takes a GitHub issue and tries to automatically fix it, using GPT-4, or your LM of choice. It can also be employed for offensive cybersecurity or competitive coding challenges. [NeurIPS 2…
Retrieval Augmented Generation Generalized Evaluation Dataset
[ICLR 2024 Poster] SCHEMA: State CHangEs MAtter for Procedure Planning in Instructional Videos
The official implementation of RAPTOR: Recursive Abstractive Processing for Tree-Organized Retrieval
Reasoning in Large Language Models: Papers and Resources, including Chain-of-Thought and OpenAI o1 🍓
Course to get into Large Language Models (LLMs) with roadmaps and Colab notebooks.
Must-read Papers on Knowledge Editing for Large Language Models.
🔥🔥🔥Latest papers, code, and datasets on Vid-LLMs.
Mobile ALOHA: Learning Bimanual Mobile Manipulation with Low-Cost Whole-Body Teleoperation
Imitation learning algorithms with Co-training for Mobile ALOHA: ACT, Diffusion Policy, VINN
AppAgent: Multimodal Agents as Smartphone Users, an LLM-based multimodal agent framework designed to operate smartphone apps.
Code for the paper "GenHowTo: Learning to Generate Actions and State Transformations from Instructional Videos" published at CVPR 2024
Official implementation of state-aware video procedural captioning (ACM MM 2021)
A repository curated by the MLNLP community to help authors avoid common small mistakes in paper submissions. Paper Writing Tips
[arXiv:2309.16669] Code release for "Training a Large Video Model on a Single Machine in a Day"