Clean, minimal, accessible reproduction of DeepSeek R1-Zero

Python 9,275 1,210 Updated Feb 1, 2025

🚀 [NeurIPS24] Make Vision Matter in Visual-Question-Answering (VQA)! Introducing NaturalBench, a vision-centric VQA benchmark (NeurIPS'24) that challenges vision-language models with simple questio…

Python 64 9 Updated Feb 1, 2025

LLaVA-Mini is a unified large multimodal model (LMM) that can support the understanding of images, high-resolution images, and videos in an efficient manner.

Python 335 14 Updated Jan 13, 2025

Official implementation of the paper: "A deeper look at depth pruning of LLMs"

Python 13 1 Updated Jul 24, 2024

A sparse attention kernel supporting mixed sparse patterns

C++ 99 2 Updated Oct 15, 2024

Helpful tools and examples for working with flex-attention

Python 622 34 Updated Feb 8, 2025

X bootstrap 1000+ tools and scripts.

1,503 35 Updated Feb 8, 2025

Lighteval is your all-in-one toolkit for evaluating LLMs across multiple backends

Python 1,097 141 Updated Feb 6, 2025

Unified KV Cache Compression Methods for Auto-Regressive Models

Python 864 113 Updated Jan 4, 2025

[NeurIPS'24 Spotlight, ICLR'25] To speed up long-context LLM inference, computes attention approximately using dynamic sparsity, which reduces inference latency by up to 10x for pre-filling on an …

Python 905 44 Updated Jan 31, 2025

AutoAWQ implements the AWQ algorithm for 4-bit quantization with a 2x speedup during inference. Documentation:

Python 1,934 237 Updated Jan 20, 2025

[ICLR 2025] LLaVA-MoD: Making LLaVA Tiny via MoE-Knowledge Distillation

Python 82 5 Updated Jan 22, 2025

SGLang is a fast serving framework for large language models and vision language models.

Python 9,043 873 Updated Feb 9, 2025

FlashInfer: Kernel Library for LLM Serving

Cuda 1,966 198 Updated Feb 9, 2025

Exploring Efficient Fine-Grained Perception of Multimodal Large Language Models

Python 57 3 Updated Nov 1, 2024

Accelerating the development of large multimodal models (LMMs) with one-click evaluation module - lmms-eval.

Python 2,072 198 Updated Feb 9, 2025

An extremely fast Python package and project manager, written in Rust.

Rust 38,796 1,078 Updated Feb 9, 2025

A PyTorch native library for large model training

Python 3,259 269 Updated Feb 7, 2025

[CVPR 2024] MAPLM: A Large-Scale Vision-Language Dataset for Map and Traffic Scene Understanding

Python 121 3 Updated Nov 20, 2023

Use your Neovim like using Cursor AI IDE!

Lua 9,707 382 Updated Feb 9, 2025

📰 Must-read papers and blogs on Speculative Decoding ⚡️

582 27 Updated Feb 9, 2025

Large World Model -- Modeling Text and Video with Millions Context

Python 7,220 555 Updated Oct 19, 2024
Python 50 3 Updated Aug 29, 2024

Eagle Family: Exploring Model Designs, Data Recipes and Training Strategies for Frontier-Class Multimodal LLMs

Python 601 38 Updated Jan 28, 2025

Efficient Triton Kernels for LLM Training

Python 4,372 261 Updated Feb 9, 2025

A curated list of awesome open-source libraries for production LLM

446 43 Updated Dec 31, 2024

Cambrian-1 is a family of multimodal LLMs with a vision-centric design.

Python 1,840 126 Updated Oct 30, 2024

MINT-1T: A one trillion token multimodal interleaved dataset.

795 20 Updated Jul 31, 2024

Run PyTorch LLMs locally on servers, desktop and mobile

Python 3,488 234 Updated Feb 7, 2025