89 results for source starred repositories

Coding a Multimodal (Vision) Language Model from scratch in PyTorch with full explanation: https://www.youtube.com/watch?v=vAmKB7iPkWw

Python 335 61 Updated Dec 6, 2024

LLaMA 2 implemented from scratch in PyTorch

Python 269 51 Updated Sep 25, 2023

Chinese NLP solutions (large models, data, models, training, inference)

Jupyter Notebook 3,093 374 Updated Dec 17, 2024

Video+code lecture on building nanoGPT from scratch

Python 3,711 521 Updated Aug 13, 2024

The simplest, fastest repository for training/finetuning medium-sized GPTs.

Python 38,027 6,092 Updated Dec 9, 2024

iLLaVA: An Image is Worth Fewer Than 1/3 Input Tokens in Large Multimodal Models

Python 11 1 Updated Dec 20, 2024

[Large models] Train a 27M-parameter vision multimodal VLM from scratch in 3 hours; inference and training both run on a personal GPU!

Python 444 46 Updated Dec 13, 2024

Inference code for Llama models

Python 56,910 9,622 Updated Aug 18, 2024

Code release for VTW (AAAI 2025)

Python 27 Updated Dec 10, 2024

An open platform for training, serving, and evaluating large language models. Release repo for Vicuna and Chatbot Arena.

Python 37,303 4,573 Updated Dec 20, 2024

Course to get into Large Language Models (LLMs) with roadmaps and Colab notebooks.

Jupyter Notebook 40,208 4,266 Updated Jul 28, 2024

AutoAWQ implements the AWQ algorithm for 4-bit quantization with a 2x speedup during inference.

Python 1,840 222 Updated Dec 13, 2024

A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)

Python 12,498 2,573 Updated Dec 22, 2024

Making LLaVA Tiny via MoE-Knowledge Distillation

Python 72 4 Updated Oct 24, 2024

Pruning the VLLMs

Python 71 3 Updated Dec 9, 2024

Unlock the many ways to use the HuggingFace ecosystem

HTML 79 12 Updated Dec 14, 2024

The repository provides code for running inference with the SegmentAnything Model (SAM), links for downloading the trained model checkpoints, and example notebooks that show how to use the model.

Jupyter Notebook 48,209 5,702 Updated Sep 18, 2024

[CVPR2024] GSVA: Generalized Segmentation via Multimodal Large Language Models

Python 106 Updated Sep 12, 2024

OMG-LLaVA and OMG-Seg codebase [CVPR-24 and NeurIPS-24]

Python 1,345 50 Updated Dec 11, 2024

[ECCV2024] This is an official implementation for "PSALM: Pixelwise SegmentAtion with Large Multi-Modal Model"

Python 201 10 Updated Nov 19, 2024

LLM-Seg: Bridging Image Segmentation and Large Language Model Reasoning

Python 104 9 Updated Apr 16, 2024

Project Page for "LISA: Reasoning Segmentation via Large Language Model"

Python 1,942 130 Updated Jul 2, 2024

Contextual Object Detection with Multimodal Large Language Models

Python 208 5 Updated Oct 14, 2024
Python 17 1 Updated Dec 3, 2024

A minimal codebase for finetuning large multimodal models, supporting llava-1.5/1.6, llava-interleave, llava-next-video, llava-onevision, llama-3.2-vision, qwen-vl, qwen2-vl, phi3-v etc.

Python 208 24 Updated Dec 16, 2024

[ECCV 2024] Official implementation of the paper "Grounding DINO: Marrying DINO with Grounded Pre-Training for Open-Set Object Detection"

Python 7,038 712 Updated Aug 12, 2024

[ECCV2024] Video Foundation Models & Data for Multimodal Understanding

Python 1,474 91 Updated Dec 11, 2024

A model compression and acceleration toolbox based on PyTorch.

Python 329 40 Updated Jan 12, 2024

A curated collection of open-source models, datasets, and evaluation benchmarks for vertical domains.

2,308 181 Updated Dec 26, 2023
Python 2,167 244 Updated Dec 20, 2024