Stars
Official implementation of OOTDiffusion: Outfitting Fusion based Latent Diffusion for Controllable Virtual Try-on
Towards Video Text Visual Question Answering: Benchmark and Baseline
GRiT: A Generative Region-to-text Transformer for Object Understanding (https://arxiv.org/abs/2212.00280)
🔥🔥🔥 Latest Papers, Code, and Datasets on Vid-LLMs.
MiniCPM-V 2.6: A GPT-4V Level MLLM for Single Image, Multi Image and Video on Your Phone
Mixture-of-Experts for Large Vision-Language Models
[CVPR'2024 Highlight] Official PyTorch implementation of the paper "VTimeLLM: Empower LLM to Grasp Video Moments".
Use PEFT or full-parameter training to fine-tune 400+ LLMs (Qwen2.5, Llama3.2, GLM4, Internlm2.5, Yi1.5, Mistral, Baichuan2, DeepSeek, ...) or 150+ MLLMs (Qwen2-VL, Qwen2-Audio, Llama3.2-Vision, Llava, Inter…
Open-source evaluation toolkit for large vision-language models (LVLMs), supporting 160+ VLMs and 50+ benchmarks
【CVPR 2024 Highlight】Monkey (LMM): Image Resolution and Text Label Are Important Things for Large Multi-modal Models
This repository provides the code and model checkpoints for the AIMv1 and AIMv2 research projects.
The official repo of Qwen-VL (通义千问-VL), the chat & pretrained large vision-language model proposed by Alibaba Cloud.
[CVPR 2024 Highlight🔥] Chat-UniVi: Unified Visual Representation Empowers Large Language Models with Image and Video Understanding
An efficient, flexible, and full-featured toolkit for fine-tuning LLMs (InternLM2, Llama3, Phi3, Qwen, Mistral, ...)
Emu Series: Generative Multimodal Models from BAAI
Scalable and user-friendly neural 🧠 forecasting algorithms.
[ICCV2023 Oral] Unmasked Teacher: Towards Training-Efficient Video Foundation Models
Extract WeChat chat logs, export them to HTML, Word, or Excel documents for permanent archiving, analyze the logs to generate an annual chat report, and use the chat data to train a personal AI chat assistant
【EMNLP 2024🔥】Video-LLaVA: Learning United Visual Representation by Alignment Before Projection
EVA Series: Visual Representation Fantasies from BAAI
The official repository of "Video assistant towards large language model makes everything easy"
[EMNLP 2023 Demo] Video-LLaMA: An Instruction-tuned Audio-Visual Language Model for Video Understanding
A series of large language models trained from scratch by developers @01-ai
🦦 Otter, a multi-modal model based on OpenFlamingo (an open-source version of DeepMind's Flamingo), trained on MIMIC-IT and showcasing improved instruction-following and in-context learning ability.
A state-of-the-art open visual language model | multimodal pretrained model
🚀 A simple way to launch, train, and use PyTorch models on almost any device and distributed configuration, with automatic mixed precision (including fp8) and easy-to-configure FSDP and DeepSpeed support
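
This last entry reads like Hugging Face Accelerate's tagline. As a minimal sketch, assuming that library: the model, optimizer, and data below are toy placeholders, and the same loop would run unchanged under DDP, FSDP, or DeepSpeed.

```python
# Minimal sketch assuming the repo above is Hugging Face Accelerate
# (pip install accelerate). All model/data objects are toy placeholders.
import torch
from accelerate import Accelerator

accelerator = Accelerator(mixed_precision="fp16")  # "bf16"/"fp8" where supported

model = torch.nn.Linear(10, 2)
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3)
dataset = torch.utils.data.TensorDataset(
    torch.randn(64, 10), torch.randint(0, 2, (64,))
)
loader = torch.utils.data.DataLoader(dataset, batch_size=8)

# prepare() moves everything to the right device(s) and wraps the objects
# for whatever distributed backend the launch configuration selected.
model, optimizer, loader = accelerator.prepare(model, optimizer, loader)

for xb, yb in loader:
    optimizer.zero_grad()
    loss = torch.nn.functional.cross_entropy(model(xb), yb)
    accelerator.backward(loss)  # replaces loss.backward(); handles grad scaling
    optimizer.step()
```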