
Official implementation of OOTDiffusion: Outfitting Fusion based Latent Diffusion for Controllable Virtual Try-on

Python 5,959 850 Updated May 13, 2024

Food delivery ordering system (uniapp): customer-facing ordering front end

Vue 66 18 Updated Feb 28, 2022

Grok open release

Python 49,767 8,343 Updated Aug 30, 2024

Towards Video Text Visual Question Answering: Benchmark and Baseline

Python 38 Updated Feb 26, 2024

GRiT: A Generative Region-to-text Transformer for Object Understanding (https://arxiv.org/abs/2212.00280)

Python 309 30 Updated Jan 8, 2024

🔥🔥🔥Latest Papers, Codes and Datasets on Vid-LLMs.

1,791 89 Updated Dec 12, 2024

MiniCPM-V 2.6: A GPT-4V Level MLLM for Single Image, Multi Image and Video on Your Phone

Python 13,033 911 Updated Oct 22, 2024

Mixture-of-Experts for Large Vision-Language Models

Python 2,040 127 Updated Dec 3, 2024

[CVPR'2024 Highlight] Official PyTorch implementation of the paper "VTimeLLM: Empower LLM to Grasp Video Moments".

Python 241 12 Updated Jun 13, 2024

Use PEFT or Full-parameter to finetune 400+ LLMs (Qwen2.5, Llama3.2, GLM4, Internlm2.5, Yi1.5, Mistral, Baichuan2, DeepSeek, ...) or 150+ MLLMs (Qwen2-VL, Qwen2-Audio, Llama3.2-Vision, Llava, Inter…

Python 4,966 432 Updated Jan 9, 2025

Open-source evaluation toolkit for large vision-language models (LVLMs), supporting 160+ VLMs and 50+ benchmarks

Python 1,644 233 Updated Jan 10, 2025

【CVPR 2024 Highlight】Monkey (LMM): Image Resolution and Text Label Are Important Things for Large Multi-modal Models

Python 1,865 132 Updated Dec 30, 2024

This repository provides the code and model checkpoints for AIMv1 and AIMv2 research projects.

Python 1,143 54 Updated Nov 22, 2024

The official repo of Qwen-VL (通义千问-VL) chat & pretrained large vision language model proposed by Alibaba Cloud.

Python 5,289 403 Updated Aug 7, 2024

Python 372 14 Updated Jul 29, 2024

[CVPR 2024 Highlight🔥] Chat-UniVi: Unified Visual Representation Empowers Large Language Models with Image and Video Understanding

Python 897 43 Updated Oct 16, 2024

An efficient, flexible and full-featured toolkit for fine-tuning LLM (InternLM2, Llama3, Phi3, Qwen, Mistral, ...)

Python 4,137 328 Updated Dec 27, 2024

Emu Series: Generative Multimodal Models from BAAI

Python 1,673 85 Updated Sep 27, 2024

Scalable and user-friendly neural 🧠 forecasting algorithms.

Python 3,231 373 Updated Jan 9, 2025

[ICCV2023 Oral] Unmasked Teacher: Towards Training-Efficient Video Foundation Models

Python 312 17 Updated May 27, 2024

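Extract WeChat chat history and export it to HTML, Word, and Excel documents for permanent storage; analyze the chat history to generate an annual chat report; and use the chat data to train a personal AI chat assistant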

Python 36,203 3,755 Updated Jan 2, 2025

【EMNLP 2024🔥】Video-LLaVA: Learning United Visual Representation by Alignment Before Projection

Python 3,105 220 Updated Dec 3, 2024

EVA Series: Visual Representation Fantasies from BAAI

Python 2,377 171 Updated Aug 1, 2024

The official repository of "Video assistant towards large language model makes everything easy"

Python 215 14 Updated Dec 24, 2024

[EMNLP 2023 Demo] Video-LLaMA: An Instruction-tuned Audio-Visual Language Model for Video Understanding

Python 2,869 265 Updated Jun 4, 2024

132 Updated Dec 22, 2023

A series of large language models trained from scratch by developers @01-ai

Jupyter Notebook 7,778 491 Updated Nov 27, 2024

🦦 Otter, a multi-modal model based on OpenFlamingo (open-sourced version of DeepMind's Flamingo), trained on MIMIC-IT and showcasing improved instruction-following and in-context learning ability.

Python 3,576 242 Updated Mar 5, 2024

A state-of-the-art-level open visual language model | multimodal pre-trained model

Python 6,256 425 Updated May 29, 2024

🚀 A simple way to launch, train, and use PyTorch models on almost any device and distributed configuration, with automatic mixed precision (including fp8) and easy-to-configure FSDP and DeepSpeed support

Python 8,154 1,010 Updated Jan 9, 2025