Stars
A collection of notebooks/recipes showcasing some fun and effective ways of using Claude.
GLM-4 series: Open Multilingual Multimodal Chat LMs
HunyuanVideo: A Systematic Framework For Large Video Generation Model
[NeurIPS 2024 Best Paper][GPT beats diffusion🔥] [scaling laws in visual generation📈] Official impl. of "Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale Prediction". An *ult…
[Embodied-AI-Survey-2024] Paper list and projects for Embodied AI
A comprehensive list of papers using large language/multi-modal models for Robotics/RL, including papers, codes, and related websites
[ICML 2024] 3D-VLA: A 3D Vision-Language-Action Generative World Model
High-resolution models for human tasks.
⛹️ Pytorch ReID: A tiny, friendly, strong PyTorch implementation of a person re-id / vehicle re-id baseline. Tutorial 👉 https://github.com/layumi/Person_reID_baseline_pytorch/tree/master/tutorial
[CVPR 2024] Magic Tokens: Select Diverse Tokens for Multi-modal Object Re-Identification
The official code of "PLIP: Language-Image Pre-training for Person Representation Learning"
[CVPR 2024] Alpha-CLIP: A CLIP Model Focusing on Wherever You Want
[NeurIPS 2023] HAP: Structure-Aware Masked Image Modeling for Human-Centric Perception
This repository provides the code and model checkpoints for AIMv1 and AIMv2 research projects.
A comprehensive list of awesome contrastive self-supervised learning papers.
Collection of awesome parameter-efficient fine-tuning resources.
🤗 PEFT: State-of-the-art Parameter-Efficient Fine-Tuning.
Collection of AWESOME vision-language models for vision tasks
[CVPR 2023] Official repository of paper titled "MaPLe: Multi-modal Prompt Learning".
❄️🔥 Visual Prompt Tuning [ECCV 2022] https://arxiv.org/abs/2203.12119
[ICLR 2024 (Spotlight)] "Frozen Transformers in Language Models are Effective Visual Encoder Layers"
Paper collection for cloth-variation-based person re-identification
Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)
[ECCV 2022] PASS: Part-Aware Self-Supervised Pre-Training for Person Re-Identification
[Mamba-Survey-2024] Paper list for State-Space-Model/Mamba and its Applications
✨✨ Latest Advances on Multimodal Large Language Models