36 stars written in Python

[CVPR 2024 Oral] InternVL Family: A Pioneering Open-Source Alternative to GPT-4o. An open-source multimodal dialogue model approaching GPT-4o performance.

Python 6,788 525 Updated Dec 25, 2024

VILA is a family of state-of-the-art vision language models (VLMs) for diverse multimodal AI tasks across the edge, data center, and cloud.

Python 2,737 218 Updated Jan 11, 2025

Fast inference from large language models via speculative decoding

Python 630 64 Updated Aug 22, 2024

Implementation of paper Data Engineering for Scaling Language Models to 128K Context

Python 448 29 Updated Mar 19, 2024

③[ICML2024] [IQA, IAA, VQA] All-in-one Foundation Model for visual scoring. Can efficiently fine-tune to downstream datasets.

Python 334 23 Updated Aug 12, 2024

(ICML 2024) Alphazero-like Tree-Search can guide large language model decoding and training

Python 251 26 Updated May 26, 2024

E5-V: Universal Embeddings with Multimodal Large Language Models

Python 215 8 Updated Dec 23, 2024

[ACM MM 2024 Oral] Official code for "OneChart: Purify the Chart Structural Extraction via One Auxiliary Token"

Python 213 17 Updated Dec 4, 2024

Intervening Anchor Token: Decoding Strategy in Alleviating Hallucinations for MLLMs

Python 212 42 Updated Oct 11, 2024

②[CVPR 2024] Low-level visual instruction tuning, with a 200K dataset and a model zoo for fine-tuned checkpoints.

Python 210 10 Updated Aug 12, 2024
Python 206 24 Updated Apr 23, 2024

AnchorAttention: Improved attention for LLMs long-context training

Python 202 6 Updated Jan 15, 2025

Official repository for the paper MG-LLaVA: Towards Multi-Granularity Visual Instruction Tuning (https://arxiv.org/abs/2406.17770).

Python 152 4 Updated Sep 27, 2024

[NeurIPS 2024] Repo for the paper "ControlMLLM: Training-Free Visual Prompt Learning for Multimodal Large Language Models"

Python 135 2 Updated Jan 13, 2025

Official implementation for the CVPR 2023 paper "Re-IQA: Unsupervised Learning for Image Quality Assessment in the Wild"

Python 101 7 Updated Apr 26, 2024

DepictQA: Depicted Image Quality Assessment with Vision Language Models

Python 101 3 Updated Nov 17, 2024

[EMNLP 2024 Findings🔥] Official implementation of "LOOK-M: Look-Once Optimization in KV Cache for Efficient Multimodal Long-Context Inference"

Python 88 6 Updated Nov 9, 2024

[NeurIPS '24 D&B] Official Dataloader and Evaluation Scripts for LongVideoBench.

Python 82 2 Updated Jul 27, 2024

Code for the ICML 2024 paper "Video-of-Thought: Step-by-Step Video Reasoning from Perception to Cognition"

Python 79 4 Updated Dec 3, 2024

Collection of utilities that are not polished implementations but can be useful to users

Python 77 24 Updated Oct 23, 2024

Gödel Agent: A Self-Referential Agent Framework for Recursive Self-Improvement

Python 69 22 Updated Oct 30, 2024

A lightweight flexible Video-MLLM developed by TencentQQ Multimedia Research Team.

Python 68 3 Updated Oct 14, 2024

[ECCV 2024] Official PyTorch Implementation of A Comprehensive Study of Multimodal Large Language Models for Image Quality Assessment

Python 64 1 Updated Jul 20, 2024
Python 51 Updated Dec 13, 2024

Implementation for the research paper "Enhancing LLM Reasoning via Critique Models with Test-Time and Training-Time Supervision".

Python 49 Updated Nov 29, 2024
Python 46 3 Updated Nov 19, 2024

MMICL, a state-of-the-art VLM with in-context learning ability, from ICL, PKU

Python 44 Updated Sep 29, 2023

[preprint] We propose a novel fine-tuning method, Separate Memory and Reasoning, which combines prompt tuning with LoRA.

Python 39 2 Updated Dec 27, 2024

FocusLLM: Scaling LLM’s Context by Parallel Decoding

Python 33 2 Updated Dec 8, 2024

Official PyTorch implementation of "Scaling Up Personalized Image Aesthetic Assessment via Task Vector Customization" (ECCV 2024)

Python 19 Updated Oct 23, 2024