[ICLR 2025] This is the official repository of our paper "MedTrinity-25M: A Large-scale Multimodal Dataset with Multigranular Annotations for Medicine"
This is the repo for the paper "OS Agents: A Survey on MLLM-based Agents for Computer, Phone and Browser Use".
[CVPR 2025] The official code for "Olympus: A Universal Task Router for Computer Vision Tasks"
[CVPR2025] SegAgent: Exploring Pixel Understanding Capabilities in MLLMs by Imitating Human Annotator Trajectories
A simple baseline achieving an over 90% success rate against the strong black-box models GPT-4.5/4o/o1. Paper: https://arxiv.org/abs/2503.10635
Robust-LLaVA: On the Effectiveness of Large-Scale Robust Image Encoders for Multi-modal Large Language Models
EventGPT: Event Stream Understanding with Multimodal Large Language Models
[CVPR 2025] IDEA: Inverted Text with Cooperative Deformable Aggregation for Multi-modal Object Re-Identification
Global Compression Commander: Plug-and-Play Inference Acceleration for High-Resolution Large Vision-Language Models
🎉 The code repository for "Mitigating Visual Forgetting via Take-along Visual Conditioning for Multi-modal Long CoT Reasoning" in PyTorch.
The official GitHub page for the survey paper "A Survey on Multimodal Retrieval-Augmented Generation".
Multimodal Multi-agent Organization and Benchmarking