Awesome-MM-LLM

Multimodal Large Language Models

Surveys
Vision
Audios
Any-to-Any
MM-LLM with Robotics
Datasets

Surveys

[MM-LLM] "A Survey on Multimodal Large Language Models", arXiv, Jun 2023. [Paper] [Website]
[LAM] "Sparks of Large Audio Models: A Survey and Outlook", arXiv, Sep 2023. [Paper] [Website]

Vision

[CLIP] "Learning Transferable Visual Models From Natural Language Supervision", arXiv, Feb 2021. [Paper] [Website]
[BLIP] "BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation", arXiv, Jan 2022. [Paper] [Website]
[BLIP-2] "BLIP-2: Bootstrapping Language-Image Pre-training with Frozen Image Encoders and Large Language Models", arXiv, Jan 2023. [Paper] [Website]
[MiniGPT-4] "MiniGPT-4: Enhancing Vision-Language Understanding with Advanced Large Language Models", arXiv, Apr 2023. [Paper] [Website]
[InstructBLIP] "InstructBLIP: Towards General-purpose Vision-Language Models with Instruction Tuning", arXiv, May 2023. [Paper] [Website]
[LLaVA] "Visual Instruction Tuning", arXiv, Apr 2023. [Paper] [Website]

Audios

[AudioGPT] "AudioGPT: Understanding and Generating Speech, Music, Sound, and Talking Head", arXiv, Apr 2023. [Paper] [Website]

Any-to-Any

[NExT-GPT] "NExT-GPT: Any-to-Any Multimodal LLM", arXiv, Sep 2023. [Paper] [Website]

MM-LLM with Robotics

[PaLM-E] "PaLM-E: An Embodied Multimodal Language Model", arXiv, Mar 2023. [Paper] [Website]

Datasets

[LAION-5B] [Website]
[LAION-COCO] [Website]
[MMC4] [Website]
