
Official code for paper "UniIR: Training and Benchmarking Universal Multimodal Information Retrievers" (ECCV 2024)

Python 131 14 Updated Oct 1, 2024

A ComfyUI extension for chatting with your images with LLaVA. Runs locally, no external services, no filter.

Python 123 14 Updated Aug 3, 2024

Collection of Composed Image Retrieval (CIR) papers.

145 8 Updated Mar 4, 2025

SEED-Story: Multimodal Long Story Generation with Large Language Model

Python 798 60 Updated Oct 11, 2024

SEED-Story is a JAX/Flax implementation of a multimodal story generation model based on the paper "SEED-Story: Multimodal Long Story Generation with Large Language Model". This model combines visio…

Python 1 Updated Feb 15, 2025

【NeurIPS 2024】Dense Connector for MLLMs

Python 156 7 Updated Oct 14, 2024

Implementation of CoCa, Contrastive Captioners are Image-Text Foundation Models, in Pytorch

Python 1,106 89 Updated Dec 12, 2023

Explore concepts like Self-Correct, Self-Refine, Self-Improve, Self-Contradict, Self-Play, and Self-Knowledge, alongside o1-like reasoning elevation🍓 and hallucination alleviation🍄.

Jupyter Notebook 160 4 Updated Dec 7, 2024

[SIGIR'2024 Best Paper Honorable Mention] Official repository for "LDRE: LLM-based Divergent Reasoning and Ensemble for Zero-Shot Composed Image Retrieval"

Python 49 5 Updated Feb 10, 2025

[ICCV 2023] - Zero-shot Composed Image Retrieval with Textual Inversion

Python 168 9 Updated May 7, 2024

[ICML'24 Oral] "MagicLens: Self-Supervised Image Retrieval with Open-Ended Instructions"

Python 164 13 Updated Oct 28, 2024
Jupyter Notebook 201 15 Updated Sep 4, 2024

🔍 Explore Egocentric Vision: research, data, challenges, real-world apps. Stay updated & contribute to our dynamic repository! Work-in-progress; join us!

101 9 Updated Nov 23, 2024

TextGrad: Automatic "Differentiation" via Text -- using large language models to backpropagate textual gradients.

Python 2,117 183 Updated Feb 28, 2025
Python 66 5 Updated Dec 6, 2024

🔥🔥🔥Latest Papers, Codes and Datasets on Vid-LLMs.

2,012 95 Updated Jan 26, 2025

Finetune Llama 3.3, DeepSeek-R1 & Reasoning LLMs 2x faster with 70% less memory! 🦥

Python 33,543 2,324 Updated Mar 5, 2025

In-the-wild Question Answering

Python 15 2 Updated May 10, 2023

Explore VLM-Eval, a framework for evaluating Video Large Language Models, enhancing your video analysis with cutting-edge AI technology.

Python 33 2 Updated Jan 20, 2024
Python 345 35 Updated May 25, 2024

The PyTorch implementation of Generative Pre-trained Transformers (GPTs) using Kolmogorov-Arnold Networks (KANs) for language modeling

Python 714 55 Updated Nov 25, 2024
Jupyter Notebook 610 53 Updated Jan 16, 2025

A Unified Library for Parameter-Efficient and Modular Transfer Learning

Jupyter Notebook 2,657 357 Updated Mar 2, 2025

The TinyLlama project is an open endeavor to pretrain a 1.1B Llama model on 3 trillion tokens.

Python 8,267 510 Updated May 3, 2024

Paper list about multimodal and large language models, only used to record papers I read in the daily arxiv for personal needs.

601 41 Updated Mar 6, 2025

Awesome List of Vision Language Prompt Papers

43 1 Updated Nov 9, 2023

Course to get into Large Language Models (LLMs) with roadmaps and Colab notebooks.

Jupyter Notebook 47,573 5,053 Updated Jan 22, 2025

Official code for the Paper "RaDialog: A Large Vision-Language Model for Radiology Report Generation and Conversational Assistance"

Python 90 12 Updated Feb 21, 2025

MMICL, a state-of-the-art VLM with in-context learning ability, from ICL, PKU

Python 46 Updated Sep 29, 2023