This repository contains the implementation of our NeurIPS'24 paper "Temporal Sentence Grounding with Relevance Feedback in Videos".

Python 6 Updated Dec 10, 2024

Official Repository of the paper "On the Consistency of Video Large Language Models in Temporal Comprehension".

3 Updated Jan 15, 2025

📎 + 🦾 CLIP-RT: Learning Language-Conditioned Robotic Policies from Natural Language Supervision

Python 9 1 Updated Nov 8, 2024

This is a collection of research papers for Self-Correcting Large Language Models with Automated Feedback.

486 28 Updated Oct 28, 2024

[ACL 2024 🔥] Video-ChatGPT is a video conversation model capable of generating meaningful conversation about videos. It combines the capabilities of LLMs with a pretrained visual encoder adapted fo…

Python 1,273 110 Updated Aug 27, 2024

SelecMix: Debiased Learning by Contradicting-pair Sampling (NeurIPS 2022)

Python 12 2 Updated Jun 5, 2024

Fine-Grained Causal Dynamics Learning with Quantization for Improving Robustness in Reinforcement Learning (ICML 2024)

Python 11 1 Updated Jun 5, 2024

FreeVA: Offline MLLM as Training-Free Video Assistant

Python 54 Updated Jun 9, 2024

A trend starts from "Chain of Thought Prompting Elicits Reasoning in Large Language Models".

1,985 132 Updated Oct 5, 2023

[ECCV2024] VideoMamba: State Space Model for Efficient Video Understanding

Python 889 66 Updated Jul 6, 2024

[CVPR2024 Highlight][VideoChatGPT] ChatGPT with video understanding! And many more supported LMs such as miniGPT4, StableLM, and MOSS.

Python 3,144 257 Updated Jan 18, 2025

Inference code for Llama models

Python 57,267 9,662 Updated Aug 18, 2024

ChatGPT, GenerativeAI and LLMs Timeline

947 59 Updated May 19, 2024

[CVPR'2024 Highlight] Official PyTorch implementation of the paper "VTimeLLM: Empower LLM to Grasp Video Moments".

Python 244 12 Updated Jun 13, 2024

Code for the paper "ViperGPT: Visual Inference via Python Execution for Reasoning"

Jupyter Notebook 1,667 120 Updated Jan 29, 2024

VTG-GPT: Tuning-Free Zero-Shot Video Temporal Grounding with GPT

Python 94 6 Updated Aug 26, 2024

Code release for "Learning Video Representations from Large Language Models"

Python 499 45 Updated Oct 1, 2023

[CVPR 2024] TimeChat: A Time-sensitive Multimodal Large Language Model for Long Video Understanding

Python 326 30 Updated Nov 19, 2024

Python 56 Updated Apr 24, 2024

[EMNLP 2022] Official Pytorch code for "Modal-specific Pseudo Query Generation for Video Corpus Moment Retrieval"

Python 9 Updated May 28, 2024

[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.

Python 21,163 2,327 Updated Aug 12, 2024

Can I Trust Your Answer? Visually Grounded Video Question Answering (CVPR'24, Highlight)

Python 63 1 Updated Jul 1, 2024

🔥🔥🔥Latest Papers, Codes and Datasets on Vid-LLMs.

1,836 90 Updated Jan 15, 2025

Code release for ActionFormer (ECCV 2022)

Python 452 81 Updated Apr 11, 2024

Experiments and data for the paper "When and why vision-language models behave like bags-of-words, and what to do about it?" Oral @ ICLR 2023

Python 263 15 Updated Jun 7, 2023

Official pytorch repository for CG-DETR "Correlation-guided Query-Dependency Calibration in Video Representation Learning for Temporal Grounding"

Python 121 11 Updated Aug 21, 2024

[ICLR 2024] Fine-tuning LLaMA to follow Instructions within 1 Hour and 1.2M Parameters

Python 5,802 377 Updated Mar 14, 2024

[WACV 2025] Official Pytorch code for "Background-aware Moment Detection for Video Moment Retrieval"

Python 14 Updated Oct 2, 2024

[IROS 2023] GVCCI: Lifelong Learning of Visual Grounding for Language-Guided Robotic Manipulation

Python 6 Updated Apr 23, 2024

🌈 PyTorch Implementation for EMNLP'21 Findings "Reasoning Visual Dialog with Sparse Graph Learning and Knowledge Transfer"

Python 13 4 Updated Feb 1, 2023