  • HuaZhong University of Science and Technology



Learning Language-guided Adaptive Hyper-modality Representation for Multimodal Sentiment Analysis

Python 84 9 Updated Oct 20, 2024

Codes for KEBR: Knowledge Enhanced Self-Supervised Balanced Representation for Multimodal Sentiment Analysis

Python 4 Updated Aug 10, 2024

VideoLLaMA 2: Advancing Spatial-Temporal Modeling and Audio Understanding in Video-LLMs

Python 993 65 Updated Nov 20, 2024

Multilingual Voice Understanding Model

Python 3,890 349 Updated Nov 29, 2024

Grounded-VideoLLM: Sharpening Fine-grained Temporal Grounding in Video Large Language Models

Python 75 4 Updated Dec 15, 2024
Python 22 1 Updated Sep 13, 2024

An official implementation for "CLIP4Clip: An Empirical Study of CLIP for End to End Video Clip Retrieval"

Python 900 125 Updated Apr 12, 2024
Python 31 6 Updated Mar 10, 2023

[CVPR 2024] Bridging the Gap: A Unified Video Comprehension Framework for Moment Retrieval and Highlight Detection

Python 80 4 Updated Jul 17, 2024

[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.

Python 20,969 2,305 Updated Aug 12, 2024

[ECCV2024] Video Foundation Models & Data for Multimodal Understanding

Python 1,509 92 Updated Dec 11, 2024

【CVPR'2023 Highlight & TPAMI】Cap4Video: What Can Auxiliary Captions Do for Text-Video Retrieval?

Python 249 20 Updated Nov 29, 2024

The official code of Towards Balanced Alignment: Modal-Enhanced Semantic Modeling for Video Moment Retrieval (AAAI2024)

Python 30 2 Updated Mar 29, 2024

Fusional approaches for temporal action localization in untrimmed videos

Python 36 7 Updated Mar 17, 2023

Span-based Localizing Network for Natural Language Video Localization (ACL 2020)

Python 104 17 Updated Oct 15, 2021

[NeurIPS 2021] Moment-DETR code and QVHighlights dataset

Python 278 44 Updated Apr 18, 2024

UMT is a unified and flexible framework which can handle different input modality combinations, and output video moment retrieval and/or highlight detection results.

Python 194 19 Updated Apr 15, 2024

Repository for 23'MM accepted paper "Curriculum-Listener: Consistency- and Complementarity-Aware Audio-Enhanced Temporal Sentence Grounding"

Python 46 2 Updated Dec 30, 2023

Awesome Incremental Learning

3,885 578 Updated Jan 2, 2025

ACM Multimedia 2023 - Temporal Sentence in Streaming Videos

Python 8 Updated Dec 6, 2024

[AAAI 2022] Negative Sample Matters: A Renaissance of Metric Learning for Temporal Grounding

Python 90 8 Updated Nov 16, 2022

The official repo of Qwen (通义千问) chat & pretrained large language model proposed by Alibaba Cloud.

Python 15,156 1,223 Updated Dec 12, 2024

[AAAI 2024] Prompt-based Distribution Alignment for Unsupervised Domain Adaptation

Python 53 3 Updated Oct 1, 2024

[AAAI 2025] VTG-LLM: Integrating Timestamp Knowledge into Video LLMs for Enhanced Video Temporal Grounding

Python 83 2 Updated Dec 10, 2024

UniMD: Towards Unifying Moment retrieval and temporal action Detection

Python 40 1 Updated Jul 5, 2024
Python 991 252 Updated Jun 28, 2020

Repository for the CVPR-20 paper "Local-Global Video-Text Interactions for Temporal Grounding"

Python 130 18 Updated Jul 5, 2021