wuzheng2326
  • HuaZhong University of Science and Technology

44 results for source starred repositories

[ECCV2024] The official implementation of "Listen to Look into the Future: Audio-Visual Egocentric Gaze Anticipation".

Python 7 1 Updated Nov 3, 2024

[EMNLP 2024] Official code for "Enhancing Temporal Modeling of Video LLMs via Time Gating"

Python 6 Updated Oct 10, 2024

Official PyTorch repository for "QD-DETR: Query-Dependent Video Representation for Moment Retrieval and Highlight Detection" (CVPR 2023 Paper)

Python 218 15 Updated Nov 21, 2023

Learning Language-guided Adaptive Hyper-modality Representation for Multimodal Sentiment Analysis

Python 84 9 Updated Oct 20, 2024

Code for KEBR: Knowledge Enhanced Self-Supervised Balanced Representation for Multimodal Sentiment Analysis

Python 4 Updated Aug 10, 2024

VideoLLaMA 2: Advancing Spatial-Temporal Modeling and Audio Understanding in Video-LLMs

Python 1,028 67 Updated Jan 10, 2025

Multilingual Voice Understanding Model

Python 4,138 366 Updated Jan 8, 2025

Grounded-VideoLLM: Sharpening Fine-grained Temporal Grounding in Video Large Language Models

Python 75 4 Updated Dec 15, 2024
Python 25 2 Updated Sep 13, 2024

An official implementation for "CLIP4Clip: An Empirical Study of CLIP for End to End Video Clip Retrieval"

Python 904 125 Updated Apr 12, 2024
Python 31 6 Updated Mar 10, 2023

[CVPR 2024] Bridging the Gap: A Unified Video Comprehension Framework for Moment Retrieval and Highlight Detection

Python 83 4 Updated Jul 17, 2024

[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.

Python 21,157 2,327 Updated Aug 12, 2024

[ECCV2024] Video Foundation Models & Data for Multimodal Understanding

Python 1,562 97 Updated Jan 17, 2025

[CVPR 2023 Highlight & TPAMI] Cap4Video: What Can Auxiliary Captions Do for Text-Video Retrieval?

Python 230 20 Updated Nov 29, 2024

The official code of Towards Balanced Alignment: Modal-Enhanced Semantic Modeling for Video Moment Retrieval (AAAI2024)

Python 29 2 Updated Mar 29, 2024

Fusional approaches for temporal action localization in untrimmed videos

Python 36 7 Updated Mar 17, 2023

Span-based Localizing Network for Natural Language Video Localization (ACL 2020)

Python 104 17 Updated Oct 15, 2021

[NeurIPS 2021] Moment-DETR code and QVHighlights dataset

Python 282 45 Updated Apr 18, 2024

UMT is a unified and flexible framework that handles different combinations of input modalities and outputs video moment retrieval and/or highlight detection results.

Python 195 19 Updated Apr 15, 2024

Repository for the ACM MM 2023 paper "Curriculum-Listener: Consistency- and Complementarity-Aware Audio-Enhanced Temporal Sentence Grounding"

Python 46 2 Updated Dec 30, 2023

Awesome Incremental Learning

3,903 579 Updated Jan 2, 2025

ACM Multimedia 2023 - Temporal Sentence in Streaming Videos

Python 8 Updated Dec 6, 2024

[AAAI 2022] Negative Sample Matters: A Renaissance of Metric Learning for Temporal Grounding

Python 90 8 Updated Nov 16, 2022

The official repo of Qwen (通义千问) chat & pretrained large language model proposed by Alibaba Cloud.

Python 15,419 1,245 Updated Dec 12, 2024

[AAAI 2024] Prompt-based Distribution Alignment for Unsupervised Domain Adaptation

Python 55 3 Updated Oct 1, 2024

[AAAI 2025] VTG-LLM: Integrating Timestamp Knowledge into Video LLMs for Enhanced Video Temporal Grounding

Python 86 2 Updated Dec 10, 2024