Zheng Wu wuzheng2326

I'm a CS student at HuaZhong University of Science and Technology (HUST).

2 followers · 3 following

HuaZhong University of Science and Technology

Stars

Haoyu-ha / ALMT

Learning Language-guided Adaptive Hyper-modality Representation for Multimodal Sentiment Analysis

Python 84 9 Updated Oct 20, 2024

aoqzhu / KEBR

Codes for KEBR: Knowledge Enhanced Self-Supervised Balanced Representation for Multimodal Sentiment Analysis

Python 4 Updated Aug 10, 2024

DAMO-NLP-SG / VideoLLaMA2

VideoLLaMA 2: Advancing Spatial-Temporal Modeling and Audio Understanding in Video-LLMs

Python 993 65 Updated Nov 20, 2024

FunAudioLLM / SenseVoice

Multilingual Voice Understanding Model

Python 3,890 349 Updated Nov 29, 2024

WHB139426 / Grounded-Video-LLM

Grounded-VideoLLM: Sharpening Fine-grained Temporal Grounding in Video Large Language Models

Python 75 4 Updated Dec 15, 2024

minghangz / TFVTG

Python 22 1 Updated Sep 13, 2024

ArrowLuo / CLIP4Clip

An official implementation for "CLIP4Clip: An Empirical Study of CLIP for End to End Video Clip Retrieval"

Python 900 125 Updated Apr 12, 2024

GenjiB / ECLIPSE

Python 31 6 Updated Mar 10, 2023

EasonXiao-888 / UVCOM

[CVPR 2024] Bridging the Gap: A Unified Video Comprehension Framework for Moment Retrieval and Highlight Detection

Python 80 4 Updated Jul 17, 2024

haotian-liu / LLaVA

[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.

Python 20,969 2,305 Updated Aug 12, 2024

OpenGVLab / InternVideo

[ECCV2024] Video Foundation Models & Data for Multimodal Understanding

Python 1,509 92 Updated Dec 11, 2024

whwu95 / Cap4Video

【CVPR'2023 Highlight & TPAMI】Cap4Video: What Can Auxiliary Captions Do for Text-Video Retrieval?

Python 249 20 Updated Nov 29, 2024

lntzm / MESM

The official code of Towards Balanced Alignment: Modal-Enhanced Semantic Modeling for Video Moment Retrieval (AAAI2024)

Python 30 2 Updated Mar 29, 2024

skelemoa / tal-hmo

Fusional approaches for temporal action localization in untrimmed videos

Python 36 7 Updated Mar 17, 2023

qiuqiangkong / audioset_tagging_cnn

Python 1,385 258 Updated Jul 25, 2024

26hzhang / VSLNet

Span-based Localizing Network for Natural Language Video Localization (ACL 2020)

Python 104 17 Updated Oct 15, 2021

jayleicn / moment_detr

[NeurIPS 2021] Moment-DETR code and QVHighlights dataset

Python 278 44 Updated Apr 18, 2024

qiuqiangkong / panns_inference

Python 207 31 Updated Mar 5, 2024

TencentARC / UMT

UMT is a unified and flexible framework which can handle different input modality combinations, and output video moment retrieval and/or highlight detection results.

Python 194 19 Updated Apr 15, 2024

ai-dawang / PlugNPlay-Modules

Python 2,744 235 Updated Dec 27, 2024

hlchen23 / ADPN-MM

Repository for 23'MM accepted paper "Curriculum-Listener: Consistency- and Complementarity-Aware Audio-Enhanced Temporal Sentence Grounding"

Python 46 2 Updated Dec 30, 2023

xialeiliu / Awesome-Incremental-Learning

Awesome Incremental Learning

3,885 578 Updated Jan 2, 2025

SCZwangxiao / TSGVs-MM2023

ACM Multimedia 2023 - Temporal Sentence Grounding in Streaming Videos

Python 8 Updated Dec 6, 2024

MCG-NJU / MMN

[AAAI 2022] Negative Sample Matters: A Renaissance of Metric Learning for Temporal Grounding

Python 90 8 Updated Nov 16, 2022

QwenLM / Qwen

The official repo of Qwen (通义千问) chat & pretrained large language model proposed by Alibaba Cloud.

Python 15,156 1,223 Updated Dec 12, 2024

BaiShuanghao / Prompt-based-Distribution-Alignment

[AAAI 2024] Prompt-based Distribution Alignment for Unsupervised Domain Adaptation

Python 53 3 Updated Oct 1, 2024

gyxxyg / VTG-LLM

[AAAI 2025] VTG-LLM: Integrating Timestamp Knowledge into Video LLMs for Enhanced Video Temporal Grounding

Python 83 2 Updated Dec 10, 2024

yingsen1 / UniMD

UniMD: Towards Unifying Moment retrieval and temporal action Detection

Python 40 1 Updated Jul 5, 2024

piergiaj / pytorch-i3d

Python 991 252 Updated Jun 28, 2020

JonghwanMun / LGI4temporalgrounding

Repository for the CVPR-20 paper "Local-Global Video-Text Interactions for Temporal Grounding"

Python 130 18 Updated Jul 5, 2021
