[Preprint] TRACE: Temporal Grounding Video LLM via Causal Event Modeling

Python 53 Updated Nov 8, 2024

Code for the paper "PointAttN: You Only Need Attention for Point Cloud Completion"

Jupyter Notebook 96 14 Updated Apr 1, 2024

[ICCV 2021 Oral] PoinTr: Diverse Point Cloud Completion with Geometry-Aware Transformers

Python 632 111 Updated Sep 27, 2024

Papers and Datasets about Point Cloud.

Python 2,548 307 Updated Aug 30, 2024

[MICCAI 2024] TeethDreamer: 3D Teeth Reconstruction from Five Intra-oral Photographs

Python 35 3 Updated Nov 25, 2024

Video Object Segmentation using Space-Time Memory Networks

Python 415 82 Updated Jun 3, 2020

[ECCV2024] VideoMamba: State Space Model for Efficient Video Understanding

Python 871 62 Updated Jul 6, 2024

OpenTAD is an open-source temporal action detection (TAD) toolbox based on PyTorch.

Python 202 14 Updated Dec 23, 2024

Code for the paper "Learning to Predict Task Progress by Self-Supervised Video Alignment" by Gerard Donahue and Ehsan Elhamifar, published at CVPR 2024.

Python 8 Updated Jul 26, 2024

[CVPR21] Visual Semantic Role Labeling for Video Understanding (https://arxiv.org/abs/2104.00990)

Python 57 8 Updated Aug 17, 2021

"Interaction-centric Spatio-Temporal Context Reasoning for Multi-Person Video HOI Recognition" ECCV 2024

Python 4 Updated Oct 2, 2024

Official Implementation of STAIR: Spatial-Temporal Reasoning with Auditable Intermediate Results for Video Question Answering, AAAI 2024

Python 5 Updated Feb 9, 2024

Official repository of ECCV 2024 paper - "HAT: History-Augmented Anchor Transformer for Online Temporal Action Localization"

Python 12 1 Updated Aug 23, 2024

[CVPR 2021] Actor-Context-Actor Relation Network for Spatio-temporal Action Localization

Python 211 39 Updated Oct 8, 2021

Video Event Extraction via Tracking Visual States of Arguments (AAAI 2023)

Python 11 1 Updated Feb 18, 2024

[ECCV 2024 Oral] C2C: Component-to-Composition Learning for Zero-Shot Compositional Action Recognition

Python 29 6 Updated Dec 7, 2024

[ECCV 2024] Empowering 3D Visual Grounding with Reasoning Capabilities

Python 62 2 Updated Oct 10, 2024

[ECCV 2024] official code for "Long-CLIP: Unlocking the Long-Text Capability of CLIP"

Python 714 34 Updated Aug 13, 2024
Jupyter Notebook 9 Updated Jun 21, 2024

Code release for Hu et al., "Language-Conditioned Graph Networks for Relational Reasoning", ICCV 2019

Python 94 18 Updated Aug 9, 2019

[ACL 2024 Findings] LLMs as Bridges: Reformulating Grounded Multimodal Named Entity Recognition

Python 25 Updated Sep 3, 2024

[CVPR 2023] Code for "Improving Visual Grounding by Encouraging Consistent Gradient-based Explanations"

Jupyter Notebook 19 2 Updated Oct 10, 2023

GroundVLP: Harnessing Zero-shot Visual Grounding from Vision-Language Pre-training and Open-Vocabulary Object Detection (AAAI 2024)

Jupyter Notebook 62 5 Updated Jan 2, 2024
Python 16 Updated Feb 7, 2024

[TPAMI 2024] PyTorch code for the paper "Context Disentangling and Prototype Inheriting for Robust Visual Grounding".

Python 17 2 Updated Oct 12, 2024

[ICME 2024 Oral] DARA: Domain- and Relation-aware Adapters Make Parameter-efficient Tuning for Visual Grounding

Python 17 Updated Oct 27, 2024

[TGRS 2024] Language-Guided Progressive Attention for Visual Grounding in Remote Sensing Images.

Python 31 3 Updated Nov 7, 2024

[ACM MM 2024] Hierarchical Multimodal Fine-grained Modulation for Visual Grounding.

Python 34 4 Updated Oct 18, 2024

[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.

Python 20,901 2,303 Updated Aug 12, 2024