Starred repositories of sigh23333

[ECCV2024] Official implementation of Crowd-SAM: SAM as a Smart Annotator for Object Detection in Crowded Scenes

Python 77 10 Updated Sep 21, 2024

Student Classroom Behavior dataset

Python 243 22 Updated Dec 13, 2024

[ICCV 2023] Efficient Video Action Detection with Token Dropout and Context Refinement

Python 30 4 Updated Sep 27, 2023

[CVPR 2021] Actor-Context-Actor Relation Network for Spatio-temporal Action Localization

Python 211 39 Updated Oct 8, 2021

Spatio-Temporal Action Localization System

Python 414 75 Updated May 21, 2022

Hiera: A fast, powerful, and simple hierarchical vision transformer.

Python 939 47 Updated Mar 2, 2024

Code repository for the paper "On the Benefits of 3D Pose and Tracking for Human Action Recognition", (CVPR 2023)

Jupyter Notebook 258 32 Updated Jan 19, 2024

We have implemented Track # 1 for ICME 2024: Spatial Action Localization on Chaotic World dataset. Our mAP on the validation set reaches 26.62%, and if we directly use officially provided chaos_tes…

Python 11 3 Updated Nov 11, 2024

Context-based Dialogue Act Recognition using Recurrent Neural Networks

Python 13 2 Updated Nov 13, 2021

Switchboard Dialog Act Corpus with Penn Treebank links

Python 142 40 Updated Dec 30, 2020

The second generation of YOWO action detector.

Python 227 32 Updated May 9, 2024

SlowFast-LLaVA: A Strong Training-Free Baseline for Video Large Language Models

Python 195 12 Updated Sep 16, 2024

The TalkMoves Dataset: K-12 mathematics lesson transcripts annotated for teacher and student discursive moves

Jupyter Notebook 25 5 Updated Feb 4, 2022

Custom AVA dataset: a multi-person video dataset annotation method for spatio-temporal actions

Python 111 19 Updated Jun 7, 2022

Pre-Training with Whole Word Masking for Chinese BERT (Chinese BERT-wwm model series)

Python 9,784 1,390 Updated Jul 31, 2023

Facebook AI Research Sequence-to-Sequence Toolkit written in Python.

Python 30,792 6,442 Updated Jan 9, 2025

Neural building blocks for speaker diarization: speech activity detection, speaker change detection, overlapped speech detection, speaker embedding

Jupyter Notebook 6,647 806 Updated Jan 9, 2025

Official implementation of "Holmes-VAD: Towards Unbiased and Explainable Video Anomaly Detection via Multi-modal LLM"

Python 93 3 Updated Jan 5, 2025

PySlowFast: video understanding codebase from FAIR for reproducing state-of-the-art video models.

Python 6,732 1,226 Updated Nov 26, 2024

EDUVSUM is a multimodal neural architecture that utilizes state-of-the-art audio, visual and textual features to identify important temporal segments in educational videos.

Python 20 4 Updated Mar 8, 2024

Video datasets

1,282 96 Updated Mar 8, 2023

[ACL 2024 🔥] Video-ChatGPT is a video conversation model capable of generating meaningful conversation about videos. It combines the capabilities of LLMs with a pretrained visual encoder adapted fo…

Python 1,261 110 Updated Aug 27, 2024

Video Annotation Tool

Vue 181 20 Updated Jun 18, 2024

[ICCV 2021 Oral + TPAMI] Just Ask: Learning to Answer Questions from Millions of Narrated Videos

Jupyter Notebook 118 15 Updated Sep 29, 2023

Code release for "Learning Video Representations from Large Language Models"

Python 499 45 Updated Oct 1, 2023

Code for paper "VideoTree: Adaptive Tree-based Video Representation for LLM Reasoning on Long Videos"

Python 90 3 Updated Aug 6, 2024

[AAAI 2025] VTG-LLM: Integrating Timestamp Knowledge into Video LLMs for Enhanced Video Temporal Grounding

Python 85 2 Updated Dec 10, 2024

Code for CVPR 2023 paper "Procedure-Aware Pretraining for Instructional Video Understanding"

Python 47 4 Updated Jul 6, 2023

[CVPR 2024] EvalCrafter: Benchmarking and Evaluating Large Video Generation Models

Jupyter Notebook 150 8 Updated Oct 3, 2024
Python 79 6 Updated Jul 30, 2024