Starred repositories
[ECCV2024] Official implementation of Crowd-SAM: SAM as a Smart Annotator for Object Detection in Crowded Scenes
[ICCV 2023] Efficient Video Action Detection with Token Dropout and Context Refinement
[CVPR 2021] Actor-Context-Actor Relation Network for Spatio-temporal Action Localization
Spatio-Temporal Action Localization System
Hiera: A fast, powerful, and simple hierarchical vision transformer.
Code repository for the paper "On the Benefits of 3D Pose and Tracking for Human Action Recognition" (CVPR 2023)
We have implemented Track #1 for ICME 2024: Spatial Action Localization on the Chaotic World dataset. Our mAP on the validation set reaches 26.62%, and if we directly use the officially provided chaos_tes…
Context-based Dialogue Act Recognition using Recurrent Neural Networks
Switchboard Dialog Act Corpus with Penn Treebank links
SlowFast-LLaVA: A Strong Training-Free Baseline for Video Large Language Models
The TalkMoves Dataset: K-12 mathematics lesson transcripts annotated for teacher and student discursive moves
Custom AVA dataset: a multi-person video dataset annotation method for spatio-temporal actions
Pre-Training with Whole Word Masking for Chinese BERT (Chinese BERT-wwm series of models)
Facebook AI Research Sequence-to-Sequence Toolkit written in Python.
Neural building blocks for speaker diarization: speech activity detection, speaker change detection, overlapped speech detection, speaker embedding
Official implementation of "Holmes-VAD: Towards Unbiased and Explainable Video Anomaly Detection via Multi-modal LLM"
PySlowFast: video understanding codebase from FAIR for reproducing state-of-the-art video models.
EDUVSUM is a multimodal neural architecture that utilizes state-of-the-art audio, visual and textual features to identify important temporal segments in educational videos.
[ACL 2024 🔥] Video-ChatGPT is a video conversation model capable of generating meaningful conversation about videos. It combines the capabilities of LLMs with a pretrained visual encoder adapted fo…
[ICCV 2021 Oral + TPAMI] Just Ask: Learning to Answer Questions from Millions of Narrated Videos
Code release for "Learning Video Representations from Large Language Models"
Code for paper "VideoTree: Adaptive Tree-based Video Representation for LLM Reasoning on Long Videos"
[AAAI 2025] VTG-LLM: Integrating Timestamp Knowledge into Video LLMs for Enhanced Video Temporal Grounding
Code for CVPR 2023 paper "Procedure-Aware Pretraining for Instructional Video Understanding"
[CVPR 2024] EvalCrafter: Benchmarking and Evaluating Large Video Generation Models