Stars
Thesis-template-of-Shandong-University
LaTeX templates for papers; select your conference or journal by switching branches.
WebUI extension for ControlNet
[AAAI 2022] Negative Sample Matters: A Renaissance of Metric Learning for Temporal Grounding
Implementation for "Multilevel Language and Vision Integration for Text-to-Clip Retrieval"
Code for the paper: Semantic Conditioned Dynamic Modulation for Temporal Sentence Grounding in Videos
VideoX: a collection of video cross-modal models
Official Tensorflow Implementation of the AAAI-2020 paper "Temporally Grounding Language Queries in Videos by Contextual Boundary-aware Prediction"
Design patterns implemented in Java
Transferable Decoding with Visual Entities for Zero-Shot Image Captioning, ICCV 2023
[NeurIPS 2021] Moment-DETR code and QVHighlights dataset
[CVPR'23 Highlight] AutoAD: Movie Description in Context.
[ICCV 2023] The official PyTorch implementation of the paper: "Localizing Moments in Long Video Via Multimodal Guidance"
Minimal implementation of clipped objective Proximal Policy Optimization (PPO) in PyTorch
linrongc / youtube-8m
Forked from google/youtube-8m
Code of PhoenixLin (3rd place) in the 2nd Youtube8M Video Understanding Challenge
Ray is an AI compute engine. Ray consists of a core distributed runtime and a set of AI Libraries for accelerating ML workloads.
PyTorch implementation of soft actor critic
PyTorch version of Stable Baselines, reliable implementations of reinforcement learning algorithms.
CLIP (Contrastive Language-Image Pretraining), Predict the most relevant text snippet given an image
Tree-Structured Policy based Progressive Reinforcement Learning for Temporally Language Grounding in Video (AAAI2020)
Official pytorch implementation of "Explore-And-Match: Bridging Proposal-Based and Proposal-Free With Transformer for Sentence Grounding in Videos"
A bidirectional recurrent neural network model with attention mechanism for restoring missing punctuation in unsegmented text
Probabilistically split concatenated words using NLP based on English Wikipedia unigram frequencies.
Soldelli / Awesome-Temporal-Language-Grounding-in-Videos
Forked from rookiecm/Awesome-Temporal-Sentence-Grounding-in-Videos
A curated list of grounding natural language in video and related areas. :-)
TALL: Temporal Activity Localization via Language Query
MAD: A Scalable Dataset for Language Grounding in Videos from Movie Audio Descriptions