Uncovering the Missing Pattern: Unified Framework Towards Trajectory Imputation and Prediction |
Tracking Multiple Deformable Objects in Egocentric Videos |
Tracking through Containers and Occluders in the Wild |
TarViS: A Unified Approach for Target-based Video Segmentation |
VideoTrack: Learning to Track Objects via Video Transformer |
ARKitTrack: A New Diverse Dataset for Tracking using Mobile RGB-D Data |
A Dynamic Multi-Scale Voxel Flow Network for Video Prediction |
Representation Learning for Visual Object Tracking by Masked Appearance Transfer |
EqMotion: Equivariant Multi-Agent Motion Prediction with Invariant Interaction Reasoning |
Semi-Supervised Video Inpainting with Cycle Consistency Constraints |
Generalized Relation Modeling for Transformer Tracking |
Breaking the "Object" in Video Object Segmentation |
Unifying Short and Long-Term Tracking with Graph Hierarchies |
Simple Cues Lead to a Strong Multi-Object Tracker |
Unified Mask Embedding and Correspondence Learning for Self-Supervised Video Segmentation |
MOTRv2: Bootstrapping End-to-End Multi-Object Tracking by Pretrained Object Detectors |
SeqTrack: Sequence to Sequence Learning for Visual Object Tracking |
Joint Visual Grounding and Tracking with Natural Language Specification |
Boosting Video Object Segmentation via Space-Time Correspondence Learning |
Visual Prompt Multi-Modal Tracking |
OVTrack: Open-Vocabulary Multiple Object Tracking |
TransFlow: Transformer as Flow Learner |
Focus on Details: Online Multi-Object Tracking with Diverse Fine-grained Representation |
Autoregressive Visual Tracking |
Bootstrapping Objectness from Videos by Relaxed Common Fate and Visual Grouping |
Tangentially Elongated Gaussian Belief Propagation for Event-based Incremental Optical Flow Estimation |
Bridging Search Region Interaction with Template for RGB-T Tracking |
Efficient RGB-T Tracking via Cross-Modality Distillation |
MotionTrack: Learning Robust Short-Term and Long-Term Motions for Multi-Object Tracking |
Self-Supervised AutoFlow |
UTM: A Unified Multiple Object Tracking Model with Identity-Aware Feature Enhancement |
BiFormer: Learning Bilateral Motion Estimation via Bilateral Transformer for 4K Video Frame Interpolation |
Spatial-then-Temporal Self-Supervised Learning for Video Correspondence |
BundleSDF: Neural 6-DoF Tracking and 3D Reconstruction of Unknown Objects |
MED-VT: Multiscale Encoder-Decoder Video Transformer with Application to Object Segmentation |
Context-Aware Relative Object Queries to Unify Video Instance and Panoptic Segmentation |
Unsupervised Space-Time Network for Temporally-Consistent Segmentation of Multiple Motions |
Resource-Efficient RGBD Aerial Tracking |
MMVC: Learned Multi-Mode Video Compression with Block-based Prediction Mode Selection and Density-Adaptive Entropy Coding |
Streaming Video Model |
Weakly Supervised Class-Agnostic Motion Prediction for Autonomous Driving |
LSTFE-Net: Long Short-Term Feature Enhancement Network for Video Small Object Detection |
DistractFlow: Improving Optical Flow Estimation via Realistic Distractions and Pseudo-Labeling |
SCOTCH and SODA: A Transformer Video Shadow Detection Framework |
ZBS: Zero-Shot Background Subtraction via Instance-Level Background Modeling and Foreground Selection |
Frame-Event Alignment and Fusion Network for High Frame Rate Tracking |