Stars
Code for the paper "Spatial-Temporal Multi-Cuts for Online Multiple-Camera Vehicle Tracking"
FastPoseGait is a user-friendly and flexible repository that aims to help researchers get started on pose-based gait recognition quickly.
[AAAI 2024] UCMCTrack: Multi-Object Tracking with Uniform Camera Motion Compensation. UCMCTrack achieves SOTA on MOT17 using estimated camera parameters.
Code to reproduce the experiments described in "Do We Still Need Non-Maximum Suppression? Accurate Confidence Estimates and Implicit Duplication Modeling with IoU-Aware Calibration" (https://arxiv.…
TUM Traffic Dataset Development Kit
[ECCV 2020] Code and MultiviewX dataset for "Multiview Detection with Feature Perspective Transformation".
Proof-of-concept to export your planned tours from Komoot
The official code for the paper: https://openreview.net/forum?id=_PHymLIxuI
Implementation of Vision Transformer, a simple way to achieve SOTA in vision classification with only a single transformer encoder, in PyTorch
[ECCV 2022] This is the official implementation of BEVFormer, a camera-only framework for autonomous driving perception, e.g., 3D object detection and semantic map segmentation.
[WACVW 2023] A massive synthetic dataset for 3D multi-target multi-camera tracking and segmentation.
[NeurIPS 2022] Official implementation of PeRFception: Perception using Radiance Fields.
A lightweight, dependency-free Python library (and command-line utility) for downloading YouTube videos.
Temporally Efficient Vision Transformer for Video Instance Segmentation, CVPR 2022, Oral
Official implementation of "ST-HOI: A Spatial-Temporal Baseline for Human-Object Interaction Detection in Videos" (ACM ICMRW 2021)
Fast and flexible image augmentation library. Paper about the library: https://www.mdpi.com/2078-2489/11/2/125
A curated list of resources on temporal action localization/detection and related areas (e.g. temporal action proposal).
Graphormer is a general-purpose deep learning backbone for molecular modeling.
Official repository for A Coarse-to-Fine Dual Attention Network for Blind Face Completion
Code for "The Box Size Confidence Bias Harms Your Object Detector" (https://arxiv.org/abs/2112.01901)
OpenMMLab's Next Generation Video Understanding Toolbox and Benchmark
Scripts for downloading the AVA (Atomic Visual Actions) dataset (https://research.google.com/ava/) and postprocessing it.
Official code of ECCV 2020 paper "GSNet: Joint Vehicle Pose and Shape Reconstruction with Geometrical and Scene-aware Supervision". GSNet performs joint vehicle pose estimation and vehicle shape re…
Your PyTorch AI Factory - Flash enables you to easily configure and run complex AI recipes for over 15 tasks across 7 data domains
Interactive deep learning book with multi-framework code, math, and discussions. Adopted at 500 universities from 70 countries including Stanford, MIT, Harvard, and Cambridge.