Stars
Official repository of Uni-AdaFocus (TPAMI 2024).
JAX/Flax implementation of DeiT and DeiT-III (ViT)
Greedy Local Learning with Context Supply for training deep networks.
[TPAMI 2024] Probabilistic Contrastive Learning for Long-Tailed Visual Recognition
[IEEE TIP] Fine-grained Recognition with Learnable Semantic Data Augmentation
Smooth Diffusion: Crafting Smooth Latent Spaces in Diffusion Models (arXiv 2023 / CVPR 2024)
Official repository of Agent Attention (ECCV 2024)
Repository of "Train Once, Get a Family: State-Adaptive Balances for Offline-to-Online Reinforcement Learning" (NeurIPS 2023 Spotlight)
[NeurIPS 2023] Rank-DETR for High Quality Object Detection
[ICCV 2023] Adaptive Rotated Convolution for Rotated Object Detection
[IEEE TPAMI] Latency-aware Unified Dynamic Networks for Efficient Image Recognition
Official repository of FLatten Transformer (ICCV 2023)
Official implementation of Dynamic Perceiver
Official repository of ActiveNeRF (ECCV 2022)
PyTorch implementation of DAPrompt: https://arxiv.org/abs/2202.06687
Official implementation of A Mixture of Surprises for Unsupervised Reinforcement Learning
[ECCV 2022] Learning to Weight Samples for Dynamic Early-exiting Networks
[NeurIPS 2022] Latency-aware Spatial-wise Dynamic Networks
[ICLR 2023 Spotlight] GPViT: A High Resolution Non-Hierarchical Vision Transformer with Group Propagation
Code release for Deep Incubation (https://arxiv.org/abs/2212.04129)
[arXiv] Cross-Modal Adapter for Text-Video Retrieval
1.5−3.0× lossless training or pre-training speedup. An off-the-shelf, easy-to-implement algorithm for the efficient training of foundation visual backbones.
A curated reading list of research on Mixture-of-Experts (MoE).
A collection of 3D vision and language (e.g., 3D Visual Grounding, 3D Question Answering, and 3D Dense Captioning) papers and datasets.
[CVPR 2022] Pseudo-Q: Generating Pseudo Language Queries for Visual Grounding