
CVPR-2022-Papers

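Official website: https://cvpr2022.thecvf.com/

Conference dates: June 19-24, 2022

❣❣❣ The accepted papers for CVPR 2022 have been announced: 2067 in total! Preprints are being released gradually, and this document will keep being updated, so stay tuned!

❣❣❣ To download all the papers as one package, reply "paper" to the 【我爱计算机视觉】 WeChat official account. As of June 13, 733+2 papers have been collected.

Categorized survey papers from previous years: ↘️CV-Surveys (under construction)

Categorized paper lists for 2022:

↘️CVPR-2022-Papers ↘️WACV-2022-Papers

Categorized paper lists for 2021:

↘️ICCV-2021-Papers ↘️CVPR-2021-Papers

Categorized paper lists for 2020:

↘️CVPR-2020-Papers ↘️ECCV-2020-Papers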

🐱	🐶	🐯	🐺
1.Others	2.Image Segmentation	3.Image Processing	4.Image Captioning
5.Object Detection	6.Object Tracking	7.Point Cloud	8.Action Detection
9.Human Pose Estimation	10.3D Vision	11.Face	12.Image-to-Image Translation
13.GAN	14.Video	15.Transformer	16.Semi/self-supervised learning
17.Medical Image	18.Person Re-Identification	19.Neural Architecture Search	20.Autonomous vehicles
21.UAV/Remote Sensing/Satellite Image	22.Image Synthesis/Generation	23.Image Retrieval	24.Super-Resolution
25.Fine-Grained/Image Classification	26.GCN/GNN	27.Object Pose Estimation	28.Style Transfer
29.Augmented Reality/Virtual Reality/Robotics	30.Visual Question Answering	31.Vision-Language	32.Data Augmentation
33.Human-Object Interaction	34.Model Compression/Knowledge Distillation/Pruning	35.OCR	36.Optical Flow
37.Contrastive Learning	38.Meta-Learning	39.Continual Learning	40.Adversarial Learning
41.Incremental Learning	42.Metric Learning	43.Multi-Task Learning	44.Federated Learning
45.Dense Prediction	46.Scene Graph Generation	47.Few/Zero-Shot Learning/Domain Generalization/Adaptation	48.Visual Grounding
49.Image Geo-localization	50.Anomaly Detection	51.Optics, Geometry, and Light Field Imaging	52.Human Motion Forecasting
53.Sign Language Translation	54.Dataset	55.Novel View Synthesis	56.Sound
57.Gaze Estimation	58.Neural rendering	59.Animation	60.Visual Emotion Analysis

Machine Translation

VALHALLA: Visual Hallucination for Machine Translation
🏠project

Object Counting

Rethinking Spatial Invariance of Convolutional Networks for Object Counting
⭐code📰explainer

Computer-Aided Design (CAD)

Neural Face Identification in a 2D Wireframe Projection of a Manifold Object
⭐code

60.Visual Emotion Analysis

MDAN: Multi-level Dependent Attention Network for Visual Emotion Analysis

59.Animation

Image animation

Thin-Plate Spline Motion Model for Image Animation
Human animation
- Structured Local Radiance Fields for Human Avatar Modeling
3D character animation
- Skinning prediction
  - SkinningNet: Two-Stream Graph Convolutional Neural Network for Skinning Prediction of Synthetic Characters
    🏠project
3D dance generation
- Bailando: 3D Dance Generation by Actor-Critic GPT with Choreographic Memory

58.Neural rendering

57.Gaze Estimation

GazeOnce: Real-Time Multi-Person Gaze Estimation

56.Sound

Sound source localization
- Self-Supervised Predictive Learning: A Negative-Free Method for Sound Source Localization in Visual Scenes
  ⭐code

55.Novel View Synthesis

54.Dataset

53.Sign Language Translation

A Simple Multi-Modality Transfer Learning Baseline for Sign Language Translation

52.Human Motion Forecasting

51.Optics, Geometry, and Light Field Imaging

Light Field
- Occlusion-Aware Cost Constructor for Light Field Depth Estimation
  ⭐code📰brief explainer
Depth reconstruction
- Deep Hyperspectral-Depth Reconstruction Using Single Color-Dot Projection
  ⭐code🏠project📺video
Rolling shutter correction
- Learning Adaptive Warping for Real-World Rolling Shutter Correction
  ⭐code
Thermal infrared imaging
- Infrared Invisible Clothing: Hiding from Infrared Detectors at Multiple Angles in Real World
  😮oral

50.Anomaly Detection

49.Image Geo-localization

TransGeo: Transformer Is All You Need for Cross-view Image Geo-localization
⭐code
Visual geo-localization
- Rethinking Visual Geo-localization for Large-Scale Applications
  ⭐code
- Deep Visual Geo-localization Benchmark
  😮oral🏠project
Trajectory reconstruction
- MonoTrack: Shuttle trajectory reconstruction from monocular badminton video

48.Visual Grounding

Multi-View Transformer for 3D Visual Grounding
⭐code
Improving Visual Grounding with Visual-Linguistic Verification and Iterative Reasoning
⭐code
Visual grounding: locating a target in an image from a natural-language description (a very interesting line of research)

47.Few/Zero-Shot Learning/Domain Generalization/Adaptation

Few-shot
Zero-shot
Domain generalization
- Compound Domain Generalization via Meta-Knowledge Encoding
- Causality Inspired Representation Learning for Domain Generalization
- Towards Unsupervised Domain Generalization
  This work targets domain generalization (DG). It is the first paper to extend DG to the unsupervised learning setting, and it proposes a new research problem, unsupervised domain generalization (UDG).
- Out-of-domain generalization
  - The Two Dimensions of Worst-case Training and the Integrated Effect for Out-of-domain Generalization
Domain adaptation

46.Scene Graph Generation

HL-Net: Heterophily Learning Network for Scene Graph Generation
⭐code
Scene graph generation: a heterophily learning network
📰explainer
RU-Net: Regularized Unrolling Network for Scene Graph Generation
⭐code
Scene graph generation: a regularized unrolling network
📰explainer
The Devil is in the Labels: Noisy Label Correction for Robust Scene Graph Generation
⭐code

45.Dense Prediction

Does Robustness on ImageNet Transfer to Downstream Tasks?

44.Federated Learning

43.Multi-Task Learning

42.Metric Learning

Self-Taught Metric Learning without Labels

41.Incremental Learning

40.Adversarial Learning

39.Continual Learning

38.Meta-Learning

37.Contrastive Learning

36.Optical Flow

35.OCR

Scene text detection
- Towards End-to-End Unified Scene Text Detection and Layout Analysis
  ⭐code
- Pushing the Performance Limit of Scene Text Recognizer without Human Annotation
- Vision-Language Pre-Training for Boosting Scene Text Detectors
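  Vision-language pre-training for scene text detection. The code will be open-sourced; the repository address has not been announced yet.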
Text Spotting
- Text Spotting Transformers
  ⭐code📰brief explainer
Logo design
- Aesthetic Text Logo Synthesis via Content-aware Layout Inferring
  ⭐code
  📰CVPR 2022 | Peking University and Tencent propose a text logo generation model with designer-level creativity
Font generation
- XMP-Font: Self-Supervised Cross-Modality Pre-training for Few-Shot Font Generation
- (Oral)Look Closer to Supervise Better: One-Shot Font Generation via Component-Based Discriminator
  Font generation (a direction with considerable commercial value)
- Few-Shot Font Generation by Learning Fine-Grained Local Styles
Text recognition
- Open-set Text Recognition via Character-Context Decoupling
Table structure recognition
- Neural Collaborative Graph Machines for Table Structure Recognition
  📰explainer

34.Model Compression/Knowledge Distillation/Pruning

33.Human-Object Interaction

32.Data Augmentation

31.Vision-Language

30.Visual Question Answering

29.Augmented Reality/Virtual Reality/Robotics

Object goal navigation
- Online Learning of Reusable Abstract Models for Object Goal Navigation
try-on
- Dressing in the Wild by Watching Dance Videos
  🏠project
- Style-Based Global Appearance Flow for Virtual Try-On
  ⭐code
- ClothFormer: Taming Video Virtual Try-on in All Module
  😮oral⭐code🏠project📰explainer
AR
- Episodic Memory Question Answering
  😮oral⭐code
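  AI assistant: episodic memory question answering (a new augmented-reality task; both the data and the code will be open-sourced)
Robotics
- Hand-object pose estimation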
  - ArtiBoost: Boosting Articulated 3D Hand-Object Pose Estimation via Online Exploration and Synthesis
    ⭐code
    📰brief explainer

28.Style Transfer

Pastiche Master: Exemplar-Based High-Resolution Portrait Style Transfer
⭐code
Industrial Style Transfer with Large-scale Geometric Warping and Content Preservation
⭐code
Motion style transfer
- Style-ERD: Responsive and Coherent Online Motion Style Transfer
Motion transfer
- Structure-Aware Motion Transfer with Deformable Anchor Model
  ⭐code📰explainer
Scene stylization
- StylizedNeRF: Consistent 3D Scene Stylization as Stylized NeRF via 2D-3D Mutual Learning

27.Object Pose Estimation

26.GCN/GNN

25.Fine-Grained/Image Classification

Fine-grained classification
- Dynamic MLP for Fine-Grained Image Classification by Leveraging Geographical and Temporal Information
  ⭐code📰brief explainer 📓
Image classification
- DTFD-MIL: Double-Tier Feature Distillation Multiple Instance Learning for Histopathology Whole Slide Image Classification
  ⭐code
- Contrastive Test-Time Adaptation
  🏠project
Few-shot classification
- CAD: Co-Adapting Discriminative Features for Improved Few-Shot Classification
- Matching Feature Sets for Few-Shot Image Classification
  ⭐code🏠project📺video
- Joint Distribution Matters: Deep Brownian Distance Covariance for Few-Shot Classification
  😮oral⭐code🏠project📰explainer
- Learning to Affiliate: Mutual Centralized Learning for Few-shot Classification
  📰explainer
- Generating Representative Samples for Few-Shot Classification
  ⭐code
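  📰brief explainer
  For few-shot classification, generating more representative samples and discarding non-representative ones improves classification and achieves SOTA results.
- Few-shot classification and segmentation (FS-CS)
  - Integrative Few-Shot Learning for Classification and Segmentation
Long-tailed recognition
Fine-grained recognition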
- Knowledge Mining with Scene Text for Fine-Grained Recognition
  ⭐code📰explainer
Multi-label classification
- Large Loss Matters in Weakly Supervised Multi-Label Classification
  ⭐code🏠project

24.Super-Resolution

23.Image Retrieval

22.Image Synthesis/Generation

Interactive Image Synthesis with Panoptic Layout Generation
Autoregressive Image Generation using Residual Quantization
⭐code📰brief explainer
GIRAFFE HD: A High-Resolution 3D-aware Generative Model
Arbitrary-Scale Image Synthesis
⭐code📰brief explainer
Multi-View Consistent Generative Adversarial Networks for 3D-aware Image Synthesis
⭐code📰explainer
Learning to Memorize Feature Hallucination for One-Shot Image Generation
📰explainer
Text-guided image manipulation
- ManiTrans: Entity-Level Text-Guided Image Manipulation via Token-wise Semantic Alignment and Generation
  😮oral🏠project
Pose-guided image synthesis
- Exploring Dual-task Correlation for Pose Guided Person Image Generation
  ⭐code📰brief explainer
Text-to-image synthesis
Image translation
- FlexIT: Towards Flexible Semantic Image Translation
- A Style-aware Discriminator for Controllable Image Translation
Image generation

21.UAV/Remote Sensing/Satellite Image

Remote sensing image fusion
- HyperTransformer: A Textural and Spectral Feature Fusion Transformer for Pansharpening
  ⭐code📰brief explainer
Aerial image segmentation
- Revisiting Near/Remote Sensing with Geospatial Attention

20.Autonomous vehicles

Autonomous driving
Lane detection
- Rethinking Efficient Lane Detection via Curve Modeling
  ⭐code📰brief explainer 📓
- Towards Driving-Oriented Metric for Lane Detection Models
- A Keypoint-based Global Association Network for Lane Detection
  ⭐code📰explainer
- Monocular 3D lane detection
  - ONCE-3DLanes: Building Monocular 3D Lane Detection
    ⭐code
    Another step forward for lane detection
Lane descriptors
- Eigenlanes: Data-Driven Lane Descriptors for Structurally Diverse Lanes
  ⭐code
- CLRNet: Cross Layer Refinement Network for Lane Detection
  📰explainer
Behavior prediction
- 🐦️JRDB-Act: A Large-scale Dataset for Spatio-temporal Action, Social Group and Activity Detection
Relighting of autonomous driving scenes
- SIMBAR: Single Image-Based Scene Relighting For Effective Data Augmentation For Automated Driving Vision Tasks
  🏠project

19.Neural Architecture Search

🐦️ISNAS-DIP: Image-Specific Neural Architecture Search for Deep Image Prior
Arch-Graph: Acyclic Architecture Relation Predictor for Task-Transferable Neural Architecture Search
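⭐code📰explainer
GPUNet: Searching the Deployable Convolution Neural Networks for GPUs
Neural architecture search for lightweight networks targeting GPU deployment (reported to be faster than Google's EfficientNet-X family and Meta's FBNetV3, with even better accuracy; the authors are from NVIDIA)

18.Person Re-Identification

17.Medical Image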

Temporal Context Matters: Enhancing Single Image Prediction with Disease Progression Representations
BoostMIS: Boosting Medical Image Semi-supervised Learning with Adaptive Pseudo Labeling and Informative Active Annotation
DeepLIIF: An Online Platform for Quantification of Clinical Pathology Slides
DiRA: Discriminative, Restorative, and Adversarial Learning for Self-supervised Medical Image Analysis
⭐code📰explainer
Surpassing the Human Accuracy: Detecting Gallbladder Cancer from USG Images with Curriculum Learning
⭐code🏠project
3D bioprinting
- Generating 3D Bio-Printable Patches Using Wound Segmentation and Reconstruction to Treat Diabetic Foot Ulcers
  Uses wound segmentation and reconstruction to generate 3D bio-printable patches for treating diabetic foot ulcers
SR (MRI)
- Transformer-empowered Multi-scale Contextual Matching and Aggregation for Multi-contrast MRI Super-resolution
  ⭐code
Medical image registration
- Affine Medical Image Registration with Coarse-to-Fine Vision Transformer
  ⭐code
Medical image analysis
- FIBA: Frequency-Injection based Backdoor Attack in Medical Image Analysis
  ⭐code📰explainer
Automatic report generation
- Cross-modal Clinical Graph Transformer for Ophthalmic Report Generation

16.Semi/self-supervised learning

Self-supervised
Semi-supervised
Weakly supervised
- P3IV: Probabilistic Procedure Planning from Instructional Videos with Weak Supervision
  ⭐code
  A weakly supervised method for probabilistic procedure planning from instructional videos

15.Transformer

14.Video

Stochastic Backpropagation: A Memory Efficient Strategy for Training Video Models
😮oral
Action segmentation
- Unsupervised Activity Segmentation by Joint Representation Learning and Online Clustering
  📺video
- Weakly-Supervised Online Action Segmentation in Multi-View Instructional Videos
Action understanding
- How Do You Do It? Fine-Grained Action Understanding with Pseudo-Adverbs
- Bridge-Prompt: Towards Ordinal Action Understanding in Instructional Videos
  ⭐code
Video Copy Detection
- A Large-scale Comprehensive Dataset and Copy-overlap Aware Evaluation Protocol for Segment-level Video Copy Detection
  ⭐code
Video synthesis
- Show Me What and Tell Me How: Video Synthesis via Multimodal Conditioning
  ⭐code
- 3D Moments from Near-Duplicate Photos
  🏠project
Video anomaly detection
- Generative Cooperative Learning for Unsupervised Video Anomaly Detection
- Bayesian Nonparametric Submodular Video Partition for Robust Anomaly Detection
Video surveillance
- Trajectory prediction
Video moment retrieval and highlight detection
- UMT: Unified Multi-modal Transformers for Joint Video Moment Retrieval and Highlight Detection
  ⭐code
- Learning Pixel-Level Distinctions for Video Highlight Detection
Video moment retrieval
- AxIoU: An Axiomatically Justified Measure for Video Moment Retrieval
Video prediction
- STRPM: A Spatiotemporal Residual Predictive Model for High-Resolution Video Prediction
- Continual Predictive Learning from Videos
  😮oral⭐code
Video individual counting
- DR.VIC: Decomposition and Reasoning for Video Individual Counting
  ⭐code
Video interpolation
- Many-to-many Splatting for Efficient Video Frame Interpolation
  ⭐code
- TimeReplayer: Unlocking the Potential of Event Cameras for Video Interpolation
- Long-term Video Frame Interpolation via Feature Propagation
- Time Lens++: Event-based Frame Interpolation with Parametric Non-linear Flow and Multi-scale Fusion
Visual correspondence (video)
- Locality-Aware Inter-and Intra-Video Reconstruction for Self-Supervised Correspondence Learning
  ⭐code
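Video recognition
- BEVT: BERT Pretraining of Video Transformers
  ⭐code
  📰A new paradigm for self-supervised pre-training of video Transformers: Fudan and Microsoft Cloud + AI set a new SOTA in video recognition
Video classification
- Zero-shot video classification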
  - Alignment-Uniformity aware Representation Learning for Zero-shot Video Classification
Video prediction
- Hand motion prediction
  - Joint Hand Motion and Interaction Hotspots Prediction from Egocentric Videos
    🏠project📺video
Video segmentation
- Modeling Motion with Multi-Modal Features for Text-Based Video Segmentation
  ⭐code
- VOS
  - Recurrent Dynamic Embedding for Video Object Segmentation
    ⭐code
  - Language-Bridged Spatial-Temporal Interaction for Referring Video Object Segmentation
    ⭐code🏠project
- Video instance segmentation (VIS)
  - Efficient Video Instance Segmentation via Tracklet Query and Proposal
    🏠project📺video📰brief explainer
  - Temporally Efficient Vision Transformer for Video Instance Segmentation
    😮oral⭐code📰explainer
- Video semantic segmentation
  - Coarse-to-Fine Feature Mining for Video Semantic Segmentation
    ⭐code
- Video panoptic segmentation
  - Video K-Net: A Simple, Strong, and Unified Baseline for Video Segmentation
    😮oral⭐code📰explainer
Video processing
- Video super-resolution
  - Reference-based Video Super-Resolution Using Multi-Camera Video Triplets
  - Learning Trajectory-Aware Transformer for Video Super-Resolution
    😮oral⭐code
  - Investigating Tradeoffs in Real-World Video Super-Resolution
    ⭐code📰explainer
  - BasicVSR++: Improving Video Super-Resolution with Enhanced Propagation and Alignment
    ⭐code🏠project📺video
    🏆Champion of the NTIRE 2021 Video Restoration and Enhancement Challenge
  - Look Back and Forth: Video Super-Resolution with Explicit Temporal Difference Modeling
    📰ETDM: video super-resolution with explicit temporal difference modeling
  - Memory-Augmented Non-Local Attention for Video Super-Resolution
    📰explainer
- Video restoration
  - Neural Global Shutter: Learn to Restore Video from a Rolling Shutter Camera with Global Reset Feature
    ⭐code
- Video inpainting
  - Towards An End-to-End Framework for Flow-Guided Video Inpainting
- Video demoireing
  - Video Demoireing with Relation-Based Temporal Consistency
    🏠project📺video
- Video deblurring
  - Multi-Scale Memory-Based Video Deblurring
- Video denoising
  - Dancing under the stars: video denoising in starlight
    ⭐code
- Film restoration
  - Bringing Old Films Back to Life
    ⭐code
Video representation learning
- TransRank: Self-supervised Video Representation Learning via Ranking-based Transformation Recognition
  😮oral⭐code📰explainer
- Self-supervised video representation learning
- Video contrastive learning
  - Probabilistic Representations for Video Contrastive Learning
Video decomposition
- Deformable Sprites for Unsupervised Video Decomposition
  😮oral🏠project
Video shadow detection
- Video Shadow Detection via Spatio-Temporal Interpolation Consistency Training
  ⭐code
Video frame interpolation
- IFRNet: Intermediate Feature Refine Network for Efficient Frame Interpolation
  📰explainer
- Video Frame Interpolation with Transformer
  ⭐code
  📰explainer
VSS
- Scene Consistency Representation Learning for Video Scene Segmentation
  ⭐code
  📰explainer
VSR
- Spatial-Temporal Space Hand-in-Hand: Spatial-Temporal Video Super-Resolution via Cycle-Projected Mutual Learning
  ⭐code
  📰explainer
Video reconstruction
- Context-Aware Video Reconstruction for Rolling Shutter Cameras
  ⭐code📰explainer
Video understanding
- Revisiting the "Video" in Video-Language Understanding
  😮oral⭐code

13.GAN

🐦️HyperInverter: Improving StyleGAN Inversion via Hypernetwork
🏠project
InsetGAN for Full-Body Image Generation
🏠project
📰1024x1024 resolution with striking results! InsetGAN: full-body image generation
Commonality in Natural Images Rescues GANs: Pretraining GANs with Generic and Privacy-free Synthetic Data
⭐code
Deep Image-based Illumination Harmonization
GAN-Supervised Dense Visual Alignment
😮oral⭐code🏠project📺video
📰CVPR 2022 Oral: GAN-supervised dense visual alignment, with open-source code
HairMapper: Removing Hair from Portraits Using GANs
⭐code
Polymorphic-GAN: Generating Aligned Samples across Multiple Domains with Learned Morph Maps
😮oral🏠project
Image manipulation detection
- Proactive Image Manipulation Detection
  ⭐code
Hair editing
- HairCLIP: Design Your Hair by Text and Reference Image
  ⭐code

12.Image-to-Image Translation

11.Face

Protecting Celebrities with Identity Consistency Transformer
Deepfake
- Voice-Face Homogeneity Tells Deepfake
  ⭐code📰brief explainer
Makeup transfer
- Protecting Facial Privacy: Generating Adversarial Identity Masks via Style-robust Makeup Transfer
Face recognition
- Local-Adaptive Face Recognition via Graph-based Meta-Clustering and Regularized Adaptation
- Killing Two Birds with One Stone: Efficient and Robust Training of Face Recognition CNNs by Partial FC
  ⭐code
- AdaFace: Quality Adaptive Margin for Face Recognition
  😮oral⭐code
Facial expression recognition
- Towards Semi-Supervised Deep Facial Expression Recognition with An Adaptive Confidence Margin
  ⭐code
3D face
- ImFace: A Nonlinear 3D Morphable Face Model with Implicit Neural Representations
- Learning to Restore 3D Face from In-the-Wild Degraded Images
  📰explainer
Face anti-spoofing
- PatchNet: A Simple Face Anti-Spoofing Framework via Fine-Grained Patch Recognition
Face forgery detection
- Exploring Frequency Adversarial Attacks for Face Forgery Detection
  📰brief explainer
Face swapping
- High-resolution Face Swapping via Latent Semantics Disentanglement
  ⭐code
Facial attribute classification
- Fair Contrastive Learning for Facial Attribute Classification
  ⭐code
Face Relighting
- Face Relighting with Geometrically Consistent Shadows
Face editing
- TransEditor: Transformer-Based Dual-Space GAN for Highly Controllable Facial Editing
  ⭐code🏠project
Face hallucination
- Escaping Data Scarcity for High-Resolution Heterogeneous Face Hallucination
Deepfake detection
- Detecting Deepfakes with Self-Blended Images
  😮oral⭐code
Face reconstruction
- JIFF: Jointly-aligned Implicit Face Function for High Quality Single View Clothed Human Reconstruction
  ⭐code🏠project📰explainer
Face capture
- EMOCA: Emotion Driven Monocular Face Capture and Animation
  🏠project
Head swapping
- Few-Shot Head Swapping in the Wild
  😮oral⭐code🏠project📺video📰explainer
Portrait distortion correction
- Semi-Supervised Wide-Angle Portraits Correction by Multi-Scale Transformer
  ⭐code📰explainer
3D face modeling
- Physically-guided Disentangled Implicit Rendering for 3D Face Modeling
  📰explainer
Face restoration
- Blind Face Restoration via Integrating Face Shape and Generative Priors
  📰explainer

10.3D Vision

9.Human Pose Estimation

COAP: Compositional Articulated Occupancy of People
⭐code🏠project📺video📰explainer
Context-Aware Sequence Alignment using 4D Skeletal Augmentation
😮oral⭐code🏠project
Multi-person pose estimation
- Learning Local-Global Contextual Adaptation for Multi-Person Pose Estimation
Video-based HPE
- Temporal Feature Alignment and Mutual Information Maximization for Video-Based Human Pose Estimation
  😮oral⭐code
3D pose
4D human capture
- H4D: Human 4D Modeling by Learning Neural Compositional Representation
Gesture generation
- Learning Hierarchical Cross-Modal Association for Co-Speech Gesture Generation
3D hand mesh estimation
- HandOccNet: Occlusion-Robust 3D Hand Mesh Estimation Network
3D shape generation
- Towards Implicit Text-Guided 3D Shape Generation
- 3D dog shape
  - BARC: Learning to Regress 3D Dog Shape from Images by Exploiting Breed Information
    🏠project
Motion capture
- Neural MoCon: Neural Motion Control for Physically Plausible Human Motion Capture
  🏠project
Arm-hand dynamic estimation
- Spatial-Temporal Parallel Transformer for Arm-Hand Dynamic Estimation
3D hand reconstruction
- LISA: Learning Implicit Shape and Appearance of Hands
  🏠project
3D human body shape
- OSSO: Obtaining Skeletal Shape from Outside
  ⭐code🏠project📺video📰explainer
Dense correspondence
- BodyMap: Learning Full-Body Dense Correspondence Map
  🏠project
3D human motion reconstruction
- Differentiable Dynamics for Articulated 3d Human Motion Reconstruction
3D human pose reconstruction
- Trajectory Optimization for Physics-Based Reconstruction of 3d Human Pose from Monocular Video

8.Action Detection

Action detection
- Colar: Effective and Efficient Online Action Detection by Consulting Exemplars
- Learnable Irrelevant Modality Dropout for Multimodal Action Recognition on Modality-Specific Annotated Videos
- End-to-End Semi-Supervised Learning for Video Action Detection
- SPAct: Self-supervised Privacy Preservation for Action Recognition
  ⭐code
- Temporal Alignment Networks for Long-term Video
  😮oral⭐code🏠project📰brief explainer
- SOS! Self-supervised Learning Over Sets Of Handled Objects In Egocentric Action Recognition
- Zero-shot action recognition
  - Cross-modal Representation Learning for Zero-shot Action Recognition
    ⭐code
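    Zero-shot action recognition: cross-modal representation learning
- Few-shot action recognition
  - Hybrid Relation Guided Set Matching for Few-shot Action Recognition
    ⭐code📰explainer
- Temporal action detection
  - An Empirical Study of End-to-End Temporal Action Detection
    ⭐code📰brief explainer
Temporal action localization
Repetitive action counting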
- TransRAC: Encoding Multi-scale Temporal Correlation with Transformers for Repetitive Action Counting
  😮oral⭐code🏠project
Group activity recognition
- Dual-AI: Dual-path Action Interaction Learning for Group Activity Recognition
  😮oral
- Detector-Free Weakly Supervised Group Activity Recognition
Action quality assessment
- FineDiving: A Fine-grained Dataset for Procedure-aware Action Quality Assessment
  😮oral⭐code🏠project📰explainer

7.Point Cloud

Shape-invariant 3D Adversarial Point Clouds
⭐code
AziNorm: Exploiting the Radial Symmetry of Point Cloud for Azimuth-Normalized 3D Perception
REGTR: End-to-end Point Cloud Correspondences with Transformers
⭐code
Equivariant Point Cloud Analysis via Learning Orientations for Message Passing
⭐code
Text2Pos: Text-to-Point-Cloud Cross-Modal Localization
Deformation and Correspondence Aware Unsupervised Synthetic-to-Real Scene Flow Estimation for Point Clouds
⭐code
Self-Supervised Arbitrary-Scale Point Clouds Upsampling via Implicit Neural Representation
⭐code📰explainer
3DeformRS: Certifying Spatial Deformations on Point Clouds
⭐code
Reconstructing Surfaces for Sparse Point Clouds with On-Surface Priors
⭐code📰explainer
Density-preserving Deep Point Cloud Compression
⭐code🏠project📰explainer
Surface Representation for Point Clouds
😮oral⭐code
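📰explainer
3D point cloud
- CrossPoint: Self-Supervised Cross-Modal Contrastive Learning for 3D Point Cloud Understanding
  ⭐code📰brief explainer
  CrossPoint is a simple self-supervised learning framework for 3D point cloud representation learning. Although the method is trained on a synthetic 3D object dataset, experiments on downstream tasks such as 3D object classification and 3D object part segmentation, on both synthetic and real-world datasets, demonstrate its effectiveness in learning transferable representations.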
- A Unified Query-based Paradigm for Point Cloud Understanding
- WarpingGAN: Warping Multiple Uniform Priors for Adversarial 3D Point Cloud Generation
  ⭐code
- 3D point cloud segmentation
  - Stratified Transformer for 3D Point Cloud Segmentation
    ⭐code
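Point cloud classification
- ART-Point: Improving Rotation Robustness of Point Cloud Classifiers via Adversarial Rotation
  ⭐code📰brief explainer 📓
Point cloud registration
- SC^2-PCR: A Second Order Spatial Compatibility for Efficient and Robust Point Cloud Registration
  ⭐code
  📰A second-order similarity measure that lets traditional registration methods outperform deep learning methods while matching their speed
Point cloud completion
Point cloud segmentation
- Contrastive Boundary Learning for Point Cloud Segmentation
  ⭐code📰explainer
- SemAffiNet: Semantic-Affine Transformation for Point Cloud Segmentation
  ⭐code📰explainer
Scene flow estimation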
- RCP: Recurrent Closest Point for Scene Flow Estimation on 3D Point Clouds

6.Object Tracking

TCTrack: Temporal Contexts for Aerial Tracking
⭐code📰brief explainer
📰TCTrack: a temporal context framework for aerial tracking
Correlation-Aware Deep Tracking
Global Tracking Transformers
⭐code
Unified Transformer Tracker for Object Tracking
⭐code
Global Tracking via Ensemble of Local Trackers
Unsupervised Learning of Accurate Siamese Tracking
⭐code
Transformer Tracking with Cyclic Shifting Window Attention
⭐code
Transformer tracking with a cyclic shifting window attention model. The method sets a new SOTA on five datasets: VOT2020, UAV123, LaSOT, TrackingNet, and GOT-10k.
Cannot See the Forest for the Trees: Aggregating Multiple Viewpoints to Better Classify Objects in Videos
⭐code
3D object tracking
- Beyond 3D Siamese Tracking: A Motion-Centric Paradigm for 3D Single Object Tracking in Point Clouds
  ⭐code📰brief explainer
Multi-object tracking
- Learning of Global Objective for Network Flow in Multi-Object Tracking
- MeMOT: Multi-Object Tracking with Memory
  😮oral
RGB-T tracking
- Visible-Thermal UAV Tracking: A Large-Scale Benchmark and New Baseline
  🏠project📰explainer
Visual tracking
- Ranking-Based Siamese Visual Tracking
  ⭐code📰explainer

5.Object Detection

DN-DETR: Accelerate DETR Training by Introducing Query DeNoising
⭐code📰brief explainer
Overcoming Catastrophic Forgetting in Incremental Object Detection via Elastic Response Distillation
⭐code
Beyond Bounding Box: Multimodal Knowledge Learning for Object Detection
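Object detectors are usually trained with bounding-box annotations only. The authors introduce language prompts and distill linguistic knowledge into the detection model, for a reported gain of 1.6-2.1%.
Dynamic Sparse R-CNN
Unknown-Aware Object Detection: Learning What You Don't Know from Videos in the Wild
⭐code📰brief explainer
Focal and Global Knowledge Distillation for Detectors
⭐code📰explainer
A knowledge distillation method for object detection: with only 30 lines of code it yields consistent gains on anchor-based and anchor-free, one-stage and two-stage detectors. The code is now open source.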
Group R-CNN for Weakly Semi-supervised Object Detection with Points
⭐code
📰explainer
Real-time Object Detection for Streaming Perception
⭐code📰explainer
Ev-TTA: Test-Time Adaptation for Event-Based Object Recognition
Learning to Prompt for Open-Vocabulary Object Detection with Vision-Language Model
⭐code
Optimal Correction Cost for Object Detection Evaluation
Expanding Low-Density Latent Regions for Open-Set Object Detection
⭐code
📰explainer
SIOD: Single Instance Annotated Per Category Per Image for Object Detection
📰explainer
Task-specific Inconsistency Alignment for Domain Adaptive Object Detection
⭐code
Zero-Query Transfer Attacks on Context-Aware Object Detectors
AdaMixer: A Fast-Converging Query-Based Object Detector
😮oral⭐code
Learning to Detect Mobile Objects from LiDAR Scans Without Labels
⭐code
Forecasting from LiDAR via Future Object Detection
⭐code
Target-aware Dual Adversarial Learning and a Multi-scenario Multi-Modality Benchmark to Fuse Infrared and Visible for Object Detection
😮oral
Multi-Granularity Alignment Domain Adaptation for Object Detection
Proper Reuse of Image Classification Features Improves Object Detection
⭐code
R(Det)^2: Randomized Decision Routing for Object Detection
Towards Robust Adaptive Object Detection under Noisy Annotations
⭐code
Entropy-based Active Learning for Object Detection with Progressive Diversity Constraint
Target-Relevant Knowledge Preservation for Multi-Source Domain Adaptive Object Detection
Interactive Segmentation and Visualization for Tiny Objects in Multi-megapixel Images
⭐code
Cross Domain Object Detection by Target-Perceived Dual Branch Distillation
⭐code
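Cross-domain object detection: target-perceived dual-branch distillation
Progressive End-to-End Object Detection in Crowded Scenes
⭐code
📰explainer
HCSC: Hierarchical Contrastive Selective Coding
⭐code
📰A new SOTA for self-supervised CNN pre-training: Shanghai Jiao Tong University, Mila, and ByteDance jointly propose a new hierarchical framework for self-supervised image representation learning
Recurrent Glimpse-based Decoder for Detection with Transformer
😮oral⭐code
📰explainer
Few-shot object detection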
- Sylph: A Hypernetwork Framework for Incremental Few-shot Object Detection
- Few-Shot Object Detection with Fully Cross-Transformer
Object localization
- Weakly Supervised Object Localization as Domain Adaption
  ⭐code📰brief explainer
- Bridging the Gap between Classification and Localization for Weakly Supervised Object Localization
- Object Localization under Single Coarse Point Supervision
  ⭐code
  📰explainer
3D object detection
- A Versatile Multi-View Framework for LiDAR-based 3D Object Detection with Guidance from Panoptic Segmentation
- Pseudo-Stereo for Monocular 3D Object Detection in Autonomous Driving
  ⭐code📰brief explainer
- Rope3D: The Roadside Perception Dataset for Autonomous Driving and Monocular 3D Object Detection Task
  🏠project
- Point2Seq: Detecting 3D Objects as Sequences
  ⭐code
- MonoDETR: Depth-aware Transformer for Monocular 3D Object Detection
  ⭐code
- LiDAR Snowfall Simulation for Robust 3D Object Detection
  😮oral⭐code
- CAT-Det: Contrastively Augmented Transformer for Multi-modal 3D Object Detection
- Homography Loss for Monocular 3D Object Detection
- HyperDet3D: Learning a Scene-conditioned 3D Object Detector
- DAIR-V2X: A Large-Scale Dataset for Vehicle-Infrastructure Cooperative 3D Object Detection
  ⭐code
- OccAM's Laser: Occlusion-based Attribution Maps for 3D Object Detectors on LiDAR Data
  ⭐code
- Focal Sparse Convolutional Networks for 3D Object Detection
  😮oral⭐code📰explainer 📓
- Rotationally Equivariant 3D Object Detection
  🏠project
- Bridged Transformer for Vision and Point Cloud 3D Object Detection
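  📰explainer
- Sparse Fuse Dense: Towards High Quality 3D Detection with Depth Completion
  📰explainer
- VISTA: Boosting 3D Object Detection via Dual Cross-VIew SpaTial Attention
  ⭐code
  📰South China University of Technology proposes VISTA: a plug-and-play dual cross-view spatial attention mechanism that achieves SOTA in 3D object detection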
- Diversity Matters: Fully Exploiting Depth Clues for Reliable Monocular 3D Object Detection
  😮oral
Camouflaged object detection
- Zoom In and Out: A Mixed-scale Triplet Network for Camouflaged Object Detection
  ⭐code
Omni-supervised object detection
- Omni-DETR: Omni-Supervised Object Detection with Transformers
  ⭐code
Semi-supervised object detection
- Dense Learning based Semi-Supervised Object Detection
  ⭐code📰explainer
Salient object detection
- Pyramid Grafting Network for One-Stage High Resolution Saliency Detection
  ⭐code📰explainer
  📰Ultra-high-resolution salient object detection with PGNet, a novel and efficient staggered grafting architecture (CVPR 2022)
- Learning from Pixel-Level Noisy Label: A New Perspective for Light Field Saliency Detection
  ⭐code📰explainer
- Bi-directional Object-context Prioritization Learning for Saliency Ranking
  ⭐code
Keypoint detection
- Self-Supervised Equivariant Learning for Oriented Keypoint Detection
- UKPGAN: A General Self-Supervised Keypoint Detector
  ⭐code
  📰brief explainer
Affordance grounding
- Learning Affordance Grounding from Exocentric Images
  ⭐code📰explainer
Image alignment
- Unsupervised Homography Estimation with Coplanarity-Aware GAN
Object attribute recognition
- Disentangling Visual Embeddings for Attributes and Objects
  😮oral⭐code

4.Image Captioning

3.Image Processing

Image restoration
- Attentive Fine-Grained Structured Sparsity for Image Restoration
  ⭐code📰explainer
Image inpainting
Image stitching
- Deep Rectangling for Image Stitching: A Learning Baseline
  😮oral⭐code📰brief explainer
Motion deblurring
- Unifying Motion Deblurring and Frame Interpolation with Events
image outpainting
- Diverse Plausible 360-Degree Image Outpainting for Efficient 3DCG Background Creation
  🏠project
Image aesthetics assessment
- Personalized Image Aesthetics Assessment with Rich Attributes
  🏠project
Image quality assessment
- Incorporating Semi-Supervised and Positive-Unlabeled Learning for Boosting Full Reference Image Quality Assessment
  ⭐code📰explainer
Image deraining
- Towards Robust Rain Removal Against Adversarial Attacks: A Comprehensive Benchmark Analysis and Beyond
  ⭐code
Image deblurring
- Learning to Deblur using Light Field Generated and Real Defocus Images
  ⭐code🏠project
Image denoising
- CVF-SID: Cyclic multi-Variate Function for Self-Supervised Image Denoising by Disentangling Noise from Image
  ⭐code
- NAN: Noise-Aware NeRFs for Burst-Denoising
De-rendering
- Learning sRGB-to-Raw-RGB De-rendering with Content-Aware Metadata
  ⭐code📰解读
图像增强
图像和谐化
- SCS-Co: Self-Consistent Style Contrastive Learning for Image Harmonization
  ⭐code
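Image Outpainting
- Scene Graph Expansion for Semantics-Guided Image Outpainting
  This paper tackles an interesting problem: by expanding an image's scene graph, it generates semantically guided content beyond the image borders, which can help designers quickly produce natural, harmonious image extensions.
Semantic Image Matching
- TransforMatcher: Match-to-Match Attention for Semantic Correspondence
  📰explainer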

2.Image Segmentation

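- FocalClick: Towards Practical Interactive Image Segmentation
  ⭐code📰brief explainer
- Semantic-Aware Domain Generalized Segmentation
  😮oral⭐code
- ReSTR: Convolution-free Referring Image Segmentation Using Transformers
- Panoptic Neural Fields: A Semantic Object-Aware Neural Scene Representation
  🏠project
  Panoptic Neural Fields is a semantic, object-aware neural scene representation newly proposed by Google. It can be used effectively for novel view synthesis, 2D panoptic segmentation, 3D scene editing, multi-view depth prediction, and other tasks. It may well become another trend-setting direction.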
Instance Segmentation
- E2EC: An End-to-End Contour-based Method for High-Quality High-Speed Instance Segmentation
  ⭐code📰brief explainer
- Sparse Instance Activation for Real-Time Instance Segmentation
  ⭐code
- SharpContour: A Contour-based Boundary Refinement Approach for Efficient and Accurate Instance Segmentation
  🏠project
- Open-World Instance Segmentation: Exploiting Pseudo Ground Truth From Learned Pairwise Affinity
  ⭐code🏠project
- DArch: Dental Arch Prior-assisted 3D Tooth Instance Segmentation
- Relieving Long-tailed Instance Segmentation via Pairwise Class Balance
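  ⭐code📰explainer
- ContrastMask: Contrastive Learning to Segment Every Thing
  📰explainer
  A partially supervised instance segmentation algorithm based on pixel-level contrastive learning
- Semi-supervised instance segmentation
  - Noisy Boundaries: Lemon or Lemonade for Semi-supervised Instance Segmentation?
- 3D instance segmentation
  - SoftGroup for 3D Instance Segmentation on Point Clouds
    ⭐code📰brief explainer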
- 🐦️FreeSOLO: Learning to Segment Objects without Annotations
Semantic Segmentation
Action Segmentation
- Weakly-Supervised Online Action Segmentation in Multi-View Instructional Videos
Scene Parsing
- FLOAT: Factorized Learning of Object Attributes for Improved Multi-object Multi-part Scene Parsing
  ⭐code
Foggy Scene Segmentation
- FIFO: Learning Fog-invariant Features for Foggy Scene Segmentation
  😮oral
Panoptic Segmentation
Matting
- Human Instance Matting via Mutual Guidance and Multi-Instance Refinement
  😮oral⭐code

1.Others

Papers not yet released

Camera Relocalization
- ❌[SceneSqueezer: Learning to Compress Scene for Camera Relocalization]
  😮oral
Camera Imaging
- ❌[Learning to Zoom Inside Camera Imaging Pipeline]
Homography Estimation
- ❌[Unsupervised Homography Estimation with Coplanarity-Aware GAN]
  ⭐code📰explainer
3D Human Reconstruction
- ❌[Putting People in their Place: Monocular Regression of 3D People in Depth]
  ⭐code📰explainer
Image Captioning
- ❌[Comprehending and Ordering Semantics for Image Captioning]
  📰explainer
Image Dehazing
- ❌[Self-augmented Unpaired Image Dehazing via Density and Depth Decomposition]
  📰explainer
Image-to-Image Translation
- ❌[Alleviating Semantics Distortion in Unsupervised Low-Level Image-to-Image Translation via Structure Consistency Constraint]
  📰explainer
Optical Flow
- ❌[Learning Optical Flow with Kernel Patch Attention]
  ⭐code📰explainer
Image Generation
- ❌[Modeling Image Composition for Complex Scene Generation]
  📰explainer
Continual Learning
- ❌[Continual Learning with Lifelong Vision Transformer]
  📰explainer
Meta-Learning
- ❌[Learning to Learn and Remember Super Long Multi-Domain Task Sequence]
  📰explainer
Object Detection
- ❌[Voxel Field Fusion for 3D Object Detection]
  📰explainer
- ❌[ISNet: Shape Matters for Infrared Small Target Detection]
  📰explainer
HOI
- ❌[Exploring Structure-aware Transformer over Interaction Proposals for Human-Object Interaction Detection]
  📰explainer
Video Modeling
- ❌[Stand-Alone Inter-Frame Attention in Video Models]
  📰explainer
Others
- ❌[RAGO: Recurrent Graph Optimizer For Multiple Rotation Averaging]
  ⭐code
- ❌[Learning to Collaborate in Decentralized Learning of Personalized Models]
  📰explainer
- ❌[MLP-3D: A MLP-like 3D Architecture with Grouped Time Mixing]
  📰explainer
Video Scene Segmentation
- ❌[Scene Consistency Representation Learning for Video Scene Segmentation]
  📰explainer
Image Captioning
- ❌[DIFNet: Boosting Visual Information Flow for Image Captioning]
  📰explainer
Pose
- ❌[Location-Free Human Pose Estimation]
  📰explainer
Few-Shot
- ❌[Ranking-Guided Distance Calibration for Cross-Domain Few-Shot Learning]
  📰explainer
- ❌[En-Compactness: Self-Distillation Embedding & Contrastive Generation for Generalized Zero-Shot Learning]
  📰explainer
Point Cloud
- ❌[Surface Representation for Point Clouds]
  📰explainer
- ❌[Deterministic Point Cloud Registration via Novel Transformation Decomposition]
  📰explainer
Face
- ❌[Evaluation-oriented Knowledge Distillation for Deep Face Recognition]
  📰explainer
- ❌[End-to-End Reconstruction-Classification Learning for Face Forgery Detection]
  📰explainer
Object Detection
- ❌[Thinking Camouflaged Object Detection in Frequency]
  📰explainer
Adversarial
- ❌[Efficient Data-free Model Stealing for Black-box Adversarial Attacks]
  📰explainer
Segmentation
- ❌[ISDNet: Integrating Shallow and Deep Networks for Efficient Ultra-high Resolution Segmentation]
  📰explainer
- ❌[HybridCR: Weakly-Supervised 3D Point Cloud Semantic Segmentation via Hybrid Contrastive Regularization]
  📰explainer
3D Scenes
- ❌[Canonical Voting: Towards Robust Oriented Bounding Box Detection in 3D Scenes]
  📰brief explainer
Pedestrian Trajectory Prediction
- ❌[Human Trajectory Prediction with Momentary Observation]
  📰brief explainer

AxIoU: An Axiomatically Justified Measure for Video Moment Retrieval

CVF-SID: Cyclic multi-Variate Function for Self-Supervised Image Denoising by Disentangling Noise from Image

Diverse Plausible 360-Degree Image Outpainting for Efficient 3DCG Background Creation

Source
[Two Systems in Thinking: Dual-System Transformer for Grounded Situation Recognition]
[Autoregressive Image Generation using Residual Quantization]
✔️Instance-wise Occlusion and Depth Orders in Natural Scenes
[Style Neophile: Constantly Seeking Novel Styles for Domain Generalization]
[ReSTR: Convolution-free Referring Image Segmentation Using Transformers]
[FIFO: Learning Fog-invariant Features for Foggy Scene Segmentation]
[TransforMatcher: Match-to-Match Attention for Semantic Correspondence]
[Reflection and Rotation Symmetry Detection via Equivariant Learning]
[Semi-supervised Semantic Segmentation with Error Localization Network]
[Future Transformer for Long-term Action Anticipation]
[Self-Taught Metric Learning without Labels]
✔️Fast Point Transformer
[Integrative Few-Shot Learning for Classification and Segmentation]
[Scene Painting via Semantic Image Synthesis]
[Detector-Free Weakly Supervised Group Activity Recognition]

