CVPR 2022 Papers and Open-Source Projects (Papers with Code)

A collection of CVPR 2022 papers and their open-source code.

CVPR 2022 accepted paper ID list: https://drive.google.com/file/d/15JFhfPboKdUcIH9LdbCMUFmGq_JhaxhC/view

Note 1: Contributions are welcome. Open an issue to share CVPR 2022 papers and open-source projects!

Note 2: For papers from previous top CV conferences, other high-quality CV papers, and survey roundups, see: https://github.com/amusi/daily-paper-computer-vision

CVPR 2019

CVPR 2020

CVPR 2021

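To keep up with the latest high-quality CV papers, open-source projects, and learning materials, scan the QR code to join the CVer academic discussion group (CVer学术交流群). Learn from each other and improve together!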

CVPR 2022 Papers with Code: Table of Contents

Backbone
CLIP
NAS
NeRF
Visual Transformer
Data Augmentation
Object Detection
Visual Tracking
Semantic Segmentation
Instance Segmentation
Image Editing
Low-level Vision
Super-Resolution
3D Point Cloud
3D Object Detection
3D Human Pose Estimation
3D Semantic Scene Completion
3D Reconstruction
Depth Estimation
Lane Detection
Image Inpainting
Crowd Counting
Scene Graph Generation
Style Transfer
Watermarking
Datasets
New Tasks
Others

Backbone

MPViT: Multi-Path Vision Transformer for Dense Prediction

Paper: https://arxiv.org/abs/2112.11010
Code: https://github.com/youngwanLEE/MPViT

CLIP

HairCLIP: Design Your Hair by Text and Reference Image

Paper: https://arxiv.org/abs/2112.05142
Code: https://github.com/wty-ustc/HairCLIP

PointCLIP: Point Cloud Understanding by CLIP

Paper: https://arxiv.org/abs/2112.02413
Code: https://github.com/ZrrSkywalker/PointCLIP

Blended Diffusion for Text-driven Editing of Natural Images

Paper: https://arxiv.org/abs/2111.14818
Code: https://github.com/omriav/blended-diffusion

NAS

ISNAS-DIP: Image-Specific Neural Architecture Search for Deep Image Prior

Paper: https://arxiv.org/abs/2111.15362
Code: None

NeRF

Mip-NeRF 360: Unbounded Anti-Aliased Neural Radiance Fields

Homepage: https://jonbarron.info/mipnerf360/
Paper: https://arxiv.org/abs/2111.12077
Demo: https://youtu.be/YStDS2-Ln1s

Point-NeRF: Point-based Neural Radiance Fields

Homepage: https://xharlie.github.io/projects/project_sites/pointnerf/
Paper: https://arxiv.org/abs/2201.08845
Code: https://github.com/Xharlie/point-nerf

NeRF in the Dark: High Dynamic Range View Synthesis from Noisy Raw Images

Paper: https://arxiv.org/abs/2111.13679
Homepage: https://bmild.github.io/rawnerf/
Demo: https://www.youtube.com/watch?v=JtBS4KBcKVc

Visual Transformer

Backbone

MPViT: Multi-Path Vision Transformer for Dense Prediction

Paper: https://arxiv.org/abs/2112.11010
Code: https://github.com/youngwanLEE/MPViT

Applications

Language-based Video Editing via Multi-Modal Multi-Level Transformer

Paper: https://arxiv.org/abs/2104.01122
Code: None

MixSTE: Seq2seq Mixed Spatio-Temporal Encoder for 3D Human Pose Estimation in Video

Paper: https://arxiv.org/abs/2203.00859
Code: None

Embracing Single Stride 3D Object Detector with Sparse Transformer

Paper: https://arxiv.org/abs/2112.06375
Code: https://github.com/TuSimple/SST

Data Augmentation

TeachAugment: Data Augmentation Optimization Using Teacher Knowledge

Paper: https://arxiv.org/abs/2202.12513
Code: https://github.com/DensoITLab/TeachAugment

AlignMix: Improving representation by interpolating aligned features

Paper: https://arxiv.org/abs/2103.15375
Code: None

Object Detection

DN-DETR: Accelerate DETR Training by Introducing Query DeNoising

Paper: https://arxiv.org/abs/2203.01305
Code: https://github.com/FengLi-ust/DN-DETR

Localization Distillation for Dense Object Detection

Paper: https://arxiv.org/abs/2102.12252
Code: https://github.com/HikariTJU/LD
Chinese write-up: https://mp.weixin.qq.com/s/dxss8RjJH283h6IbPCT9vg

Visual Tracking

TCTrack: Temporal Contexts for Aerial Tracking

Paper: https://arxiv.org/abs/2203.01885
Code: https://github.com/vision4robotics/TCTrack

Semantic Segmentation

Weakly-Supervised Semantic Segmentation

Class Re-Activation Maps for Weakly-Supervised Semantic Segmentation

Paper: https://arxiv.org/abs/2203.00962
Code: https://github.com/zhaozhengChen/ReCAM

Semi-Supervised Semantic Segmentation

ST++: Make Self-training Work Better for Semi-supervised Semantic Segmentation

Paper: https://arxiv.org/abs/2106.05095
Code: https://github.com/LiheYoung/ST-PlusPlus

Instance Segmentation

Self-Supervised Instance Segmentation

FreeSOLO: Learning to Segment Objects without Annotations

Paper: https://arxiv.org/abs/2202.12181
Code: None

Video Instance Segmentation

Efficient Video Instance Segmentation via Tracklet Query and Proposal

Homepage: https://jialianwu.com/projects/EfficientVIS.html
Paper: https://arxiv.org/abs/2203.01853
Demo: https://youtu.be/sSPMzgtMKCE

Image Editing

Blended Diffusion for Text-driven Editing of Natural Images

Paper: https://arxiv.org/abs/2111.14818
Code: https://github.com/omriav/blended-diffusion

Low-level Vision

ISNAS-DIP: Image-Specific Neural Architecture Search for Deep Image Prior

Paper: https://arxiv.org/abs/2111.15362
Code: None

Super-Resolution

Video Super-Resolution

BasicVSR++: Improving Video Super-Resolution with Enhanced Propagation and Alignment

Paper: https://arxiv.org/abs/2104.13371
Code: https://github.com/open-mmlab/mmediting
Code: https://github.com/ckkelvinchan/BasicVSR_PlusPlus

3D Point Cloud

A Unified Query-based Paradigm for Point Cloud Understanding

Paper: https://arxiv.org/abs/2203.01252
Code: None

CrossPoint: Self-Supervised Cross-Modal Contrastive Learning for 3D Point Cloud Understanding

Paper: https://arxiv.org/abs/2203.00680
Code: https://github.com/MohamedAfham/CrossPoint

PointCLIP: Point Cloud Understanding by CLIP

Paper: https://arxiv.org/abs/2112.02413
Code: https://github.com/ZrrSkywalker/PointCLIP

3D Object Detection

Embracing Single Stride 3D Object Detector with Sparse Transformer

Paper: https://arxiv.org/abs/2112.06375
Code: https://github.com/TuSimple/SST

Canonical Voting: Towards Robust Oriented Bounding Box Detection in 3D Scenes

Paper: https://arxiv.org/abs/2011.12001
Code: https://github.com/qq456cvb/CanonicalVoting

3D Human Pose Estimation

MixSTE: Seq2seq Mixed Spatio-Temporal Encoder for 3D Human Pose Estimation in Video

Paper: https://arxiv.org/abs/2203.00859
Code: None

3D Semantic Scene Completion

MonoScene: Monocular 3D Semantic Scene Completion

Paper: https://arxiv.org/abs/2112.00726
Code: https://github.com/cv-rits/MonoScene

3D Reconstruction

BANMo: Building Animatable 3D Neural Models from Many Casual Videos

Homepage: https://banmo-www.github.io/
Paper: https://arxiv.org/abs/2112.12761
Code: https://github.com/facebookresearch/banmo

Depth Estimation

Monocular Depth Estimation

NeW CRFs: Neural Window Fully-connected CRFs for Monocular Depth Estimation

Paper: https://arxiv.org/abs/2203.01502
Code: None

OmniFusion: 360 Monocular Depth Estimation via Geometry-Aware Fusion

Paper: https://arxiv.org/abs/2203.00838
Code: None

Toward Practical Self-Supervised Monocular Indoor Depth Estimation

Paper: https://arxiv.org/abs/2112.02306
Code: None

Lane Detection

Rethinking Efficient Lane Detection via Curve Modeling

Image Inpainting

Incremental Transformer Structure Enhanced Image Inpainting with Masking Positional Encoding

Paper: https://arxiv.org/abs/2203.00867
Code: https://github.com/DQiaole/ZITS_inpainting

Crowd Counting

Leveraging Self-Supervision for Cross-Domain Crowd Counting

Paper: https://arxiv.org/abs/2103.16291
Code: None

Scene Graph Generation

SGTR: End-to-end Scene Graph Generation with Transformer

Paper: https://arxiv.org/abs/2112.12970
Code: None

Style Transfer

StyleMesh: Style Transfer for Indoor 3D Scene Reconstructions

Homepage: https://lukashoel.github.io/stylemesh/
Paper: https://arxiv.org/abs/2112.01530
Code: https://github.com/lukasHoel/stylemesh
Demo: https://www.youtube.com/watch?v=ZqgiTLcNcks

Watermarking

Deep 3D-to-2D Watermarking: Embedding Messages in 3D Meshes and Extracting Them from 2D Renderings

Paper: https://arxiv.org/abs/2104.13450
Code: None

Datasets

It's About Time: Analog Clock Reading in the Wild

Homepage: https://charigyang.github.io/abouttime/
Paper: https://arxiv.org/abs/2111.09162
Code: https://github.com/charigyang/itsabouttime
Demo: https://youtu.be/cbiMACA6dRc

Toward Practical Self-Supervised Monocular Indoor Depth Estimation

Paper: https://arxiv.org/abs/2112.02306
Code: None

Kubric: A scalable dataset generator

Paper: https://arxiv.org/abs/2203.03570
Code: https://github.com/google-research/kubric

New Tasks

Language-based Video Editing via Multi-Modal Multi-Level Transformer

Paper: https://arxiv.org/abs/2104.01122
Code: None

It's About Time: Analog Clock Reading in the Wild

Homepage: https://charigyang.github.io/abouttime/
Paper: https://arxiv.org/abs/2111.09162
Code: https://github.com/charigyang/itsabouttime
Demo: https://youtu.be/cbiMACA6dRc

Others

Kubric: A scalable dataset generator

Paper: https://arxiv.org/abs/2203.03570
Code: https://github.com/google-research/kubric


BloodBlossom/CVPR2022-Papers-with-Code
