A curated list of papers 🎈
China Computer Federation (CCF) Recommended List of International Academic Conferences and Journals, 2019 [PDF]
3D based approaches
- Real-time facial animation with image-based dynamic avatars. ACM Transactions on Graphics, 2016
- paGAN: real-time avatars using dynamic textures. SIGGRAPH Asia, 2018. (generates key facial-expression textures that can be deformed and blended in real time)
- Neural Voice Puppetry: Audio-driven Facial Reenactment. [[PDF]](Neural Voice Puppetry Audio-driven Facial Reenactment.pdf) (edits the expression basis of a 3DMM)
- Audio-Driven Facial Animation by Joint End-to-End Learning of Pose and Emotion. [PDF]
- Talking-head Generation with Rhythmic Head Motion. [PDF]
2D landmark based approaches
- Hierarchical Cross-Modal Talking Face Generation with Dynamic Pixel-Wise Loss. [PDF] CVPR, 2019.
- MEAD: A Large-scale Audio-visual Dataset for Emotional Talking-face Generation. [PDF][Data] ECCV, 2020.
- Talking Face Generation with Expression-Tailored Generative Adversarial Network. [PDF] ACM Multimedia, 2020. (multimodal synthesis combining emotion, identity, and speech)
- Low Bandwidth video-chat compression using deep generative models. [PDF] arXiv, 2020.
- Fast face-swap using convolutional neural networks. ICCV, 2017.
- X2face: A network for controlling face generation using images, audio, and pose codes. [PDF] ECCV, 2018.
- FSGAN: Subject agnostic face swapping and reenactment. [PDF] ICCV, 2019.
- ObamaNet: Photo-realistic lip-sync from text. [PDF] arXiv, 2018.
- Speech-driven Facial Animation using Cascaded GANs for Learning of Motion and Texture. [PDF]
Optical-flow based approaches
- One-Shot Free-View Neural Talking-Head Synthesis for Video Conferencing. [[PDF]](./paper/One-Shot Free-View Neural Talking-Head Synthesis for Video Conferencing.pdf) arXiv, 2020.
Vid2vid approaches
- Few-shot video-to-video synthesis. NeurIPS, 2019.
- Video-to-video synthesis. NeurIPS, 2018.
- A Lip Sync Expert Is All You Need for Speech to Lip Generation In The Wild. [PDF] ACM Multimedia, 2020. (builds on SyncNet and LipGAN: crops the face from the source video, transforms it, and pastes it back; used for visual translation)
- Towards Automatic Face-to-Face Translation. [PDF] ACM Multimedia, 2020. (LipGAN; face-to-face translation)
- Realistic speech-driven facial animation with GANs. [PDF] IJCV, 2019.
- CONFIG: Controllable Neural Face Image Generation. [PDF] (AdaIN)
Image based approaches
- Speech Driven Talking Face Generation from a Single Image and an Emotion Condition. [[PDF]](./paper/Speech Driven Talking Face Generation from a Single Image and an Emotion Condition.pdf) arXiv, 2020.
Disentanglement based approaches
- Talking Face Generation by Adversarially Disentangled Audio-Visual Representation. [[PDF]](./paper/Talking Face Generation by Adversarially Disentangled Audio-Visual Representation.pdf) CVPR, 2019.
- Animating Face using Disentangled Audio Representations. WACV, 2020. (disentangles emotion from content)
- APB2FaceV2: Real-time audio-guided multi-face reenactment. [PDF]
- Style Transfer for Co-Speech Gesture Animation: A Multi-Speaker Conditional-Mixture Approach. [PDF]
- (SyncNet) Out of Time: Automated Lip Sync in the Wild. [PDF] ACCV, 2016.
- Lifting 2D StyleGAN for 3D-aware face generation. [PDF] arXiv, 2020.
- LandmarkGAN: Synthesizing Faces from Landmarks. [PDF] arXiv, 2020.
- Fast bi-layer neural synthesis of one-shot realistic head avatars. ECCV, 2020.
- Few-shot adversarial learning of realistic neural talking head models. ICCV, 2019.
- Semantic image synthesis with spatially-adaptive normalization. CVPR, 2019.
- First order motion model for image animation. NeurIPS, 2019. (face reenactment)
Emotional expressions reconsidered: Challenges to inferring emotion from human facial movements. [Corrigendum] (extensive survey, 279 pages)
Emotion animation
- GANimation: Anatomically-aware Facial Animation from a Single Image. [PDF] CVPR, 2018. (Generates attention mask and color mask by a conditional GAN)
- Learning to Generate Customized Dynamic 3D Facial Expressions. [PDF]
- Controllable image-to-video translation: A case study on facial expression generation. AAAI, 2019.
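The composition step GANimation describes above (a conditional GAN producing an attention mask and a color mask) can be sketched as follows. This is a minimal NumPy illustration of the blending equation only, assuming (H, W, 3) images and an attention map in [0, 1]; in the paper both masks are network outputs, not plain arrays.

```python
import numpy as np

def attention_blend(original, color_mask, attention):
    """Compose a final frame from an attention mask and a color mask.

    attention[i, j] near 1 keeps the original pixel; near 0 takes the
    generated color-mask pixel. In GANimation both masks come from the
    conditional generator; here they are placeholder arrays.
    """
    a = attention[..., None]  # (H, W) -> (H, W, 1) so it broadcasts over RGB
    return a * original + (1.0 - a) * color_mask
```

The attention mask lets the generator focus its capacity on the regions that actually move (mouth, eyebrows) while copying static pixels straight from the input.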
Emotion recognition
- Audio emotion recognition
- f-Similarity Preservation Loss for Soft Labels: A Demonstration on Cross-Corpus Speech Emotion Recognition. [[PDF]](./paper/f-Similarity Preservation Loss for Soft Labels A Demonstration on Cross-Corpus Speech Emotion Recognition.pdf)
- Towards Discriminative Representation Learning for Speech Emotion Recognition. [PDF]
- Speech Emotion Recognition using Convolutional and Recurrent Neural Networks. [PDF]
- Acoustic Emotion Recognition: A Benchmark Comparison of Performances. [PDF]
- Visual emotion recognition
- Cross-modality emotion recognition
- M3ER: Multiplicative Multimodal Emotion Recognition using Facial, Textual, and Speech Cues. [PDF] AAAI, 2020. (recognizes a video's emotion by extracting information from all three modalities)
(Pix2PixGAN) Image-to-Image Translation with Conditional Adversarial Networks. [PDF] [Github] CVPR, 2017.
(CycleGAN) Unpaired Image-to-Image Translation using Cycle-Consistent Adversarial Networks. [PDF] [Github] ICCV, 2017.
StyleGAN and its derivatives
- (StyleGAN) A Style-Based Generator Architecture for Generative Adversarial Networks. [Which face is real?][PDF][GitHub][Blog (Chinese)][YouTube][Bilibili][PyTorch] CVPR, 2019. Nvidia.
- (StyleGAN2) Analyzing and Improving the Image Quality of StyleGAN. [PDF] [GitHub][Translation (Chinese)] CVPR, 2020. Nvidia.
- GAN-Control: Explicitly Controllable GANs. [PDF] arXiv, 2021. Amazon.
- (InterFaceGAN) Interpreting the Latent Space of GANs for Semantic Face Editing. [PDF] [Github] CVPR, 2020.
(BEGAN) BEGAN: Boundary Equilibrium Generative Adversarial Networks. [PDF]
- (3DMM)
- (3DDFA-V2) Towards Fast, Accurate and Stable 3D Dense Face Alignment. [PDF]
- (3DDFA) Face Alignment in Full Pose Range: A 3D Total Solution. [PDF]
Towards Real-World Blind Face Restoration with Generative Facial Prior. [PDF] (blind face restoration; uses a pretrained GAN as a prior)
- The Creation and Detection of Deepfakes: A Survey. [PDF]
- 3D Morphable Face Models—Past, Present, and Future
- Transformers in Vision: A Survey
- A Survey on Visual Transformer. [[PDF]](./paper/A Survey on Visual Transformer.pdf)
- GAN Inversion: A Survey. [PDF]
- VinVL: Making Visual Representations Matter in Vision-Language Models.
- Anomaly Detection in Video Sequence with Appearance-Motion Correspondence. [PDF] [Github] ICCV, 2019. (uses a gradient loss: an L2 reconstruction loss blurs edges, whereas an image-gradient loss sharpens details; the gradient loss is implemented at line 224 of GAN_tf.py. Also uses optical flow for motion prediction)
- Arbitrary style transfer in real-time with adaptive instance normalization. [PDF] ICCV, 2017. (AdaIN)
- Understanding Ranking Loss, Contrastive Loss, Margin Loss, Triplet Loss, Hinge Loss and all those confusing names
- LaTeX notation for common mathematical symbols
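The image-gradient loss noted for the anomaly-detection entry above (sharpening details that a plain L2 reconstruction loss would blur) can be sketched as a penalty on finite differences. A minimal NumPy sketch for grayscale images; the paper's actual TensorFlow version lives in GAN_tf.py and may differ in detail.

```python
import numpy as np

def gradient_loss(pred, target):
    """L1 distance between the spatial gradients of two (H, W) images.

    Matching gradients rather than raw pixels penalizes missing or
    smeared edges directly, which is why it complements an L2
    reconstruction loss.
    """
    # Horizontal / vertical finite differences of each image.
    dx_p, dy_p = np.diff(pred, axis=1), np.diff(pred, axis=0)
    dx_t, dy_t = np.diff(target, axis=1), np.diff(target, axis=0)
    return np.mean(np.abs(dx_p - dx_t)) + np.mean(np.abs(dy_p - dy_t))
```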
Jupyter GitHub nbviewer: https://nbviewer.jupyter.org
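AdaIN, which several entries above rely on (CONFIG and the real-time style-transfer paper), boils down to re-normalizing content features with style statistics. A minimal NumPy sketch, assuming (C, H, W) feature maps; real implementations operate on batched network activations.

```python
import numpy as np

def adain(content, style, eps=1e-5):
    """Adaptive Instance Normalization on (C, H, W) feature maps.

    Each content channel is normalized to zero mean / unit std, then
    rescaled and shifted with the per-channel mean and std of the
    style features, transferring the style's feature statistics.
    """
    c_mean = content.mean(axis=(1, 2), keepdims=True)
    c_std = content.std(axis=(1, 2), keepdims=True)
    s_mean = style.mean(axis=(1, 2), keepdims=True)
    s_std = style.std(axis=(1, 2), keepdims=True)
    return s_std * (content - c_mean) / (c_std + eps) + s_mean
```

After the transform, the output's per-channel mean and std match those of the style input, which is the whole trick.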