Adding adversarial loss and perceptual loss (VGGFace) to deepfakes' (reddit user) auto-encoder architecture.
| Date | Update |
|---|---|
| 2018-08-27 | Colab support: A Colab notebook for faceswap-GAN v2.2 is provided. |
| 2018-07-25 | Data preparation: Added a new notebook for video pre-processing in which MTCNN is used for face detection as well as face alignment. |
| 2018-06-29 | Model architecture: faceswap-GAN v2.2 now supports different output resolutions: 64x64, 128x128, and 256x256. The default `RESOLUTION = 64` can be changed in the config cell of the v2.2 notebook. |
| 2018-06-25 | New version: faceswap-GAN v2.2 has been released. The main improvements of the v2.2 model are its capability of generating realistic and consistent eye movements (results are shown below, or Ctrl+F for "eyes"), as well as higher video quality with face alignment. |
| 2018-06-06 | Model architecture: Added the self-attention mechanism proposed in SAGAN to the v2 GAN model. (Note: there is still no official code release for SAGAN, so the implementation in this repo could be wrong. We'll keep an eye on it.) |
Here is a playground notebook for faceswap-GAN v2.2 on Google Colab. Users can train their own model in the browser.
- `FaceSwap_GAN_v2.2_train_test.ipynb`
  - Notebook for model training of faceswap-GAN model version 2.2.
  - This notebook also provides code for still-image transformation at the bottom.
  - Requires additional training images generated through `prep_binary_masks.ipynb`.
- `FaceSwap_GAN_v2.2_video_conversion.ipynb`
  - Notebook for video conversion of faceswap-GAN model version 2.2.
  - Face alignment using 5-point landmarks is introduced to video conversion (a rough sketch of the idea is shown below).
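
For intuition, the alignment step can be pictured as estimating a similarity transform from the five detected landmarks (eyes, nose tip, mouth corners) to a fixed reference template and warping the face crop with it. The sketch below shows that idea with OpenCV; the reference coordinates and the `align_face` helper are illustrative, not the notebook's actual code.

```python
import cv2
import numpy as np

# Illustrative reference positions (x, y) of the 5 landmarks
# (two eyes, nose tip, two mouth corners) in a 112x112 template.
# These are typical values, not the ones used by the notebook.
REFERENCE_5PTS = np.float32([
    [38.3, 51.7], [73.5, 51.5], [56.0, 71.7], [41.5, 92.4], [70.7, 92.2],
])

def align_face(image, landmarks_5pts, output_size=112):
    """Warp `image` so its 5 detected landmarks match the reference template."""
    src = np.float32(landmarks_5pts)  # 5x2 detected landmark coordinates
    # Estimate a similarity transform (rotation + uniform scale + translation).
    matrix, _ = cv2.estimateAffinePartial2D(src, REFERENCE_5PTS, method=cv2.LMEDS)
    return cv2.warpAffine(image, matrix, (output_size, output_size))
```

Warping every face to the same template keeps eye and mouth positions consistent across frames, which is what the higher video quality of v2.2 builds on.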
- `prep_binary_masks.ipynb`
  - Notebook for training data preprocessing. Output binary masks are saved in the `./binary_masks/faceA_eyes` and `./binary_masks/faceB_eyes` folders.
  - Requires the `face_alignment` package. (An alternative method for generating binary masks, which does not require the `face_alignment` and `dlib` packages, can be found in `MTCNN_video_face_detection_alignment.ipynb`.) A sketch of the masking idea is shown below.
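
Conceptually, these masks simply mark the eye regions so training can treat them specially. A minimal sketch of that idea, assuming a `face_alignment` version where the enum is `LandmarksType._2D` and the 68-point landmark scheme; the `eye_mask` helper, the index ranges, and the absence of any dilation/blurring are simplifications for illustration, not the notebook's exact code:

```python
import cv2
import face_alignment  # https://github.com/1adrianb/face-alignment
import numpy as np

# Landmark index ranges of the two eyes in the 68-point (iBUG) annotation scheme.
EYE_REGIONS = (slice(36, 42), slice(42, 48))

fa = face_alignment.FaceAlignment(face_alignment.LandmarksType._2D, device='cpu')

def eye_mask(face_image):
    """Return a binary mask that is white over both eye regions of one face image."""
    mask = np.zeros(face_image.shape[:2], dtype=np.uint8)
    landmarks = fa.get_landmarks(face_image)
    if not landmarks:                       # no face detected
        return mask
    points = landmarks[0].astype(np.int32)  # use the first detected face
    for region in EYE_REGIONS:
        cv2.fillConvexPoly(mask, cv2.convexHull(points[region]), 255)
    return mask
```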
- `MTCNN_video_face_detection_alignment.ipynb`
  - This notebook performs face detection/alignment on the input video (sketched below).
  - Detected faces are saved in `./faces/raw_faces` and `./faces/aligned_faces` for non-aligned and aligned results respectively.
  - Crude eye binary masks are also generated and saved in `./faces/binary_masks_eyes`. These binary masks can serve as a suboptimal alternative to the masks generated through `prep_binary_masks.ipynb`.
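
The notebook bundles its own MTCNN implementation (with weights from FaceNet, see the acknowledgments below), so the following is only a rough sketch of the same flow using the standalone `mtcnn` package together with `moviepy`; the file names and frame-sampling rate are made up for illustration.

```python
import os

import cv2
from moviepy.editor import VideoFileClip
from mtcnn import MTCNN  # stand-in for the notebook's bundled MTCNN

detector = MTCNN()
os.makedirs("./faces/raw_faces", exist_ok=True)

def extract_faces(video_path, every_nth_frame=5):
    """Detect faces on sampled frames and save the raw (unaligned) crops."""
    clip = VideoFileClip(video_path)
    for i, frame in enumerate(clip.iter_frames()):  # frames are RGB arrays
        if i % every_nth_frame:
            continue
        for j, face in enumerate(detector.detect_faces(frame)):
            x, y, w, h = face["box"]
            crop = frame[max(y, 0):y + h, max(x, 0):x + w]
            cv2.imwrite(f"./faces/raw_faces/frame{i}_face{j}.jpg",
                        cv2.cvtColor(crop, cv2.COLOR_RGB2BGR))

extract_faces("source_video.mp4")  # hypothetical input file
```

On top of this, the notebook aligns the crops (presumably using the five facial keypoints MTCNN also returns) and writes the crude eye masks, which is what populates `./faces/aligned_faces` and `./faces/binary_masks_eyes`.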
Usage
1. Run `MTCNN_video_face_detection_alignment.ipynb` to extract faces from videos. Manually move/rename the aligned face images into the `./faceA/` or `./faceB/` folder.
2. Run `prep_binary_masks.ipynb` to generate binary masks of the training images.
   - You can skip this pre-processing step by (1) setting `use_bm_eyes=False` in the config cell of the train_test notebook (an illustrative config snippet follows these steps), or (2) using the low-quality binary masks generated in step 1.
3. Run `FaceSwap_GAN_v2.2_train_test.ipynb` to train models.
4. Run `FaceSwap_GAN_v2.2_video_conversion.ipynb` to create videos using the trained models from step 3.
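
For orientation only, the two options referenced above live in the config cell of the train_test notebook. The folder paths come from this README, but the variable names below are placeholders; check the notebook's config cell for the authoritative list.

```python
# Config cell sketch (illustrative; see the notebook for the full set of options)
RESOLUTION = 64        # output resolution: 64, 128, or 256 (v2.2)
use_bm_eyes = True     # set to False to skip the eye binary masks from step 2

# Data layout the notebooks expect (variable names here are only placeholders)
face_A_dir = './faceA'
face_B_dir = './faceB'
bm_eyes_A_dir = './binary_masks/faceA_eyes'
bm_eyes_B_dir = './binary_masks/faceB_eyes'
```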
- `faceswap-GAN_colab_demo.ipynb`
  - An all-in-one notebook for demonstration purposes that can be run on Google Colab.
  - Face images are supposed to be in the `./faceA/` or `./faceB/` folder for each target respectively.
  - Images will be resized to 256x256 during training.
- **Improved output quality:** Adversarial loss improves the reconstruction quality of generated images.
- **Additional results:** This image shows 160 random results generated by the v2 GAN with self-attention mechanism (image format: source -> mask -> transformed).
- **Evaluations:** Evaluations of the output quality on the Trump/Cage dataset can be found here. The Trump/Cage images are obtained from the reddit user deepfakes' project on pastebin.com.
- **VGGFace perceptual loss:** Perceptual loss makes the direction of the eyeballs more realistic and consistent with the input face. It also smoothes out artifacts in the segmentation mask, resulting in higher output quality.
- **Attention mask:** The model predicts an attention mask that helps with handling occlusion, eliminating artifacts, and producing a natural skin tone.
- **Configurable input/output resolution (v2.2):** The model supports 64x64, 128x128, and 256x256 output resolutions.
- **Face tracking/alignment using MTCNN and Kalman filter in video conversion:** The Kalman filter smoothes the bounding-box positions over frames and reduces jitter on the swapped face (see the smoothing sketch after this list).
- **Eyes-aware training:** Introduces high reconstruction loss and edge loss in the eye region, which guides the model to generate realistic eyes.
- At a very high and abstract level (but not exactly the same), the training procedure follows the flowchart of a denoising autoencoder; a rough sketch of the objective functions is given after this list.
- The model performs at its full potential when the input images are preprocessed with face alignment methods.
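
To make the loss terms above concrete, here is a rough, non-authoritative sketch of how they could be wired together in Keras with `keras-vggface`. The chosen VGGFace layer, the loss weights, the least-squares form of the adversarial term, and the helper names are assumptions for illustration; the actual terms and weights are defined in the notebooks.

```python
import keras.backend as K
from keras.models import Model
from keras_vggface.vggface import VGGFace

# Frozen VGGFace network used only as a fixed feature extractor (perceptual loss).
vggface = VGGFace(include_top=False, input_shape=(224, 224, 3))
vggface.trainable = False
feat_extractor = Model(vggface.input,
                       vggface.get_layer('conv4_3').output)  # illustrative layer choice

def perceptual_loss(y_true, y_pred):
    """L2 distance in VGGFace feature space (inputs assumed resized to 224x224)."""
    return K.mean(K.square(feat_extractor(y_true) - feat_extractor(y_pred)))

def eyes_aware_l1_loss(y_true, y_pred, eye_mask, eye_weight=30.0):
    """L1 reconstruction loss, up-weighted inside the eye binary mask."""
    weights = 1.0 + eye_weight * eye_mask
    return K.mean(weights * K.abs(y_true - y_pred))

def adversarial_loss_G(fake_score):
    """Generator side of a least-squares GAN loss: push D's score on fakes toward 1."""
    return K.mean(K.square(fake_score - 1.0))

def merge_with_attention(alpha, rgb, input_face):
    """Attention mask: keep generated pixels where alpha is high, input pixels elsewhere."""
    return alpha * rgb + (1.0 - alpha) * input_face

# Illustrative total generator objective (weights are placeholders, not the repo defaults):
# total_loss = 1.0 * eyes_aware_l1_loss + 0.1 * perceptual_loss + 0.1 * adversarial_loss_G (+ edge loss)
```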
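And a small OpenCV sketch of the Kalman-filter idea used during video conversion: the detected bounding-box corners are filtered over time so the crop window, and therefore the swapped face, does not jump frame to frame. The constant-velocity state layout and the noise values are illustrative.

```python
import cv2
import numpy as np

# Constant-velocity Kalman filter over the box corners (x0, y0, x1, y1):
# 8 state variables (4 positions + 4 velocities), 4 measurements.
kf = cv2.KalmanFilter(8, 4)
kf.transitionMatrix = np.eye(8, dtype=np.float32)
kf.transitionMatrix[:4, 4:] = np.eye(4, dtype=np.float32)   # position += velocity
kf.measurementMatrix = np.eye(4, 8, dtype=np.float32)
kf.processNoiseCov = np.eye(8, dtype=np.float32) * 1e-3
kf.measurementNoiseCov = np.eye(4, dtype=np.float32) * 1e-1

def smooth_box(detected_box):
    """Feed one detected box (x0, y0, x1, y1); return the temporally smoothed estimate."""
    kf.predict()
    estimate = kf.correct(np.float32(detected_box).reshape(4, 1))
    return estimate[:4].ravel()
```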
- keras 2.1.5
- Tensorflow 1.6.0
- Python 3.6.4
- OpenCV
- keras-vggface
- moviepy
- prefetch_generator (required for v2.2 model)
- face-alignment (required as preprocessing for v2.2 model)
Code borrows from tjwei, eriklindernoren, fchollet, keras-contrib and reddit user deepfakes' project. The generative network is adopted from CycleGAN. Weights and scripts of MTCNN are from FaceNet. Illustrations are from irasutoya.