diff --git a/README.md b/README.md index d738a80..a7389f4 100644 --- a/README.md +++ b/README.md @@ -2,6 +2,7 @@ This is a Keras implementation of a CNN for estimating age and gender from a face image [1, 2]. In training, [the IMDB-WIKI dataset](https://data.vision.ee.ethz.ch/cvl/rrothe/imdb-wiki/) is used. +- [Jul. 5, 2018] The UTKFace dataset became available for training. - [Apr. 10, 2018] Evaluation result on the APPA-REAL dataset was added. ## Dependencies @@ -28,18 +29,18 @@ python3 demo.py The pretrained model for TensorFlow backend will be automatically downloaded to the `pretrained_models` directory. -### Train a model using the IMDB-WIKI dataset - -#### Download the dataset -The dataset is downloaded and extracted to the `data` directory. +### Create training data from the IMDB-WIKI dataset +First, download the dataset. +The dataset is downloaded and extracted to the `data` directory by: ```sh ./download.sh ``` -#### Create training data -Filter out noise data and serialize images and labels for training into `.mat` file. +Secondly, filter out noise data and serialize images and labels for training into `.mat` file. Please check [check_dataset.ipynb](check_dataset.ipynb) for the details of the dataset. +The training data is created by: + ```sh python3 create_db.py --output data/imdb_db.mat --db imdb --img_size 64 ``` @@ -57,7 +58,28 @@ optional arguments: --min_score MIN_SCORE minimum face_score (default: 1.0) ``` -#### Train network +### Create training data from the UTKFace dataset +Firstly, download images from [the website of the UTKFace dataset](https://susanqq.github.io/UTKFace/). +`UTKFace.tar.gz` can be downloaded from `Aligned&Cropped Faces` in Datasets section. +Then, extract the archive. + +```sh +tar zxf UTKFace.tar.gz UTKFace +``` + +Finally, run the following script to create the training data: + +``` +python3 create_db_utkface.py -i UTKFace -o UTKFace.mat +``` + +[NOTE]: Because the face images in the UTKFace dataset is tightly cropped (there is no margin around the face region), +faces should be cropped in `demo.py`. +As tight cropping is currently not supported, please modify the code. + + + +### Train network Train the network using the training data created above. ```sh @@ -89,7 +111,7 @@ optional arguments: --aug use data augmentation if set true (default: False) ``` -#### Train network with recent data augmentation methods +### Train network with recent data augmentation methods Recent data augmentation methods, mixup [3] and Random Erasing [4], can be used with standard data augmentation by `--aug` option in training: @@ -103,7 +125,7 @@ I confirmed that data augmentation enables us to avoid overfitting and improves validation loss. -#### Use the trained network +### Use the trained network ```sh python3 demo.py @@ -125,18 +147,18 @@ optional arguments: Please use the best model among `checkpoints/weights.*.hdf5` for `WEIGHT_FILE` if you use your own trained models. -#### Plot training curves from history file +### Plot training curves from history file ```sh python3 plot_history.py --input models/history_16_8.h5 ``` -##### Results without data augmentation +#### Results without data augmentation -##### Results with data augmentation +#### Results with data augmentation The best val_loss was improved from 3.969 to 3.731: - Without data augmentation: 3.969 - With standard data augmentation: 3.799 @@ -147,14 +169,14 @@ The best val_loss was improved from 3.969 to 3.731: We can see that, with data augmentation, overfitting did not occur even at very small learning rates (epoch > 15). -#### Network architecture +### Network architecture In [the original paper](https://www.vision.ee.ethz.ch/en/publications/papers/articles/eth_biwi_01299.pdf) [1, 2], the pretrained VGG network is adopted. Here the Wide Residual Network (WideResNet) is trained from scratch. I modified the @asmith26's implementation of the WideResNet; two classification layers (for age and gender estimation) are added on the top of the WideResNet. Note that while age and gender are independently estimated by different two CNNs in [1, 2], in my implementation, they are simultaneously estimated using a single CNN. -#### Estimated results +### Estimated results Trained on imdb, tested on wiki. ![](https://github.com/yu4u/age-gender-estimation/wiki/images/result.png) diff --git a/create_db_utkface.py b/create_db_utkface.py new file mode 100644 index 0000000..97de8d1 --- /dev/null +++ b/create_db_utkface.py @@ -0,0 +1,46 @@ +import argparse +from pathlib import Path +from tqdm import tqdm +import numpy as np +import scipy.io +import cv2 + + +def get_args(): + parser = argparse.ArgumentParser(description="This script creates database for training from the UTKFace dataset.", + formatter_class=argparse.ArgumentDefaultsHelpFormatter) + parser.add_argument("--input", "-i", type=str, required=True, + help="path to the UTKFace image directory") + parser.add_argument("--output", "-o", type=str, required=True, + help="path to output database mat file") + parser.add_argument("--img_size", type=int, default=64, + help="output image size") + args = parser.parse_args() + return args + + +def main(): + args = get_args() + image_dir = Path(args.input) + output_path = args.output + img_size = args.img_size + + out_genders = [] + out_ages = [] + out_imgs = [] + + for i, image_path in enumerate(tqdm(image_dir.glob("*.jpg"))): + image_name = image_path.name # [age]_[gender]_[race]_[date&time].jpg + age, gender = image_name.split("_")[:2] + out_genders.append(int(gender)) + out_ages.append(min(int(age), 100)) + img = cv2.imread(str(image_path)) + out_imgs.append(cv2.resize(img, (img_size, img_size))) + + output = {"image": np.array(out_imgs), "gender": np.array(out_genders), "age": np.array(out_ages), + "db": "utk", "img_size": img_size, "min_score": -1} + scipy.io.savemat(output_path, output) + + +if __name__ == '__main__': + main()