initial commit
yxgeee committed Oct 9, 2018
0 parents commit a1ca28a
Showing 37 changed files with 2,654 additions and 0 deletions.
15 changes: 15 additions & 0 deletions .gitignore
@@ -0,0 +1,15 @@
.DS_Store
datasets/
checkpoints/
scripts/
torch.egg-info/
*/**/__pycache__
*/*.pyc
*/**/*.pyc
*/**/**/*.pyc
*/**/**/**/*.pyc
*/**/**/**/**/*.pyc
*/*.so*
*/**/*.so*
*/**/*.dylib*
*~
120 changes: 120 additions & 0 deletions README.md
@@ -0,0 +1,120 @@
![Python 3](https://img.shields.io/badge/python-3-green.svg) ![Pytorch 0.3](https://img.shields.io/badge/pytorch-0.3-blue.svg)
# FD-GAN: Pose-guided Feature Distilling GAN for Robust Person Re-identification

<p align="center"><img src='framework.jpg' width="600px"></p>

[[Paper]](https://arxiv.org/abs/1810.02936)

[Yixiao Ge](mailto:[email protected])\*, [Zhuowan Li](mailto:[email protected])\*, [Haiyu Zhao](mailto:[email protected]), [Guojun Yin](mailto:[email protected]), [Shuai Yi](mailto:[email protected]), [Xiaogang Wang](mailto:[email protected]), and [Hongsheng Li](mailto:[email protected])
Neural Information Processing Systems (**NIPS**), 2018 (* equal contribution)

PyTorch implementation of our NIPS 2018 work. With the proposed Siamese structure, we are able to learn **identity-related** and **pose-unrelated** representations.

## Prerequisites
- Python 3
- [PyTorch](https://pytorch.org/) (we run the code under version 0.3.1; lower versions may also work)

## Getting Started

### Installation
- Install dependencies (e.g., [visdom](https://github.com/facebookresearch/visdom) and [dominate](https://github.com/Knio/dominate)). You can install all of them with:
```
pip install scipy pillow torchvision sklearn h5py dominate visdom
```
- Clone this repo:
```
git clone https://github.com/yxgeee/FD-GAN
cd FD-GAN/
```

### Datasets
We conduct experiments on the [Market1501](https://drive.google.com/open?id=1LS5_bMqv-37F14FVuziK63gz0wPyb0Hh), [DukeMTMC](https://drive.google.com/open?id=1Ujtm-Cq7lpyslBkG-rSBjkP1KVntrgSL), and [CUHK03](https://drive.google.com/open?id=1R7oCwyMHYIxpRVsYm7-2REmFopP9TSXL) datasets. Training requires pose landmarks for each dataset, so we generate the pose files with [Realtime Multi-Person Pose Estimation](https://github.com/tensorboy/pytorch_Realtime_Multi-Person_Pose_Estimation). The raw datasets have been preprocessed with the code in [open-reid](https://github.com/Cysu/open-reid).
Download the prepared datasets by following the steps below:
- Create directories for datasets:
```
mkdir datasets
cd datasets/
```
- Download the datasets through the links above, and `unzip` them under this root path, one subdirectory per dataset, as shown below.
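
After unzipping, the directory layout should look roughly as follows (inferred from `baseline.py`, whose default `--data-dir` is `./datasets` and which resolves each dataset under its `-d` name):
```
FD-GAN/
└── datasets/
    ├── market1501/
    ├── dukemtmc/
    └── cuhk03/
```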

## Usage
As described in the original [paper](https://arxiv.org/abs/1810.02936), training our proposed framework consists of three stages.

### Stage I: reID baseline pretraining
We use a Siamese baseline structure based on `ResNet-50`. You can train the model with the following command:
```
python baseline.py -b 256 -j 4 -d market1501 -a resnet50 --combine-trainval \
--lr 0.01 --epochs 100 --step-size 40 --eval-step 5 \
--logs-dir /path/to/save/checkpoints/
```
You can train on specific GPUs by setting `CUDA_VISIBLE_DEVICES` (e.g., `CUDA_VISIBLE_DEVICES=0,1 python baseline.py ...`), and change the dataset name after `-d` (`[market1501|dukemtmc|cuhk03]`) to train models on different datasets.
Alternatively, you can download a pretrained baseline model directly from the links below:
- [Market1501_baseline_model](https://drive.google.com/open?id=1oNLf-gazgfN0EqkdIOKtcJSBx22BuO1-)
- [DukeMTMC_baseline_model](https://drive.google.com/open?id=1iVXIaXT6WQzKuLD3eDcBZB-3aNeZ6Ivf)
- [CUHK03_baseline_model](https://drive.google.com/open?id=1jubhvKl_Ny9b89wbX0-u2GhPEeXMLaUQ)

<a name="stageI"></a>And **test** them with the following command:
```
python baseline.py -b 256 -d market1501 -a resnet50 --evaluate --resume /path/of/model_best.pth.tar
```

### Stage II: FD-GAN pretraining
We first need to pretrain FD-GAN with the image encoder (*E* in the original paper, *net_E* in the code) kept fixed. You can train the model with the following command:
```
python train.py --display-port 6006 --display-id 1 \
--stage 1 -d market1501 --name /directory/name/of/saving/checkpoints/ \
--pose-aug gauss -b 256 -j 4 --niter 50 --niter-decay 50 --lr 0.001 --save-step 10 \
--lambda-recon 100.0 --lambda-veri 0.0 --lambda-sp 10.0 --smooth-label \
--netE-pretrain /path/of/model_best.pth.tar
```
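The `--lambda-*` flags weight the individual loss terms. Assuming they map one-to-one onto the losses in the paper, the generator objective has roughly the form
```
L_total = L_adv + lambda_recon * L_recon + lambda_veri * L_veri + lambda_sp * L_sp
```
where `L_recon` is the image reconstruction loss, `L_veri` the identity verification loss (disabled at this stage via `--lambda-veri 0.0`), and `L_sp` the same-pose loss.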
You can train on specific GPUs by setting `CUDA_VISIBLE_DEVICES`. The main arguments are:
- `--display-port`: display port of [visdom](https://github.com/facebookresearch/visdom), e.g., you can view the results at `localhost:6006`.
- `--display-id`: set `0` to disable [visdom](https://github.com/facebookresearch/visdom).
- `--stage`: set `1` for Stage II, and `2` for Stage III.
- `--pose-aug`: choose from `[no|erase|gauss]` to augment the pose maps (see the sketch after this list).
- `--smooth-label`: whether to smooth the target labels of the GAN loss.
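
For intuition, here is a minimal sketch of what the `erase` and `gauss` pose-map augmentations might look like. This is an illustration only: `augment_pose_map`, `sigma`, and `erase_prob` are hypothetical names, and the actual implementation lives in the project's data pipeline.
```python
import numpy as np

def augment_pose_map(pose_map, mode='gauss', sigma=0.1, erase_prob=0.5):
    # Hypothetical sketch of the `--pose-aug` choices; names and values here
    # are assumptions, not the project's actual code.
    # `pose_map` is a float array of landmark heatmaps, shape (channels, H, W).
    pose_map = pose_map.copy()
    if mode == 'gauss':
        # Perturb the landmark heatmaps with additive Gaussian noise.
        pose_map += np.random.normal(0.0, sigma, size=pose_map.shape)
    elif mode == 'erase':
        # Randomly zero out individual landmark channels.
        for c in range(pose_map.shape[0]):
            if np.random.rand() < erase_prob:
                pose_map[c] = 0.0
    return pose_map
```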

Other arguments can be viewed in [options.py](https://github.com/yxgeee/FD-GAN/blob/master/fdgan/options.py).
Alternatively, you can directly download the Stage II models:
- [Market1501_stageII_model](https://drive.google.com/open?id=1kIBuPzz-Ig70dE3rU-5-kyo3nGJP01NS)
- [DukeMTMC_stageII_model](https://drive.google.com/open?id=1dD1cbg2jo5qhPbkMbsRYACRcVMrm28-o)
- [CUHK03_stageII_model](https://drive.google.com/open?id=1552oDot-vgA27b-mCspJAuzaOl685koz)

There are four checkpoints in each directory, one per network (*net_E*, *net_G*, *net_Di*, *net_Dp*).

### Stage III: Global finetuning
Finetune the whole framework by optimizing all of its parts jointly. You can train the model with the following command:
```
python train.py --display-port 6006 --display-id 1 \
--stage 2 -d market1501 --name /directory/name/of/saving/checkpoints/ \
--pose-aug gauss -b 256 -j 4 --niter 25 --niter-decay 25 --lr 0.0001 --save-step 10 --eval-step 5 \
--lambda-recon 100.0 --lambda-veri 10.0 --lambda-sp 10.0 --smooth-label \
--netE-pretrain /path/of/100_net_E.pth --netG-pretrain /path/of/100_net_G.pth \
--netDi-pretrain /path/of/100_net_Di.pth --netDp-pretrain /path/of/100_net_Dp.pth
```
You can train on specific GPUs by setting `CUDA_VISIBLE_DEVICES`.
We trained this model with a batch size of 256. If your hardware cannot accommodate that, you may decrease the batch size (performance may drop accordingly).
Alternatively, you can directly download our final models:
- [Market1501_stageIII_model](https://drive.google.com/open?id=1w8xqopW0icA3VIxZyelI9k-Fb8rRCME7)
- [DukeMTMC_stageIII_model](https://drive.google.com/open?id=1axBHUcI7JmPbw8Y_mSpMKWIY9FUfFKMI)
- [CUHK03_stageIII_model](https://drive.google.com/open?id=1q6HkDlDUIV9YNUwAggy-HI9zYQjt7Ihk)

Then **test** `best_net_E.pth` in the same way as described in [Stage I](#stageI).

## TODO
- Scripts for generating pose landmarks.
- Generating images with specified poses.

## Citation
Please cite our paper if you find the code useful for your research.
```
@inproceedings{ge2018fdgan,
title={FD-GAN: Pose-guided Feature Distilling GAN for Robust Person Re-identification},
author={Ge, Yixiao and Li, Zhuowan and Zhao, Haiyu and Yin, Guojun and Yi, Shuai and Wang, Xiaogang and Li, Hongsheng},
booktitle={Advances in Neural Information Processing Systems},
year={2018}
}
```

## Acknowledgements
Our code is inspired by [pytorch-CycleGAN-and-pix2pix](https://github.com/junyanz/pytorch-CycleGAN-and-pix2pix) and [open-reid](https://github.com/Cysu/open-reid).
200 changes: 200 additions & 0 deletions baseline.py
@@ -0,0 +1,200 @@
from __future__ import print_function, absolute_import
import argparse
import os.path as osp

import numpy as np
import os, sys
from bisect import bisect_right
import torch
from torch import nn
from torch.nn import functional as F
from torch.autograd import Variable
from torch.backends import cudnn
from torch.utils.data import DataLoader

from reid import datasets
from reid import models
from reid.utils.data import transforms as T
from reid.utils.data.preprocessor import Preprocessor
from reid.utils.logging import Logger
from reid.utils.serialization import load_checkpoint, save_checkpoint, copy_state_dict

from reid.utils.data.sampler import RandomPairSampler
from reid.models.embedding import EltwiseSubEmbed
from reid.models.multi_branch import SiameseNet
from reid.evaluators import CascadeEvaluator
from reid.trainers import SiameseTrainer

def get_data(name, split_id, data_dir, height, width, batch_size, workers,
             combine_trainval, np_ratio):
    root = osp.join(data_dir, name)

    dataset = datasets.create(name, root, split_id=split_id)

    normalizer = T.Normalize(mean=[0.485, 0.456, 0.406],
                             std=[0.229, 0.224, 0.225])

    train_set = dataset.trainval if combine_trainval else dataset.train

    train_transformer = T.Compose([
        T.RandomSizedRectCrop(height, width),
        T.RandomSizedEarser(),
        T.RandomHorizontalFlip(),
        T.ToTensor(),
        normalizer,
    ])

    test_transformer = T.Compose([
        T.RectScale(height, width),
        T.ToTensor(),
        normalizer,
    ])

    train_loader = DataLoader(
        Preprocessor(train_set, root=dataset.images_dir,
                     transform=train_transformer),
        sampler=RandomPairSampler(train_set, neg_pos_ratio=np_ratio),
        batch_size=batch_size, num_workers=workers, pin_memory=False)

    val_loader = DataLoader(
        Preprocessor(dataset.val, root=dataset.images_dir,
                     transform=test_transformer),
        batch_size=batch_size, num_workers=workers,
        shuffle=False, pin_memory=False)

    test_loader = DataLoader(
        Preprocessor(list(set(dataset.query) | set(dataset.gallery)),
                     root=dataset.images_dir, transform=test_transformer),
        batch_size=batch_size, num_workers=workers,
        shuffle=False, pin_memory=False)

    return dataset, train_loader, val_loader, test_loader


def main(args):
    np.random.seed(args.seed)
    torch.manual_seed(args.seed)
    cudnn.benchmark = True

    # Redirect print to both console and log file
    if not args.evaluate:
        sys.stdout = Logger(osp.join(args.logs_dir, 'log.txt'))
    else:
        log_dir = osp.dirname(args.resume)
        sys.stdout = Logger(osp.join(log_dir, 'log_test.txt'))
    # print("==========\nArgs:{}\n==========".format(args))

    # Create data loaders
    if args.height is None or args.width is None:
        args.height, args.width = (256, 128)
    dataset, train_loader, val_loader, test_loader = \
        get_data(args.dataset, args.split, args.data_dir, args.height,
                 args.width, args.batch_size, args.workers,
                 args.combine_trainval, args.np_ratio)

    # Create model
    base_model = models.create(args.arch, cut_at_pooling=True)
    embed_model = EltwiseSubEmbed(use_batch_norm=True, use_classifier=True,
                                  num_features=2048, num_classes=2)
    model = SiameseNet(base_model, embed_model)
    model = nn.DataParallel(model).cuda()

    # Evaluator
    evaluator = CascadeEvaluator(
        torch.nn.DataParallel(base_model).cuda(),
        embed_model,
        embed_dist_fn=lambda x: F.softmax(Variable(x), dim=1).data[:, 0])

    # Load from checkpoint
    best_mAP = 0
    if args.resume:
        checkpoint = load_checkpoint(args.resume)
        if 'state_dict' in checkpoint.keys():
            checkpoint = checkpoint['state_dict']
        model.load_state_dict(checkpoint)

        print("Test the loaded model:")
        top1, mAP = evaluator.evaluate(test_loader, dataset.query, dataset.gallery, rerank_topk=100, dataset=args.dataset)
        best_mAP = mAP

    if args.evaluate:
        return

    # Criterion
    criterion = nn.CrossEntropyLoss().cuda()
    # Optimizer
    param_groups = [
        {'params': model.module.base_model.parameters(), 'lr_mult': 1.0},
        {'params': model.module.embed_model.parameters(), 'lr_mult': 10.0}]
    optimizer = torch.optim.SGD(param_groups, args.lr, momentum=args.momentum,
                                weight_decay=args.weight_decay)
    # Trainer
    trainer = SiameseTrainer(model, criterion)

    # Schedule learning rate
    def adjust_lr(epoch):
        lr = args.lr * (0.1 ** (epoch // args.step_size))
        for g in optimizer.param_groups:
            g['lr'] = lr * g.get('lr_mult', 1)

    # Start training
    for epoch in range(0, args.epochs):
        adjust_lr(epoch)
        trainer.train(epoch, train_loader, optimizer, base_lr=args.lr)

        if epoch % args.eval_step == 0:
            mAP = evaluator.evaluate(val_loader, dataset.val, dataset.val, top1=False)
            is_best = mAP > best_mAP
            best_mAP = max(mAP, best_mAP)
            save_checkpoint({
                'state_dict': model.state_dict()
            }, is_best, fpath=osp.join(args.logs_dir, 'checkpoint.pth.tar'))

            print('\n * Finished epoch {:3d} mAP: {:5.1%} best: {:5.1%}{}\n'.
                  format(epoch, mAP, best_mAP, ' *' if is_best else ''))

    # Final test
    print('Test with best model:')
    checkpoint = load_checkpoint(osp.join(args.logs_dir, 'model_best.pth.tar'))
    model.load_state_dict(checkpoint['state_dict'])
    evaluator.evaluate(test_loader, dataset.query, dataset.gallery, dataset=args.dataset)


if __name__ == '__main__':
    parser = argparse.ArgumentParser(description="Siamese reID baseline")
    # data
    parser.add_argument('-d', '--dataset', type=str, default='market1501',
                        choices=datasets.names())
    parser.add_argument('-b', '--batch-size', type=int, default=256)
    parser.add_argument('-j', '--workers', type=int, default=4)
    parser.add_argument('--split', type=int, default=0)
    parser.add_argument('--height', type=int,
                        help="input height, default: 256 for resnet")
    parser.add_argument('--width', type=int,
                        help="input width, default: 128 for resnet")
    parser.add_argument('--combine-trainval', action='store_true',
                        help="train and val sets together for training, "
                             "val set alone for validation")
    # model
    parser.add_argument('-a', '--arch', type=str, default='resnet50',
                        choices=models.names())
    # optimizer
    parser.add_argument('--lr', type=float, default=0.01, help="learning rate")
    parser.add_argument('--np-ratio', type=int, default=3)
    parser.add_argument('--momentum', type=float, default=0.9)
    parser.add_argument('--weight-decay', type=float, default=5e-4)
    parser.add_argument('--step-size', type=int, default=40)
    # training configs
    parser.add_argument('--resume', type=str, default='', metavar='PATH')
    parser.add_argument('--evaluate', action='store_true',
                        help="evaluation only")
    parser.add_argument('--epochs', type=int, default=50)
    parser.add_argument('--eval-step', type=int, default=20, help="evaluation step")
    parser.add_argument('--seed', type=int, default=1)
    # misc
    working_dir = osp.dirname(osp.abspath(__file__))
    parser.add_argument('--data-dir', type=str, metavar='PATH',
                        default=osp.join(working_dir, 'datasets'))
    parser.add_argument('--logs-dir', type=str, metavar='PATH',
                        default=osp.join(working_dir, 'checkpoints'))
    main(parser.parse_args())
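
As a side note on the training loop above: `adjust_lr` implements a step decay that divides the base learning rate by 10 every `--step-size` epochs, then scales it by each parameter group's `lr_mult` (1x for the ResNet-50 backbone, 10x for the embedding head). A standalone sketch of the same rule, using the defaults `--lr 0.01` and `--step-size 40`:
```python
def lr_at(epoch, base_lr=0.01, step_size=40, lr_mult=1.0):
    # Same step-decay rule as adjust_lr in baseline.py.
    return base_lr * (0.1 ** (epoch // step_size)) * lr_mult

for epoch in (0, 39, 40, 80):
    # Backbone (lr_mult=1.0) vs. embedding head (lr_mult=10.0).
    print(epoch, lr_at(epoch), lr_at(epoch, lr_mult=10.0))
# Epochs 0-39 run at 0.01/0.1, epochs 40-79 at 0.001/0.01, and so on.
```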
Empty file added fdgan/__init__.py
Empty file.
32 changes: 32 additions & 0 deletions fdgan/losses.py
@@ -0,0 +1,32 @@
from __future__ import absolute_import
import os, sys
import functools
import random

import torch
import torch.nn as nn
from torch.autograd import Variable
from torch.nn import functional as F
from torch.nn import init

class GANLoss(nn.Module):
    def __init__(self, smooth=False):
        super(GANLoss, self).__init__()
        self.smooth = smooth

    def get_target_tensor(self, input, target_is_real):
        # Optionally smooth the targets (cf. --smooth-label): real labels are
        # drawn from [0.7, 1.0] and fake labels from [0.0, 0.3].
        real_label = 1.0
        fake_label = 0.0
        if self.smooth:
            real_label = random.uniform(0.7, 1.0)
            fake_label = random.uniform(0.0, 0.3)
        if target_is_real:
            target_tensor = torch.ones_like(input).fill_(real_label)
        else:
            target_tensor = torch.zeros_like(input).fill_(fake_label)
        return target_tensor

    def __call__(self, input, target_is_real):
        # Binary cross-entropy between sigmoid(scores) and the target labels.
        target_tensor = self.get_target_tensor(input, target_is_real)
        input = F.sigmoid(input)
        return F.binary_cross_entropy(input, target_tensor)
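
A minimal usage sketch for `GANLoss` as defined above; `d_out` stands in for raw (pre-sigmoid) discriminator scores, wrapped in a `Variable` as required under PyTorch 0.3:
```python
import torch
from torch.autograd import Variable
# from fdgan.losses import GANLoss

criterion_gan = GANLoss(smooth=True)      # smoothed targets: real in [0.7, 1.0], fake in [0.0, 0.3]

d_out = Variable(torch.randn(8, 1))       # raw discriminator scores for a batch of 8
loss_real = criterion_gan(d_out, True)    # drive scores toward the (smoothed) real label
loss_fake = criterion_gan(d_out, False)   # drive scores toward the (smoothed) fake label
loss_d = 0.5 * (loss_real + loss_fake)    # a common discriminator objective
```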
