initial commit
yxgeee committed Oct 9, 2018
0 parents commit a1ca28a
Showing 37 changed files with 2,654 additions and 0 deletions.
15 changes: 15 additions & 0 deletions .gitignore
@@ -0,0 +1,15 @@
120 changes: 120 additions & 0 deletions
@@ -0,0 +1,120 @@
Python 3 | Pytorch 0.3
# FD-GAN: Pose-guided Feature Distilling GAN for Robust Person Re-identification

<p align="center"><img src='framework.jpg' width="600px"></p>


Yixiao Ge\*, Zhuowan Li\*, Haiyu Zhao, Guojun Yin, Shuai Yi, Xiaogang Wang, and Hongsheng Li
Neural Information Processing Systems (**NIPS**), 2018 (* equal contribution)

Pytorch implementation for our NIPS 2018 work. With the proposed siamese structure, we are able to learn **identity-related** and **pose-unrelated** representations.

## Prerequisites
- Python 3
- Pytorch (we ran the code under version 0.3.1; lower versions may also work)

## Getting Started

### Installation
- Install dependencies (e.g., visdom and dominate). You can install all the dependencies with:
pip install scipy pillow torchvision scikit-learn h5py dominate visdom
- Clone this repo:
git clone
cd FD-GAN/

### Datasets
We conduct experiments on the Market1501, DukeMTMC, and CUHK03 datasets. Training requires pose landmarks for each dataset, so we generated the pose files with Realtime Multi-Person Pose Estimation. The raw datasets were preprocessed with the code in open-reid.
Download the prepared datasets by following the steps below:
- Create directories for datasets:
mkdir datasets
cd datasets/
- Download the datasets through the links above and `unzip` them into the same root directory.
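
Before training, it can help to confirm that each dataset has its own folder under `datasets/`. The sketch below assumes the unzipped folders are named after the values accepted by `-d` (`market1501`, `dukemtmc`, `cuhk03`); that naming is an assumption, and the helper `missing_datasets` is hypothetical rather than part of this repo.

```python
import os.path as osp

# Assumption: folder names mirror the dataset names accepted by `-d`.
EXPECTED = ("market1501", "dukemtmc", "cuhk03")


def missing_datasets(data_dir, names=EXPECTED):
    """Return the dataset names that have no folder under data_dir."""
    return [name for name in names if not osp.isdir(osp.join(data_dir, name))]


if __name__ == "__main__":
    absent = missing_datasets("datasets")
    print("missing:", absent if absent else "none")
```

Running it from the repo root lists any dataset folder that still needs to be downloaded and unzipped.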

## Usage
As described in the original paper, our proposed framework is trained in three stages.

### Stage I: reID baseline pretraining
We use a Siamese baseline structure based on `ResNet-50`. You can train the model with the following command:
python -b 256 -j 4 -d market1501 -a resnet50 --combine-trainval \
--lr 0.01 --epochs 100 --step-size 40 --eval-step 5 \
--logs-dir /path/to/save/checkpoints/
You can train on specific GPUs by setting `CUDA_VISIBLE_DEVICES`, and train on a different dataset by changing the name after `-d` to one of `[market1501|dukemtmc|cuhk03]`.
Alternatively, you can download the pretrained baseline models directly from the links below:
- [Market1501_baseline_model](
- [DukeMTMC_baseline_model](
- [CUHK03_baseline_model](

<a name="stageI"></a>You can **test** them with the following command:
python -b 256 -d market1501 -a resnet50 --evaluate --resume /path/of/model_best.pth.tar
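
The Stage I flags `--lr 0.01 --step-size 40` describe a step decay: the base learning rate is multiplied by 0.1 every 40 epochs. The baseline script also scales each parameter group by an `lr_mult` factor (1.0 for the backbone, 10.0 for the embedding head). The sketch below shows the resulting values; the function name `step_lr` is illustrative only and does not exist in the repo.

```python
def step_lr(base_lr, epoch, step_size, lr_mult=1.0):
    """Learning rate at `epoch` under a 10x step decay every `step_size` epochs."""
    return base_lr * (0.1 ** (epoch // step_size)) * lr_mult


if __name__ == "__main__":
    # Values taken from the Stage I command: --lr 0.01 --step-size 40 --epochs 100
    for epoch in (0, 39, 40, 80, 99):
        backbone = step_lr(0.01, epoch, 40, lr_mult=1.0)
        head = step_lr(0.01, epoch, 40, lr_mult=10.0)
        print("epoch {:3d}  backbone lr {:.6f}  embedding lr {:.6f}".format(
            epoch, backbone, head))
```

With 100 epochs and a step size of 40, the rate drops twice, at epochs 40 and 80.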

### Stage II: FD-GAN pretraining
We first need to pretrain FD-GAN with the image encoder (*E* in the original paper and *net_E* in the code) fixed. You can train the model with the following command:
python --display-port 6006 --display-id 1 \
--stage 1 -d market1501 --name /directory/name/of/saving/checkpoints/ \
--pose-aug gauss -b 256 -j 4 --niter 50 --niter-decay 50 --lr 0.001 --save-step 10 \
--lambda-recon 100.0 --lambda-veri 0.0 --lambda-sp 10.0 --smooth-label \
--netE-pretrain /path/of/model_best.pth.tar
You can train on specific GPUs by setting `CUDA_VISIBLE_DEVICES`. The main arguments are:
- `--display-port`: the visdom display port; e.g., with port `6006` you can view the results at `localhost:6006`.
- `--display-id`: set to `0` to disable visdom.
- `--stage`: set to `1` for Stage II and to `2` for Stage III.
- `--pose-aug`: choose from `[no|erase|gauss]` to apply augmentation to the pose maps.
- `--smooth-label`: whether to smooth the target labels of the GAN loss.
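
To illustrate `--smooth-label`: the GAN loss in this repo replaces the hard targets 1 and 0 with values drawn uniformly from [0.7, 1.0] for real samples and [0.0, 0.3] for fake ones, then applies a sigmoid followed by binary cross-entropy. The sketch below reproduces that idea for a single logit in plain Python so it runs without PyTorch. The function names are illustrative; the repo's `GANLoss` works on whole tensors.

```python
import math
import random


def gan_target(target_is_real, smooth=False, rng=random):
    """Target label: hard (1.0 / 0.0) or, if smooth, drawn from a narrow range."""
    if smooth:
        return rng.uniform(0.7, 1.0) if target_is_real else rng.uniform(0.0, 0.3)
    return 1.0 if target_is_real else 0.0


def bce_with_sigmoid(logit, target, eps=1e-12):
    """Binary cross-entropy between sigmoid(logit) and a (possibly soft) target."""
    p = 1.0 / (1.0 + math.exp(-logit))
    return -(target * math.log(p + eps) + (1.0 - target) * math.log(1.0 - p + eps))


if __name__ == "__main__":
    logit = 2.0  # discriminator output for one sample
    for smooth in (False, True):
        target = gan_target(True, smooth=smooth)
        print("smooth={}  target={:.3f}  loss={:.4f}".format(
            smooth, target, bce_with_sigmoid(logit, target)))
```

Soft targets keep the discriminator from being pushed toward fully saturated outputs, which is the usual motivation for label smoothing in GAN training.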

Other arguments can be viewed in [](
You can also download the Stage II models directly:
- [Market1501_stageII_model](
- [DukeMTMC_stageII_model](
- [CUHK03_stageII_model](

Each directory contains four models, one for each network.
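
Regarding the `--niter 50 --niter-decay 50` flags in the Stage II command: the names follow pytorch-CycleGAN-and-pix2pix, where the learning rate is held constant for `niter` epochs and then decays linearly toward zero over `niter_decay` epochs. Assuming this repo keeps that convention (an assumption, not confirmed here), the multiplier applied to `--lr` would look roughly like the sketch below. The function name is hypothetical, and the exact off-by-one handling varies between implementations.

```python
def linear_decay_factor(epoch, niter, niter_decay):
    """Multiplier for the base lr: 1.0 for the first `niter` epochs, then a
    linear ramp down that reaches 0.0 after `niter_decay` more epochs."""
    return 1.0 - max(0, epoch - niter) / float(niter_decay)


if __name__ == "__main__":
    base_lr = 0.001  # --lr from the Stage II command
    for epoch in (0, 25, 50, 75, 99):
        factor = linear_decay_factor(epoch, niter=50, niter_decay=50)
        print("epoch {:3d}  lr {:.6f}".format(epoch, base_lr * factor))
```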

### Stage III: Global finetuning
Finetune the whole framework by optimizing all parts. You can train the model with the following command:
python --display-port 6006 --display-id 1 \
--stage 2 -d market1501 --name /directory/name/of/saving/checkpoints/ \
--pose-aug gauss -b 256 -j 4 --niter 25 --niter-decay 25 --lr 0.0001 --save-step 10 --eval-step 5 \
--lambda-recon 100.0 --lambda-veri 10.0 --lambda-sp 10.0 --smooth-label \
--netE-pretrain /path/of/100_net_E.pth --netG-pretrain /path/of/100_net_G.pth \
--netDi-pretrain /path/of/100_net_Di.pth --netDp-pretrain /path/of/100_net_Dp.pth
You can train it on specified GPUs by setting `CUDA_VISIBLE_DEVICES`.
We trained this model with a batch size of 256. If your hardware cannot accommodate that, you can reduce the batch size, although performance may drop.
Alternatively, you can download our final models directly:
- [Market1501_stageIII_model](
- [DukeMTMC_stageIII_model](
- [CUHK03_stageIII_model](

You can **test** `best_net_E.pth` in the same way as described in [Stage I](#stageI).
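
The Stage III command above expects one pretrained checkpoint per network, named `<epoch>_net_<name>.pth` as in `100_net_E.pth`. To avoid typing four paths by hand, the flags can be assembled programmatically. This is only a sketch: `stage3_pretrain_args` is a hypothetical helper, and the checkpoint directory is whatever location the Stage II run saved to.

```python
import os.path as osp

# Network suffixes as they appear in the Stage III command.
NETS = ("E", "G", "Di", "Dp")


def stage3_pretrain_args(ckpt_dir, epoch=100):
    """Build the --net*-pretrain flags for checkpoints saved at `epoch`."""
    args = []
    for net in NETS:
        args += ["--net{}-pretrain".format(net),
                 osp.join(ckpt_dir, "{}_net_{}.pth".format(epoch, net))]
    return args


if __name__ == "__main__":
    print(" ".join(stage3_pretrain_args("/path/of")))
```

The returned list can be appended to the rest of the Stage III arguments, for example when launching the run through `subprocess`.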

- Scripts for generating pose landmarks.
- Generating specified images.

## Citation
Please cite our paper if you find the code useful for your research.
title={FD-GAN: Pose-guided Feature Distilling GAN for Robust Person Re-identification},
author={Ge, Yixiao and Li, Zhuowan and Zhao, Haiyu and Yin, Guojun and Wang, Xiaogang and Li, Hongsheng},
booktitle={Advances in Neural Information Processing Systems},

## Acknowledgements
Our code is inspired by pytorch-CycleGAN-and-pix2pix and open-reid.
200 changes: 200 additions & 0 deletions
@@ -0,0 +1,200 @@
from __future__ import print_function, absolute_import
import argparse
import os.path as osp

import numpy as np
import os, sys
from bisect import bisect_right
import torch
from torch import nn
from torch.nn import functional as F
from torch.autograd import Variable
from torch.backends import cudnn
from torch.utils.data import DataLoader

from reid import datasets
from reid import models
from reid.utils.data import transforms as T
from reid.utils.data.preprocessor import Preprocessor
from reid.utils.logging import Logger
from reid.utils.serialization import load_checkpoint, save_checkpoint, copy_state_dict

from reid.utils.data.sampler import RandomPairSampler
from reid.models.embedding import EltwiseSubEmbed
from reid.models.multi_branch import SiameseNet
from reid.evaluators import CascadeEvaluator
from reid.trainers import SiameseTrainer

def get_data(name, split_id, data_dir, height, width, batch_size, workers,
             combine_trainval, np_ratio):
    root = osp.join(data_dir, name)

    dataset = datasets.create(name, root, split_id=split_id)

    # ImageNet channel statistics
    normalizer = T.Normalize(mean=[0.485, 0.456, 0.406],
                             std=[0.229, 0.224, 0.225])

    train_set = dataset.trainval if combine_trainval else dataset.train

    train_transformer = T.Compose([
        T.RandomSizedRectCrop(height, width),
        # Assumed tail of the pipeline: tensor conversion, then normalization.
        T.ToTensor(),
        normalizer,
    ])

    test_transformer = T.Compose([
        T.RectScale(height, width),
        # Assumed tail of the pipeline: tensor conversion, then normalization.
        T.ToTensor(),
        normalizer,
    ])

    # Training batches are built from positive/negative image pairs.
    train_loader = DataLoader(
        Preprocessor(train_set, root=dataset.images_dir,
                     transform=train_transformer),
        sampler=RandomPairSampler(train_set, neg_pos_ratio=np_ratio),
        batch_size=batch_size, num_workers=workers, pin_memory=False)

    val_loader = DataLoader(
        Preprocessor(dataset.val, root=dataset.images_dir,
                     transform=test_transformer),
        batch_size=batch_size, num_workers=workers,
        shuffle=False, pin_memory=False)

    # The test loader covers the union of query and gallery images.
    test_loader = DataLoader(
        Preprocessor(list(set(dataset.query) | set(dataset.gallery)),
                     root=dataset.images_dir, transform=test_transformer),
        batch_size=batch_size, num_workers=workers,
        shuffle=False, pin_memory=False)

    return dataset, train_loader, val_loader, test_loader

def main(args):
    cudnn.benchmark = True

    # Redirect print to both console and log file
    if not args.evaluate:
        sys.stdout = Logger(osp.join(args.logs_dir, 'log.txt'))
    else:
        log_dir = osp.dirname(args.resume)
        sys.stdout = Logger(osp.join(log_dir, 'log_test.txt'))
    # print("==========\nArgs:{}\n==========".format(args))

    # Create data loaders
    if args.height is None or args.width is None:
        args.height, args.width = (256, 128)
    dataset, train_loader, val_loader, test_loader = \
        get_data(args.dataset, args.split, args.data_dir, args.height,
                 args.width, args.batch_size, args.workers,
                 args.combine_trainval, args.np_ratio)

    # Create model
    base_model = models.create(args.arch, cut_at_pooling=True)
    embed_model = EltwiseSubEmbed(use_batch_norm=True, use_classifier=True,
                                  num_features=2048, num_classes=2)
    model = SiameseNet(base_model, embed_model)
    model = nn.DataParallel(model).cuda()

    # Evaluator: column 0 of the 2-way softmax is used as the pair distance
    evaluator = CascadeEvaluator(
        embed_dist_fn=lambda x: F.softmax(Variable(x), dim=1).data[:, 0])

    # Load from checkpoint
    best_mAP = 0
    if args.resume:
        checkpoint = load_checkpoint(args.resume)
        if 'state_dict' in checkpoint.keys():
            checkpoint = checkpoint['state_dict']
        model.load_state_dict(checkpoint)  # assumed: apply the loaded weights

        print("Test the loaded model:")
        top1, mAP = evaluator.evaluate(test_loader, dataset.query, dataset.gallery,
                                       rerank_topk=100, dataset=args.dataset)
        best_mAP = mAP

    if args.evaluate:
        return

    # Criterion
    criterion = nn.CrossEntropyLoss().cuda()
    # Optimizer: the embedding head uses a 10x larger learning rate
    param_groups = [
        {'params': model.module.base_model.parameters(), 'lr_mult': 1.0},
        {'params': model.module.embed_model.parameters(), 'lr_mult': 10.0}]
    optimizer = torch.optim.SGD(param_groups, args.lr, momentum=args.momentum,
                                weight_decay=args.weight_decay)
    # Trainer
    trainer = SiameseTrainer(model, criterion)

    # Schedule learning rate: decay by 10x every `step_size` epochs
    def adjust_lr(epoch):
        lr = args.lr * (0.1 ** (epoch // args.step_size))
        for g in optimizer.param_groups:
            g['lr'] = lr * g.get('lr_mult', 1)

    # Start training
    for epoch in range(0, args.epochs):
        adjust_lr(epoch)
        trainer.train(epoch, train_loader, optimizer,
                      base_lr=args.lr)  # keyword name assumed

        if epoch % args.eval_step == 0:
            mAP = evaluator.evaluate(val_loader, dataset.val, dataset.val, top1=False)
            is_best = mAP > best_mAP
            best_mAP = max(mAP, best_mAP)
            save_checkpoint({
                'state_dict': model.state_dict()
            }, is_best, fpath=osp.join(args.logs_dir, 'checkpoint.pth.tar'))

            print('\n * Finished epoch {:3d} mAP: {:5.1%} best: {:5.1%}{}\n'.
                  format(epoch, mAP, best_mAP, ' *' if is_best else ''))

    # Final test
    print('Test with best model:')
    checkpoint = load_checkpoint(osp.join(args.logs_dir, 'model_best.pth.tar'))
    model.load_state_dict(checkpoint['state_dict'])  # assumed: apply the best weights
    evaluator.evaluate(test_loader, dataset.query, dataset.gallery, dataset=args.dataset)

if __name__ == '__main__':
    parser = argparse.ArgumentParser(description="Siamese reID baseline")
    # data
    parser.add_argument('-d', '--dataset', type=str, default='market1501')
    parser.add_argument('-b', '--batch-size', type=int, default=256)
    parser.add_argument('-j', '--workers', type=int, default=4)
    parser.add_argument('--split', type=int, default=0)
    parser.add_argument('--height', type=int,
                        help="input height, default: 256 for resnet")
    parser.add_argument('--width', type=int,
                        help="input width, default: 128 for resnet")
    parser.add_argument('--combine-trainval', action='store_true',
                        help="train and val sets together for training, "
                             "val set alone for validation")
    # model
    parser.add_argument('-a', '--arch', type=str, default='resnet50')
    # optimizer
    parser.add_argument('--lr', type=float, default=0.01, help="learning rate")
    parser.add_argument('--np-ratio', type=int, default=3)
    parser.add_argument('--momentum', type=float, default=0.9)
    parser.add_argument('--weight-decay', type=float, default=5e-4)
    parser.add_argument('--step-size', type=int, default=40)
    # training configs
    parser.add_argument('--resume', type=str, default='', metavar='PATH')
    parser.add_argument('--evaluate', action='store_true',
                        help="evaluation only")
    parser.add_argument('--epochs', type=int, default=50)
    parser.add_argument('--eval-step', type=int, default=20, help="evaluation step")
    parser.add_argument('--seed', type=int, default=1)
    # misc
    working_dir = osp.dirname(osp.abspath(__file__))
    parser.add_argument('--data-dir', type=str, metavar='PATH',
                        default=osp.join(working_dir, 'datasets'))
    parser.add_argument('--logs-dir', type=str, metavar='PATH',
                        default=osp.join(working_dir, 'checkpoints'))
    main(parser.parse_args())
Empty file added fdgan/
Empty file.
32 changes: 32 additions & 0 deletions fdgan/
@@ -0,0 +1,32 @@
from __future__ import absolute_import
import os, sys
import functools
import random

import torch
import torch.nn as nn
from torch.autograd import Variable
from torch.nn import functional as F
from torch.nn import init

class GANLoss(nn.Module):
    """Sigmoid + binary cross-entropy GAN loss with optional label smoothing."""

    def __init__(self, smooth=False):
        super(GANLoss, self).__init__()
        self.smooth = smooth

    def get_target_tensor(self, input, target_is_real):
        real_label = 1.0
        fake_label = 0.0
        if self.smooth:
            # Draw one soft label per call instead of using hard 1.0 / 0.0
            real_label = random.uniform(0.7, 1.0)
            fake_label = random.uniform(0.0, 0.3)
        if target_is_real:
            target_tensor = torch.ones_like(input).fill_(real_label)
        else:
            target_tensor = torch.zeros_like(input).fill_(fake_label)
        return target_tensor

    def __call__(self, input, target_is_real):
        target_tensor = self.get_target_tensor(input, target_is_real)
        # `input` holds raw logits; squash to (0, 1) before the BCE
        input = F.sigmoid(input)
        return F.binary_cross_entropy(input, target_tensor)
