This project is a PyTorch implementation of RetinaNet. While implementing it, I referred to several existing implementations to make this project work:
- kuangliu/pytorch-retinanet: provides the main scripts for training RetinaNet, but does not report training results.
- fizyr/keras-retinanet: provides the complete training, testing, and evaluation pipeline, but is based on Keras.
- roytseng-tw/Detectron.pytorch: an implementation of Detectron based on PyTorch, but it does not support RetinaNet at the moment.
This implementation has the following features:
- Multi-image batch training. The original sampler is replaced with `MinibatchSampler` to support images of different sizes within a minibatch.
- Multi-GPU training. The original `DataParallel` in PyTorch is modified to work with the minibatch-based dataset.
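The minibatch-sampler idea above can be sketched in plain Python. This is an illustrative sketch, not the repo's exact `MinibatchSampler` API: it yields fixed-size groups of dataset indices so that each group can be collated and resized as one minibatch.

```python
import random

class MinibatchSampler(object):
    """Illustrative sketch: yield groups of `img_per_minibatch` dataset
    indices so each group can be padded/resized together as one minibatch.
    (Constructor arguments are assumptions, not the repo's exact API.)"""

    def __init__(self, num_samples, img_per_minibatch, shuffle=True):
        self.num_samples = num_samples
        self.img_per_minibatch = img_per_minibatch
        self.shuffle = shuffle

    def __iter__(self):
        indices = list(range(self.num_samples))
        if self.shuffle:
            random.shuffle(indices)
        # drop the trailing partial group so every minibatch has equal size
        for i in range(0,
                       self.num_samples - self.img_per_minibatch + 1,
                       self.img_per_minibatch):
            yield indices[i:i + self.img_per_minibatch]

    def __len__(self):
        return self.num_samples // self.img_per_minibatch

sampler = MinibatchSampler(num_samples=10, img_per_minibatch=3)
batches = list(sampler)
```

Grouping at the sampler level is what lets each minibatch pick its own image size while the overall batch still spans multiple minibatches across GPUs.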
Evaluated with the COCO API, the trained model reaches an AP of 33.1, which is comparable to the 34.0 reported in the original paper. The detailed AP/AR numbers are as follows:
Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.331
Average Precision (AP) @[ IoU=0.50 | area= all | maxDets=100 ] = 0.500
Average Precision (AP) @[ IoU=0.75 | area= all | maxDets=100 ] = 0.353
Average Precision (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.154
Average Precision (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.371
Average Precision (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.459
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 1 ] = 0.279
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 10 ] = 0.429
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.459
Average Recall (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.238
Average Recall (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.520
Average Recall (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.598
Clone the repo:
git clone https://github.com/wsnedy/pytorch-retinanet.git
Tested under Python 2.7.
- python packages
- pytorch=0.3.1
- torchvision=0.2.0
- matplotlib
- numpy
- opencv
- pycocotools: needed for the COCO dataset, also available from pip.
- An NVIDIA GPU and CUDA 8.0 or higher. Some operations only have GPU implementations.
Download the pretrained ResNet50 parameters from the following URL:
mkdir pretrained_model
cd pretrained_model
wget https://download.pytorch.org/models/resnet50-19c8e357.pth
mv resnet50-19c8e357.pth resnet50.pth
Get the pretrained RetinaNet by running the script:
cd network
python get_state_dict.py
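A script like this typically copies the pretrained ResNet50 backbone weights into the RetinaNet state dict under the model's own key names. The sketch below shows the key-remapping idea with toy dictionaries; the `fpn.` prefix and all key names are illustrative assumptions, not the repo's actual layout.

```python
# Toy stand-ins for state dicts (real ones map parameter names to tensors).
# All key names below are illustrative assumptions.
resnet_state = {
    'conv1.weight': 'w_conv1',
    'layer1.0.conv1.weight': 'w_layer1',
    'fc.weight': 'w_fc',  # classifier head, not reused by the detector
}
retinanet_state = {
    'fpn.conv1.weight': None,
    'fpn.layer1.0.conv1.weight': None,
    'cls_head.weight': None,  # detection head, trained from scratch
}

# Copy every backbone parameter whose name (minus the assumed 'fpn.'
# prefix) exists in the pretrained ResNet50 state dict.
for key in retinanet_state:
    src = key[len('fpn.'):] if key.startswith('fpn.') else None
    if src in resnet_state:
        retinanet_state[key] = resnet_state[src]
```

With real models the same loop would run over `torch.load('resnet50.pth')` and `model.state_dict()`, leaving the detection-head parameters untouched.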
Download the COCO images and annotations from the COCO website, and arrange the files in the following structure:
coco
├── annotations
│   ├── instances_minival2014.json
│   ├── instances_train2014.json
│   ├── instances_val2014.json
│   ├── instances_valminusminival2014.json
│   ├── ...
│
└── images
    ├── train2014
    ├── val2014
    ├── ...
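Before training, it can help to check that the expected files are actually in place. This is a small hypothetical helper (not part of the repo) that returns whichever of the paths from the tree above are missing under a given root:

```python
import os

def check_coco_layout(root):
    """Return the expected COCO paths (from the tree above) that are
    missing under `root`. Helper is illustrative, not part of the repo."""
    expected = [
        os.path.join(root, 'annotations', 'instances_train2014.json'),
        os.path.join(root, 'annotations', 'instances_minival2014.json'),
        os.path.join(root, 'images', 'train2014'),
        os.path.join(root, 'images', 'val2014'),
    ]
    return [p for p in expected if not os.path.exists(p)]

# On a machine without the dataset, all four paths are reported missing.
missing = check_coco_layout('/no/such/coco')
```

Point it at your own data root; an empty list means the layout matches.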
When training, change the root path to your own data path.
The hyper-parameters currently live directly in the scripts; moving them all into a config file is planned. The settings are as follows:
- Multi-GPU: all available GPUs are used by default. Change `device_ids` in `bin/train.py` if you want specific GPUs.
- Batch size: I use `batch_size = 24`. To change it, you have to edit two places in `bin/train.py`: `batch_size = 24` and `iteration_per_epoch = int(len(dataloader) / 24.)`.
- Images per minibatch: I use `img_per_minibatch = 3` to reach `batch_size = 24`; change it if you want a different minibatch size.
- To do: put all the hyper-parameters into a config file.
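Since a config file is planned anyway, the settings above could be collected into one place like this. The dictionary below is a hypothetical sketch (names are assumptions), with a divisibility check that the batch size splits evenly into minibatches:

```python
# Hypothetical config collecting the hyper-parameters mentioned above in
# one place (the repo currently hard-codes them in bin/train.py).
CONFIG = {
    'batch_size': 24,
    'img_per_minibatch': 3,
    'device_ids': None,   # None -> use all available GPUs
}

# batch_size must be divisible by img_per_minibatch so the batch splits
# into whole minibatches.
assert CONFIG['batch_size'] % CONFIG['img_per_minibatch'] == 0
num_minibatches = CONFIG['batch_size'] // CONFIG['img_per_minibatch']
```

With the defaults above, each training step processes 8 minibatches of 3 images.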
Train RetinaNet with the following command; after each epoch it runs an evaluation on the minival2014 dataset:
python bin/train.py
To resume from a checkpoint, use the command below. If you have more than one checkpoint, edit `checkpoint = torch.load('../checkpoint/ckpt.pth')` in `train.py` to load a different one:
python bin/train.py --resume
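Instead of editing the hard-coded `'ckpt.pth'` path by hand, one option is to pick the newest checkpoint by file name. The helper below is an illustrative sketch, and the `ckpt_epoch<N>.pth` naming pattern is an assumption, not what the repo actually writes:

```python
import re

def latest_checkpoint(names, pattern=r'ckpt_epoch(\d+)\.pth$'):
    """Return the checkpoint file name with the highest epoch number,
    or None if nothing matches. The naming pattern is an assumption."""
    best_name, best_epoch = None, -1
    for name in names:
        m = re.match(pattern, name)
        if m and int(m.group(1)) > best_epoch:
            best_epoch, best_name = int(m.group(1)), name
    return best_name

# In practice the list would come from os.listdir('../checkpoint').
best = latest_checkpoint(['ckpt_epoch1.pth', 'ckpt_epoch12.pth',
                          'ckpt_epoch3.pth'])
```

The returned name can then be passed to `torch.load` in place of the fixed path.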
- To do: `demo.py`.
Only COCO is supported for now; other datasets should work with minor changes in `datasets`.
- To do: support the `VOC` dataset.