Keras implementation of RetinaNet object detection as described in this paper by Tsung-Yi Lin, Priya Goyal, Ross Girshick, Kaiming He and Piotr Dollár.
An example of how to train keras-retinanet can be found here.
For training on Pascal VOC, run:
python examples/train_pascal.py <path to VOCdevkit/VOC2007>
For training on MS COCO, run:
python examples/train_coco.py <path to MS COCO>
In general, the steps to train on your own datasets are:
- Create a model by calling keras_retinanet.models.ResNet50RetinaNet and compile it. Empirically, the following compile arguments have been found to work well:
    model.compile(
        loss={
            'regression'    : keras_retinanet.losses.regression_loss,
            'classification': keras_retinanet.losses.focal_loss()
        },
        optimizer=keras.optimizers.adam(lr=1e-5, clipnorm=0.001)
    )
- Create generators for training and testing data (an example is shown in keras_retinanet.preprocessing.PascalVocIterator). These generators should generate an image batch (shaped (batch_id, height, width, channels)) and a target batch (shaped (batch_id, num_anchors, 5)). Currently, a limitation is that batch_size must be equal to 1.
- Use model.fit_generator to start training.
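To make the expected batch shapes concrete, here is a minimal sketch of a generator that yields correctly shaped placeholder data. The function name `dummy_generator` and its parameters are hypothetical and not part of keras-retinanet. The contents are zeros and random values, and the layout of the five values per anchor is left to the library's own iterators (such as PascalVocIterator).

```python
import numpy as np


def dummy_generator(num_anchors, height=600, width=1000, seed=0):
    """Yield (image_batch, target_batch) pairs with the shapes described above.

    Hypothetical stand-in for a real data generator: only the shapes and the
    fixed batch size of 1 are meaningful, the values are placeholders.
    """
    rng = np.random.RandomState(seed)
    while True:
        # Image batch: (batch_size, height, width, channels) with batch_size == 1.
        images = rng.uniform(0, 255, size=(1, height, width, 3)).astype(np.float32)
        # Target batch: (batch_size, num_anchors, 5). What the five values per
        # anchor encode is defined by the library and is not filled in here.
        targets = np.zeros((1, num_anchors, 5), dtype=np.float32)
        yield images, targets


# Pull a single batch and inspect its shapes.
images, targets = next(dummy_generator(num_anchors=10, height=4, width=6))
print(images.shape, targets.shape)
```

A real generator with these shapes would then be handed to model.fit_generator together with the number of steps per epoch.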
An example of testing the network can be seen in this Notebook. In general, output can be retrieved from the network as follows:
_, _, detections = model.predict_on_batch(inputs)
Where detections are the resulting detections, shaped (None, None, 4 + num_classes), with each detection laid out as (x1, y1, x2, y2, bg, cls1, cls2, ...).
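As a rough illustration of how this layout could be unpacked, the sketch below splits the detections for one image into boxes, labels and scores. The helper `filter_detections` and its `score_threshold` parameter are hypothetical and not part of keras-retinanet. It assumes the column order stated above, with the background score in column 4 and per-class scores after it.

```python
import numpy as np


def filter_detections(detections, score_threshold=0.5):
    """Split one image's detections into boxes, labels and scores.

    Assumes rows laid out as (x1, y1, x2, y2, bg, cls1, cls2, ...).
    """
    detections = np.asarray(detections)
    boxes = detections[:, :4]
    class_scores = detections[:, 5:]           # skip the background column
    labels = np.argmax(class_scores, axis=1)   # 0-based index among cls1, cls2, ...
    scores = class_scores[np.arange(len(labels)), labels]
    keep = scores >= score_threshold           # drop low-confidence rows
    return boxes[keep], labels[keep], scores[keep]


# Three made-up detections with two foreground classes.
example = np.array([
    [0, 0, 10, 10, 0.1, 0.8, 0.1],
    [5, 5, 20, 20, 0.9, 0.05, 0.05],
    [1, 2, 3, 4, 0.2, 0.1, 0.7],
])
boxes, labels, scores = filter_detections(example)
print(boxes, labels, scores)
```

With the output of model.predict_on_batch, the first image's detections would be passed as `detections[0]`. The threshold of 0.5 is an arbitrary choice for illustration.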
Execution time on an NVIDIA Pascal Titan X is roughly 55 ms for an image of shape 1000x600x3.
The MS COCO model can be downloaded here. Results using the cocoapi are shown below (note: according to the paper, this configuration should achieve a mAP of 0.34).
Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.306
Average Precision (AP) @[ IoU=0.50 | area= all | maxDets=100 ] = 0.485
Average Precision (AP) @[ IoU=0.75 | area= all | maxDets=100 ] = 0.323
Average Precision (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.131
Average Precision (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.336
Average Precision (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.439
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 1 ] = 0.274
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 10 ] = 0.410
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.422
Average Recall (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.217
Average Recall (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.467
Average Recall (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.591
- The examples show how to train keras-retinanet on Pascal VOC and MS COCO. Example output images are shown below.
- Allow batch_size > 1.
- Compare results with those reported in the paper.
- Configure CI.
- This repository is tested on Keras version 2.0.8, but should also work on 2.0.7.
- This repository is tested using OpenCV 3.3 (3.0+ should be supported).
Contributions to this project are welcome.
Feel free to join the #keras-retinanet Keras Slack channel for discussions and questions.