Keras implementation of RetinaNet object detection as described in this paper by Tsung-Yi Lin, Priya Goyal, Ross Girshick, Kaiming He and Piotr Dollár.
- Clone this repository.
- In the repository, execute `pip install .`. Note that due to inconsistencies in how `tensorflow` should be installed, this package does not define a dependency on `tensorflow`, as that would make `pip` try to install it (which, at least on Arch Linux, results in an incorrect installation). Please make sure `tensorflow` is installed as per your system's requirements. Also, make sure Keras 2.0.9 or above is installed, as this package uses some features of 2.0.9.
- As of writing, this repository requires the `master` version of `keras-resnet` for freezing `BatchNormalization` layers (i.e. clone the `keras-resnet` repository and run `pip install .` in that repository).
- Optionally, install `pycocotools` if you want to train / test on the MS COCO dataset. Clone the `cocoapi` repository and, inside the `PythonAPI` folder, execute `pip install .`.
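Because `tensorflow` is not declared as a dependency, it can help to check the environment by hand before training. The following is a minimal sketch and not part of this repository: the helpers `parse_version`, `meets_minimum` and `check_module` are hypothetical names, and the sketch assumes that each module exposes a dotted `__version__` string such as `2.0.9`.

```python
import importlib


def parse_version(version):
    """Turn a dotted version string such as '2.0.9' into a tuple of ints.

    Trailing non-numeric characters in a component (e.g. the 'rc1' in
    '2.1.0rc1') are ignored; parsing stops at the first component that
    has no leading digit.
    """
    parts = []
    for piece in version.split('.'):
        digits = ''
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def meets_minimum(installed, minimum):
    """Return True if `installed` is at least `minimum` (tuple comparison)."""
    return parse_version(installed) >= parse_version(minimum)


def check_module(name, minimum):
    """Import `name` and report whether its __version__ is at least `minimum`."""
    try:
        module = importlib.import_module(name)
    except ImportError:
        return False, '{} is not installed'.format(name)
    version = getattr(module, '__version__', None)
    if version is None:
        return False, '{} does not expose __version__'.format(name)
    ok = meets_minimum(version, minimum)
    relation = '>=' if ok else '<'
    return ok, '{} {} ({} {})'.format(name, version, relation, minimum)
```

For example, `check_module('keras', '2.0.9')` returns a `(bool, message)` pair that can be printed or asserted on in a setup script.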
An example of how to train `keras-retinanet` can be found here.
For training on Pascal VOC, run:
python examples/train_pascal.py <path to VOCdevkit/VOC2007>
For training on MS COCO, run:
python examples/train_coco.py <path to MS COCO>
In general, the steps to train on your own datasets are:
- Create a model by calling `keras_retinanet.models.ResNet50RetinaNet` and compile it. Empirically, the following compile arguments have been found to work well:
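import keras
import keras_retinanet.losses

# `model` is the network created in the previous step.
model.compile(
    loss={
        'regression'    : keras_retinanet.losses.regression_loss,
        'classification': keras_retinanet.losses.focal_loss()
    },
    # Small learning rate with gradient-norm clipping.
    optimizer=keras.optimizers.Adam(lr=1e-5, clipnorm=0.001)
)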
- Create generators for training and testing data (an example is shown in `keras_retinanet.preprocessing.PascalVocIterator`). These generators should generate an image batch (shaped `(batch_id, height, width, channels)`) and a target batch (shaped `(batch_id, num_anchors, 4 + num_classes)`). Currently, a limitation is that `batch_size` must be equal to `1`.
- Use `model.fit_generator` to start training.
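To make the expected batch shapes concrete, here is a minimal sketch of a generator that yields placeholder data. It is not the repository's `PascalVocIterator`: the name `dummy_generator`, the random values, and the assumed target layout (four box values followed by one-hot class labels per anchor) are illustrative assumptions only.

```python
import numpy as np


def dummy_generator(num_anchors, num_classes, height=600, width=1000, seed=0):
    """Yield (image_batch, target_batch) pairs forever with batch_size == 1.

    All values are random placeholders. A real generator would load images
    and compute the per-anchor targets from ground-truth annotations.
    """
    rng = np.random.RandomState(seed)
    batch_size = 1  # current limitation: batch_size must be 1
    while True:
        # Image batch shaped (batch, height, width, channels).
        image_batch = rng.uniform(
            0, 255, size=(batch_size, height, width, 3)
        ).astype(np.float32)

        # Target batch shaped (batch, num_anchors, 4 + num_classes).
        target_batch = np.zeros(
            (batch_size, num_anchors, 4 + num_classes), dtype=np.float32
        )
        # Assumed layout: columns 0..3 hold box targets (left at zero here),
        # the remaining columns hold class labels. Mark one random anchor
        # with one random class so the batch is not entirely background.
        anchor = rng.randint(num_anchors)
        cls = rng.randint(num_classes)
        target_batch[0, anchor, 4 + cls] = 1.0

        yield image_batch, target_batch
```

Calling `next(dummy_generator(num_anchors=10, num_classes=3))` returns one image batch and one target batch with the shapes described in the steps above, which is a quick way to check a data pipeline before passing a real generator to `model.fit_generator`.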
An example of testing the network can be seen in this Notebook. In general, output can be retrieved from the network as follows:
_, _, detections = model.predict_on_batch(inputs)
Where `detections` are the resulting detections, shaped `(None, None, 4 + num_classes)` (for `(x1, y1, x2, y2, cls1, cls2, ...)`).
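Given that layout, the boxes and per-class scores can be separated with plain NumPy slicing. The sketch below is an illustration rather than code from this repository: the function name `filter_detections` and the default `score_threshold` of 0.5 are assumptions, and it only handles the first image in the batch.

```python
import numpy as np


def filter_detections(detections, score_threshold=0.5):
    """Return (boxes, labels, scores) for the first image in the batch.

    `detections` is shaped (batch, num_detections, 4 + num_classes) and laid
    out as (x1, y1, x2, y2, cls1, cls2, ...).
    """
    detections = np.asarray(detections)
    boxes = detections[0, :, :4]          # (num_detections, 4)
    class_scores = detections[0, :, 4:]   # (num_detections, num_classes)

    # Best class and its score for every detection.
    labels = np.argmax(class_scores, axis=1)
    scores = class_scores[np.arange(class_scores.shape[0]), labels]

    # Keep only detections whose best score reaches the threshold.
    keep = scores >= score_threshold
    return boxes[keep], labels[keep], scores[keep]
```

The returned `boxes` can then be drawn on the input image, with `labels` indexing into the dataset's class names.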
Execution time on NVIDIA Pascal Titan X is roughly 55msec for an image of shape `1000x600x3`.
The MS COCO model can be downloaded here. Results using the `cocoapi` are shown below (note: according to the paper, this configuration should achieve a mAP of 0.34).
Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.304
Average Precision (AP) @[ IoU=0.50 | area= all | maxDets=100 ] = 0.485
Average Precision (AP) @[ IoU=0.75 | area= all | maxDets=100 ] = 0.326
Average Precision (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.136
Average Precision (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.337
Average Precision (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.427
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 1 ] = 0.274
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 10 ] = 0.412
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.423
Average Recall (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.220
Average Recall (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.472
Average Recall (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.587
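In these results, `IoU=0.50:0.95` means the metric is averaged over intersection-over-union thresholds from 0.50 to 0.95 in steps of 0.05. As a rough illustration of the underlying quantity, the following sketch computes the IoU of two boxes in `(x1, y1, x2, y2)` form. It is not the `cocoapi` implementation, and the names `iou` and `COCO_IOU_THRESHOLDS` are hypothetical.

```python
def iou(box_a, box_b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    # Width and height of the overlap, clamped to zero for disjoint boxes.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h

    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


# The ten thresholds behind "IoU=0.50:0.95": 0.50, 0.55, ..., 0.95.
COCO_IOU_THRESHOLDS = [round(0.5 + 0.05 * i, 2) for i in range(10)]
```

A detection counts as a match at a given threshold only if its IoU with a ground-truth box of the same class reaches that threshold, which is why the `IoU=0.75` row is lower than the `IoU=0.50` row.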
- The examples show how to train `keras-retinanet` on Pascal VOC and MS COCO. Example output images are shown below.
- Allow `batch_size > 1`.
- Configure CI.
- This repository requires Keras 2.0.9 or above.
- This repository is tested using OpenCV 3.3 (3.0+ should be supported).
Contributions to this project are welcome.
Feel free to join the `#keras-retinanet` Keras Slack channel for discussions and questions.