Keras implementation of RetinaNet object detection as described in this paper by Tsung-Yi Lin, Priya Goyal, Ross Girshick, Kaiming He and Piotr Dollár.
An example of how to train keras-retinanet can be found here.
For training on Pascal VOC, run:
python examples/train.py <path to VOCdevkit/VOC2007>
In general, the steps to train on your own datasets are:
- Create a model by calling keras_retinanet.models.ResNet50RetinaNet and compile it. Empirically, the following compile arguments have been found to work well:
model.compile(loss=None, optimizer=keras.optimizers.adam(lr=1e-5, clipnorm=0.001))
- Create generators for training and testing data (an example is shown in keras_retinanet.preprocessing.PascalVocIterator). These generators should generate an image batch (shaped (batch_id, height, width, channels)) and a boxes batch (shaped (batch_id, num_boxes, 5), where the last dimension holds (x1, y1, x2, y2, label)). Currently, a limitation is that batch_size must be equal to 1 and the image shape must be defined beforehand (i.e. the model does not accept input images of shape (None, None, 3)).
- Use model.fit_generator to start training.
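To make the expected batch layout concrete, here is a minimal sketch of a generator that yields random data in the shapes described above. It is not the repository's PascalVocIterator; the function name dummy_generator, its parameters, the random box sampling, and the zero-based integer labels are all assumptions made for illustration. How the two arrays are packed for model.fit_generator is not specified here, so the sketch simply yields them as a pair.

```python
import numpy as np


def dummy_generator(image_shape=(512, 512, 3), num_boxes=3, num_classes=20, seed=0):
    """Yield (image_batch, boxes_batch) pairs forever, with batch_size fixed to 1.

    Hypothetical helper for illustration only; real data would come from
    an annotated dataset instead of a random number generator.
    """
    rng = np.random.RandomState(seed)
    height, width, _ = image_shape

    while True:
        # Image batch shaped (1, height, width, channels).
        image_batch = rng.uniform(0, 255, size=(1,) + image_shape).astype(np.float32)

        # Sample two x and two y coordinates per box and sort them,
        # so that x1 <= x2 and y1 <= y2 always holds.
        xs = np.sort(rng.uniform(0, width, size=(num_boxes, 2)), axis=1)
        ys = np.sort(rng.uniform(0, height, size=(num_boxes, 2)), axis=1)

        # Assumption: labels are integer class indices starting at 0.
        labels = rng.randint(0, num_classes, size=(num_boxes, 1))

        # Boxes batch shaped (1, num_boxes, 5): (x1, y1, x2, y2, label).
        boxes = np.concatenate(
            [xs[:, :1], ys[:, :1], xs[:, 1:], ys[:, 1:], labels], axis=1
        )
        boxes_batch = boxes[np.newaxis].astype(np.float32)

        yield image_batch, boxes_batch
```

Calling next(dummy_generator()) returns one image batch and one boxes batch, which is a quick way to check shapes before wiring a real dataset into training.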
An example of testing the network can be seen in this Notebook. In general, output can be retrieved from the network as follows:
boxes, classification, reg_loss, cls_loss = model.predict_on_batch(inputs)
Here, boxes are the resulting bounding boxes, shaped (None, 4) (for (x1, y1, x2, y2)); classification holds the corresponding class scores for each box, shaped (None, num_classes); reg_loss is the regression loss value and cls_loss is the classification loss value.
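As an illustration of how the boxes and classification outputs might be consumed, the sketch below keeps only the boxes whose best class score passes a threshold. The function filter_detections and the score_threshold parameter are hypothetical and not part of keras-retinanet. Whether one of the classes represents background is not stated here, so the sketch does not treat any class specially.

```python
import numpy as np


def filter_detections(boxes, classification, score_threshold=0.5):
    """Return (boxes, labels, scores) for detections above a score threshold.

    boxes:          array shaped (N, 4) holding (x1, y1, x2, y2).
    classification: array shaped (N, num_classes) holding class scores.
    """
    # Best-scoring class index for every box.
    labels = np.argmax(classification, axis=1)

    # Score of that best class for every box.
    scores = classification[np.arange(classification.shape[0]), labels]

    # Boolean mask of the boxes to keep.
    keep = scores >= score_threshold

    return boxes[keep], labels[keep], scores[keep]
```

The threshold trades recall for precision, and a suitable value depends on the dataset and on how the model was trained.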
Execution time on an NVIDIA Pascal Titan X is roughly 35 ms for an image of shape 512x512x3.
- The examples show how to train keras-retinanet on Pascal VOC data. An example output image is shown below.
- Allow batch_size > 1.
- Compare results with those reported in the paper.
- Disable parts of the network when in test mode.
- Fix saving / loading of model (currently only saving / loading of weights works).
- This implementation currently uses the softmax activation to classify boxes. The paper mentions a sigmoid activation instead. Given the origin of parts of this code, the softmax activation method was easier to implement. A comparison between sigmoid and softmax would be interesting, but is left unexplored.
- As of writing, this repository depends on an unmerged PR of keras-resnet. For now, it can be installed by manually installing this branch.
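The difference between the softmax and sigmoid activations mentioned in the notes above can be seen on a single vector of class logits. The sketch below is plain Python and independent of this repository; the function names and the example logits are made up for illustration. Softmax produces scores that compete and sum to one across classes, while sigmoid scores each class independently.

```python
import math


def softmax(logits):
    """Normalise logits into scores that sum to 1 across classes."""
    # Subtract the maximum for numerical stability; the result is unchanged.
    largest = max(logits)
    exps = [math.exp(x - largest) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sigmoid(logits):
    """Score every class independently, each in the open interval (0, 1)."""
    return [1.0 / (1.0 + math.exp(-x)) for x in logits]


# Hypothetical logits for one box over three classes.
example_logits = [2.0, 1.0, -1.0]
softmax_scores = softmax(example_logits)
sigmoid_scores = sigmoid(example_logits)
```

With sigmoid, several classes can receive a high score at once, or all of them a low score, which is one reason the two choices may behave differently on boxes that contain no object.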
Any and all contributions to this project are welcome.