Keras implementation of RetinaNet object detection.

Jae-hyun/keras-retinanet

 
 


Keras RetinaNet

Keras implementation of RetinaNet object detection as described in this paper by Tsung-Yi Lin, Priya Goyal, Ross Girshick, Kaiming He and Piotr Dollár.

Training

An example of how to train keras-retinanet can be found here.

Usage

For training on Pascal VOC, run:

python examples/train.py <path to VOCdevkit/VOC2007>

In general, the steps to train on your own datasets are:

  1. Create a model by calling keras_retinanet.models.ResNet50RetinaNet and compile it. Empirically, the following compile arguments have been found to work well:
model.compile(loss=None, optimizer=keras.optimizers.adam(lr=1e-5, clipnorm=0.001))
  2. Create generators for training and testing data (an example is shown in keras_retinanet.preprocessing.PascalVocIterator). These generators should generate an image batch (shaped (batch_id, height, width, channels)) and a boxes batch (shaped (batch_id, num_boxes, 5), where the last dimension is for (x1, y1, x2, y2, label)). Currently, a limitation is that batch_size must be equal to 1 and the image shape must be defined beforehand (i.e. it does not accept input images of shape (None, None, 3)).
  3. Use model.fit_generator to start training.
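
To make the expected batch shapes concrete, here is a minimal sketch of a generator that yields an image batch and a boxes batch with batch size 1. It is not the repository's PascalVocIterator: the function name dummy_batch_generator, its parameters, and the random placeholder data are all assumptions made for illustration, and so is the 0-based integer label encoding. How the two arrays are passed to model.fit_generator depends on the actual iterator in the repository.

```python
import numpy as np


def dummy_batch_generator(image_shape=(512, 512, 3), num_boxes=2, num_classes=20, seed=0):
    """Yield (image_batch, boxes_batch) pairs with batch size 1, forever.

    Hypothetical stand-in for a real data iterator: the images and boxes
    are random placeholders, not Pascal VOC data.
    """
    rng = np.random.RandomState(seed)
    height, width, _ = image_shape
    while True:
        # Image batch: (1, height, width, channels).
        image_batch = rng.uniform(0, 255, size=(1,) + tuple(image_shape)).astype(np.float32)

        # Random corner coordinates, sorted so that x1 <= x2 and y1 <= y2.
        xs = np.sort(rng.uniform(0, width, size=(num_boxes, 2)), axis=1)
        ys = np.sort(rng.uniform(0, height, size=(num_boxes, 2)), axis=1)

        # Assumed label encoding: integer class ids in [0, num_classes).
        labels = rng.randint(0, num_classes, size=(num_boxes, 1)).astype(np.float32)

        # Boxes batch: (1, num_boxes, 5) with (x1, y1, x2, y2, label) on the last axis.
        boxes = np.concatenate([xs[:, :1], ys[:, :1], xs[:, 1:], ys[:, 1:], labels], axis=1)
        boxes_batch = boxes[np.newaxis].astype(np.float32)

        yield image_batch, boxes_batch
```

Calling next(dummy_batch_generator()) returns one pair of arrays shaped (1, 512, 512, 3) and (1, 2, 5), which is a quick way to check a real generator against the same layout.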

Testing

An example of testing the network can be seen in this Notebook. In general, output can be retrieved from the network as follows:

boxes, classification, reg_loss, cls_loss = model.predict_on_batch(inputs)

Here, boxes holds the resulting bounding boxes, shaped (None, 4) (for (x1, y1, x2, y2)); classification holds the corresponding class scores for each box, shaped (None, num_classes); reg_loss is the regression loss value; and cls_loss is the classification loss value.
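
As an illustration of how these outputs might be consumed, the sketch below keeps only the boxes whose best class score reaches a threshold. The helper filter_detections and its score_threshold parameter are hypothetical and not part of keras-retinanet. It performs no non-maximum suppression, and it assumes every column of classification is a real object class, which may not hold if one column represents background.

```python
import numpy as np


def filter_detections(boxes, classification, score_threshold=0.5):
    """Keep boxes whose best class score is at least score_threshold.

    boxes:          array of shape (N, 4) holding (x1, y1, x2, y2).
    classification: array of shape (N, num_classes) holding class scores.
    Returns (boxes, labels, scores) for the kept detections.
    """
    boxes = np.asarray(boxes)
    classification = np.asarray(classification)

    labels = classification.argmax(axis=1)  # index of the best class per box
    scores = classification.max(axis=1)     # score of that best class
    keep = scores >= score_threshold        # boolean mask over the N boxes

    return boxes[keep], labels[keep], scores[keep]
```

With the outputs of model.predict_on_batch in hand, this would be called as filter_detections(boxes, classification, score_threshold=0.5).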

Execution time on an NVIDIA Pascal Titan X is roughly 35 ms for an image of shape 512x512x3.
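
A figure like this can be reproduced on other hardware with a small timing helper. The sketch below is one possible way to measure it, not the method used to obtain the number above. The name mean_runtime_ms and its parameters are assumptions for illustration.

```python
import time


def mean_runtime_ms(fn, n_warmup=2, n_runs=10):
    """Return the mean wall-clock time of fn() in milliseconds."""
    for _ in range(n_warmup):
        fn()  # warm-up calls are excluded from the measurement

    start = time.perf_counter()
    for _ in range(n_runs):
        fn()
    elapsed = time.perf_counter() - start

    return 1000.0 * elapsed / n_runs
```

For the network, the call would look like mean_runtime_ms(lambda: model.predict_on_batch(inputs)). The warm-up runs matter because the first calls typically include one-off setup costs.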

Status

  • The examples show how to train keras-retinanet on Pascal VOC data. An example output image is shown below.

Example result of RetinaNet on Pascal VOC

To-do

  • Allow batch_size > 1.
  • Compare results with those reported in the paper.
  • Disable parts of the network when in test mode.
  • Fix saving / loading of model (currently only saving / loading of weights works).

Notes

  • This implementation currently uses the softmax activation to classify boxes. The paper mentions a sigmoid activation instead. Given the origin of parts of this code, the softmax activation method was easier to implement. A comparison between sigmoid and softmax would be interesting, but is left unexplored.
  • As of this writing, this repository depends on an unmerged PR of keras-resnet. For now, it can be installed by manually installing this branch.
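
To illustrate the difference between the two activations mentioned in the first note, the sketch below applies both to the same vector of per-class logits. It is a plain-Python illustration, not code from this repository. Softmax normalizes the scores so they sum to 1 across classes, while sigmoid scores each class independently.

```python
import math


def softmax(logits):
    """Softmax over a list of logits; the outputs sum to 1."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sigmoid(logits):
    """Element-wise sigmoid; each output is an independent score in (0, 1)."""
    out = []
    for x in logits:
        if x >= 0:
            out.append(1.0 / (1.0 + math.exp(-x)))
        else:
            e = math.exp(x)  # avoids overflow for large negative x
            out.append(e / (1.0 + e))
    return out


logits = [2.0, 0.0, -1.0]
softmax_scores = softmax(logits)  # competes across classes
sigmoid_scores = sigmoid(logits)  # one score per class, no competition
```

With sigmoid, a box can score low for every class at once, so no separate background class is needed. With softmax, the scores always sum to 1.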

Any and all contributions to this project are welcome.
