Keras RetinaNet

Keras implementation of RetinaNet object detection as described in this paper by Tsung-Yi Lin, Priya Goyal, Ross Girshick, Kaiming He and Piotr Dollár.

Training

An example of how to train keras-retinanet can be found here.

Usage

For training on Pascal VOC, run:

python examples/train.py <path to VOCdevkit/VOC2007>

In general, the steps to train on your own datasets are:

Create a model by calling keras_retinanet.models.ResNet50RetinaNet and compile it. Empirically, the following compile arguments have been found to work well:

model.compile(loss=None, optimizer=keras.optimizers.adam(lr=1e-5, clipnorm=0.001))

Create generators for training and testing data (an example is shown in keras_retinanet.preprocessing.PascalVocIterator). These generators should generate an image batch (shaped (batch_id, height, width, channels)) and a boxes batch (shaped (batch_id, num_boxes, 5), where the last dimension is for (x1, y1, x2, y2, label)). A current limitation is that batch_size must be equal to 1 and the image shape must be defined beforehand (i.e. the model does not accept input images of shape (None, None, 3)).
Use model.fit_generator to start training.
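
As a rough illustration of the batch shapes described in the steps above, the following NumPy-only sketch yields random data with batch_size fixed to 1. The name dummy_batch_generator and all of its parameters are hypothetical and not part of keras-retinanet, and how the two arrays are handed to model.fit_generator is not covered here; keras_retinanet.preprocessing.PascalVocIterator remains the reference for a real data source.

```python
import numpy as np


def dummy_batch_generator(height=512, width=512, num_boxes=3, num_classes=20, seed=0):
    """Yield (image_batch, boxes_batch) pairs with batch_size fixed to 1.

    Hypothetical helper for illustration only; it produces random data.
    """
    rng = np.random.RandomState(seed)
    while True:
        # Image batch shaped (1, height, width, 3).
        image_batch = rng.uniform(0, 255, size=(1, height, width, 3)).astype(np.float32)

        # Random corner coordinates, sorted so that x1 <= x2 and y1 <= y2.
        xs = np.sort(rng.uniform(0, width, size=(num_boxes, 2)), axis=1)
        ys = np.sort(rng.uniform(0, height, size=(num_boxes, 2)), axis=1)
        labels = rng.randint(0, num_classes, size=(num_boxes, 1)).astype(np.float32)

        # Boxes batch shaped (1, num_boxes, 5), ordered as (x1, y1, x2, y2, label).
        boxes = np.concatenate([xs[:, :1], ys[:, :1], xs[:, 1:], ys[:, 1:], labels], axis=1)
        boxes_batch = boxes[np.newaxis].astype(np.float32)

        yield image_batch, boxes_batch
```

Because the image shape has to be known beforehand, every batch from such a generator should use the same height and width that the model was created with.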

Testing

An example of testing the network can be seen in this Notebook. In general, output can be retrieved from the network as follows:

boxes, classification, reg_loss, cls_loss = model.predict_on_batch(inputs)

Here, boxes holds the resulting bounding boxes, shaped (None, 4) (for (x1, y1, x2, y2)), and classification holds the corresponding class scores for each box, shaped (None, num_classes). reg_loss is the regression loss value and cls_loss is the classification loss value.
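
One possible way to turn these two arrays into a short list of detections is sketched below with plain NumPy. The function filter_detections and its score_threshold parameter are hypothetical and not part of keras-retinanet; the sketch also assumes every column of classification is a real object class, which may not hold if one column represents background. The arrays at the bottom are synthetic stand-ins for the output of model.predict_on_batch.

```python
import numpy as np


def filter_detections(boxes, classification, score_threshold=0.5):
    """Keep boxes whose best class score reaches score_threshold.

    boxes is shaped (N, 4) and classification is shaped (N, num_classes).
    Returns the kept boxes, labels and scores, sorted by descending score.
    """
    # Best class and its score for every box.
    labels = np.argmax(classification, axis=1)
    scores = classification[np.arange(classification.shape[0]), labels]

    # Drop low-scoring boxes, then sort the rest from high to low score.
    keep = scores >= score_threshold
    order = np.argsort(-scores[keep])
    return boxes[keep][order], labels[keep][order], scores[keep][order]


# Synthetic stand-ins for the network output (three boxes, two classes).
boxes = np.array([[10, 10, 50, 50], [20, 30, 80, 90], [0, 0, 5, 5]], dtype=np.float32)
classification = np.array([[0.1, 0.9], [0.3, 0.7], [0.6, 0.4]], dtype=np.float32)

kept_boxes, kept_labels, kept_scores = filter_detections(
    boxes, classification, score_threshold=0.65
)
```

This sketch does not apply non-maximum suppression, so overlapping boxes for the same object would all be kept.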

Execution time on an NVIDIA Pascal Titan X is roughly 35 ms for an image of shape 512x512x3.

Status

The examples show how to train keras-retinanet on Pascal VOC data. An example output image is shown below.

Todo's

Allow batch_size > 1.
Compare results with those reported in the paper.
Disable parts of the network when in test mode.
Fix saving / loading of model (currently only saving / loading of weights works).

Notes

This implementation currently uses a softmax activation to classify boxes, whereas the paper describes a sigmoid activation. Given the origin of parts of this code, the softmax activation was easier to implement. A comparison between sigmoid and softmax would be interesting, but is left unexplored.
At the time of writing, this repository depends on an unmerged PR of keras-resnet. For now, the dependency can be satisfied by manually installing this branch.
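
To make the softmax versus sigmoid note above more concrete, the following NumPy sketch applies both activations to the same made-up logits. It is only an illustration of how the two activations behave and does not reproduce the classification head of this repository; the logits are invented for the example.

```python
import numpy as np


def softmax(logits):
    """Softmax over the last axis; scores for one box sum to 1."""
    shifted = logits - logits.max(axis=-1, keepdims=True)  # for numerical stability
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)


def sigmoid(logits):
    """Element-wise sigmoid; each class is scored independently."""
    return 1.0 / (1.0 + np.exp(-logits))


# Made-up logits for two boxes and three classes.
# The second box has low logits for every class.
logits = np.array([[2.0, 1.0, -1.0],
                   [-3.0, -2.5, -4.0]])

softmax_scores = softmax(logits)
sigmoid_scores = sigmoid(logits)
```

For the second box, softmax still assigns more than half of the probability mass to one class because the scores must sum to 1, while sigmoid keeps every class score low. This is one practical difference that such a comparison would need to account for.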

Any and all contributions to this project are welcome.
