beijinggao/faster_rcnn

Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks

By Shaoqing Ren, Kaiming He, Ross Girshick, Jian Sun at Microsoft Research

Introduction

Faster R-CNN is an object detection framework based on deep convolutional networks. It consists of a Region Proposal Network (RPN) and an Object Detection Network. The two networks are trained to share convolutional layers, which keeps testing fast.

Faster R-CNN was initially described in an arXiv tech report.

This repo contains a MATLAB re-implementation of Fast R-CNN. Details about Fast R-CNN are in: rbgirshick/fast-rcnn.

This code has been tested on Windows 7/8 64-bit, Windows Server 2012 R2, and Linux, with MATLAB 2014a.

License

Faster R-CNN is released under the MIT License (refer to the LICENSE file for details).

Citing Faster R-CNN

If you find Faster R-CNN useful in your research, please consider citing:

@article{ren15fasterrcnn,
    Author = {Shaoqing Ren and Kaiming He and Ross Girshick and Jian Sun},
    Title = {{Faster R-CNN}: Towards Real-Time Object Detection with Region Proposal Networks},
    Journal = {arXiv preprint arXiv:1506.01497},
    Year = {2015}
}

Main results

model               | training data                          | test data     | mAP   | time/img
------------------- |:--------------------------------------:|:-------------:|:-----:|:--------:
Faster RCNN, VGG-16 | VOC 2007 trainval                      | VOC 2007 test | 69.9% | 198ms
Faster RCNN, VGG-16 | VOC 2007 trainval + 2012 trainval      | VOC 2007 test | 73.2% | 198ms
Faster RCNN, VGG-16 | VOC 2012 trainval                      | VOC 2012 test | 67.0% | 198ms
Faster RCNN, VGG-16 | VOC 2007 trainval&test + 2012 trainval | VOC 2012 test | 70.4% | 198ms

Note: The mAP results are subject to random variation. We ran ZF net five times independently; the mAPs were 59.9 (as in the paper), 60.4, 59.5, 60.1, and 59.5, with a mean of 59.88 and a standard deviation of 0.39.
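
The mean and standard deviation in the note can be reproduced from the five mAP values. The Python sketch below is only an illustration and is not part of this repository. It assumes the reported 0.39 is the sample standard deviation (n - 1 in the denominator), which is what the numbers are consistent with; the note itself does not say which estimator was used.

```python
import statistics

# mAP (%) of the five independent ZF-net runs listed in the note.
zf_maps = [59.9, 60.4, 59.5, 60.1, 59.5]

mean_map = statistics.mean(zf_maps)
# Assumption: sample standard deviation (n - 1 in the denominator).
std_map = statistics.stdev(zf_maps)

print(f"mean = {mean_map:.2f}, std = {std_map:.2f}")  # mean = 59.88, std = 0.39
```

With the population estimator (statistics.pstdev) the same data gives about 0.35, so the sample estimator is the one that matches the reported figure.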

Contents

  1. Requirements: software
  2. Requirements: hardware
  3. Preparation for Testing
  4. Testing Demo
  5. Preparation for Training
  6. Training
  7. Resources

Requirements: software

  1. Caffe build for Faster R-CNN (included in this repository, see external/caffe)
    • If you are using Windows, you may download a compiled mex file by running fetch_data/fetch_caffe_mex_windows_vs2013_cuda65.m
    • If you are using Linux or you want to compile for Windows, please follow the instructions on our Caffe branch.
  2. MATLAB

Requirements: hardware

GPU: Titan, Titan Black, Titan X, K20, K40, K80.

  1. Region Proposal Network (RPN)
    • 2GB GPU memory for ZF net
    • 5GB GPU memory for VGG-16 net
  2. Object Detection Network (Fast R-CNN)
    • 3GB GPU memory for ZF net
    • 8GB GPU memory for VGG-16 net
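
The memory figures above can be read as a small lookup table. The Python sketch below is illustrative only: the helper name stages_that_fit is hypothetical, it compares a given memory budget against the numbers listed above, and it does not query a real GPU or reflect any check done by this repository.

```python
from typing import List

# GPU memory requirements in GB, copied from the list above.
REQUIRED_GB = {
    ("RPN", "ZF"): 2,
    ("RPN", "VGG-16"): 5,
    ("Fast R-CNN", "ZF"): 3,
    ("Fast R-CNN", "VGG-16"): 8,
}


def stages_that_fit(available_gb: float, net: str) -> List[str]:
    """Return the stages whose stated requirement fits in available_gb.

    Hypothetical helper: it only compares against the figures above.
    """
    return [
        stage
        for (stage, model), need in REQUIRED_GB.items()
        if model == net and need <= available_gb
    ]


# A 6GB card meets the stated RPN requirement for VGG-16 but not the
# 8GB stated for the Fast R-CNN stage.
print(stages_that_fit(6, "VGG-16"))  # ['RPN']
print(stages_that_fit(6, "ZF"))      # ['RPN', 'Fast R-CNN']
```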

Preparation for Testing:

  1. Run fetch_data/fetch_caffe_mex_windows_vs2013_cuda65.m to download a compiled Caffe mex (for Windows only).
  2. Run faster_rcnn_build.m
  3. Run startup.m

Testing Demo:

  1. Run fetch_data/fetch_faster_rcnn_final_model.m to download our trained models.
  2. Run experiments/script_faster_rcnn_demo.m to test a single demo image.
    • You will see timing information like the output below. On a K40 @ 875 MHz with an Intel Xeon CPU E5-2650 v2 @ 2.60GHz, the demo images take the following times with VGG-16:
    001763.jpg (500x375): time 0.201s (resize+conv+proposal: 0.150s, nms+regionwise: 0.052s)
    004545.jpg (500x375): time 0.201s (resize+conv+proposal: 0.151s, nms+regionwise: 0.050s)
    000542.jpg (500x375): time 0.192s (resize+conv+proposal: 0.151s, nms+regionwise: 0.041s)
    000456.jpg (500x375): time 0.202s (resize+conv+proposal: 0.152s, nms+regionwise: 0.050s)
    001150.jpg (500x375): time 0.194s (resize+conv+proposal: 0.151s, nms+regionwise: 0.043s)
    mean time: 0.198s
    and with ZF net:
    001763.jpg (500x375): time 0.061s (resize+conv+proposal: 0.032s, nms+regionwise: 0.029s)
    004545.jpg (500x375): time 0.063s (resize+conv+proposal: 0.034s, nms+regionwise: 0.029s)
    000542.jpg (500x375): time 0.052s (resize+conv+proposal: 0.034s, nms+regionwise: 0.018s)
    000456.jpg (500x375): time 0.062s (resize+conv+proposal: 0.034s, nms+regionwise: 0.028s)
    001150.jpg (500x375): time 0.058s (resize+conv+proposal: 0.034s, nms+regionwise: 0.023s)
    mean time: 0.059s
    • The visual results might be different from those in the paper due to numerical variations.

    • Running time on other GPUs

    GPU / mean time | VGG-16 | ZF
    :--------------:|:------:|:----:
    K40             | 198ms  | 59ms
    Titan Black     | 174ms  | 56ms
    Titan X         | 151ms  | 59ms
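
The per-image timings printed by the demo can be summarized with a few lines of Python. The sketch below is not part of this repository. It assumes the log lines have exactly the format shown in the demo output above, and the regular expression and the helper name parse_times are hypothetical.

```python
import re

# Demo log lines for VGG-16 on K40, copied from the output above.
LOG = """\
001763.jpg (500x375): time 0.201s (resize+conv+proposal: 0.150s, nms+regionwise: 0.052s)
004545.jpg (500x375): time 0.201s (resize+conv+proposal: 0.151s, nms+regionwise: 0.050s)
000542.jpg (500x375): time 0.192s (resize+conv+proposal: 0.151s, nms+regionwise: 0.041s)
000456.jpg (500x375): time 0.202s (resize+conv+proposal: 0.152s, nms+regionwise: 0.050s)
001150.jpg (500x375): time 0.194s (resize+conv+proposal: 0.151s, nms+regionwise: 0.043s)
"""

# Assumed line format: "<image> (<w>x<h>): time <t>s (resize+conv+proposal: <a>s, nms+regionwise: <b>s)"
LINE_RE = re.compile(
    r"(?P<image>\S+) \(\d+x\d+\): time (?P<total>[\d.]+)s "
    r"\(resize\+conv\+proposal: (?P<proposal>[\d.]+)s, "
    r"nms\+regionwise: (?P<detect>[\d.]+)s\)"
)


def parse_times(log):
    """Return (image, total, proposal, detect) tuples, times in seconds."""
    rows = []
    for line in log.splitlines():
        m = LINE_RE.search(line)
        if m:
            rows.append(
                (m["image"], float(m["total"]), float(m["proposal"]), float(m["detect"]))
            )
    return rows


rows = parse_times(LOG)
mean_total = sum(r[1] for r in rows) / len(rows)
proposal_share = sum(r[2] for r in rows) / sum(r[1] for r in rows)

print(f"mean time: {mean_total:.3f}s")  # mean time: 0.198s
print(f"share spent in resize+conv+proposal: {proposal_share:.0%}")
```

For these five images, roughly three quarters of the time goes to the resize+conv+proposal stage, which is where the convolutional layers run.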

Preparation for Training:

  1. Run fetch_data/fetch_model_ZF.m to download an ImageNet-pre-trained ZF net.
  2. Run fetch_data/fetch_model_VGG16.m to download an ImageNet-pre-trained VGG-16 net.
  3. Download VOC 2007 and 2012 data to ./datasets

Training:

  1. Run experiments/script_faster_rcnn_VOC2007_ZF.m to train a model with ZF net. It runs four steps as follows:
    • Train RPN with conv layers tuned; compute RPN results on the train/test sets.
    • Train Fast R-CNN with conv layers tuned using step-1 RPN proposals; evaluate detection mAP.
    • Train RPN with conv layers fixed; compute RPN results on the train/test sets.
    • Train Fast R-CNN with conv layers fixed using step-3 RPN proposals; evaluate detection mAP.
    • Note: the entire training time is ~12 hours on K40.
  2. Run experiments/script_faster_rcnn_VOC2007_VGG16.m to train a model with VGG net.
    • Note: the entire training time is ~2 days on K40.
  3. Check other scripts in ./experiments for more settings.
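
The four training steps listed under item 1 can be written down as a small schedule. The Python sketch below only restates the list and does not run any training. The Stage class, SCHEDULE, and describe are hypothetical names for illustration and are not taken from the MATLAB scripts.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Stage:
    step: int
    network: str                   # "RPN" or "Fast R-CNN"
    conv_layers: str               # "tuned" or "fixed"
    proposals_from: Optional[int]  # step whose RPN proposals are consumed


# The four steps, in the order given in the list above.
SCHEDULE = [
    Stage(1, "RPN", "tuned", None),
    Stage(2, "Fast R-CNN", "tuned", 1),
    Stage(3, "RPN", "fixed", None),
    Stage(4, "Fast R-CNN", "fixed", 3),
]


def describe(stage: Stage) -> str:
    """Render one stage as a sentence matching the list above."""
    text = f"Step {stage.step}: train {stage.network} with conv layers {stage.conv_layers}"
    if stage.proposals_from is not None:
        text += f", using step-{stage.proposals_from} RPN proposals"
    return text


for stage in SCHEDULE:
    print(describe(stage))
```

Writing the schedule this way makes the dependency explicit: each Fast R-CNN step consumes the proposals of the RPN step immediately before it.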

Resources

  1. Experiment logs: OneDrive, DropBox, BaiduYun
  2. Region proposals of our trained RPN:

If the automatic "fetch_data" fails, you may manually download resources from:

  1. Pre-compiled Caffe mex:
  2. ImageNet-pretrained networks:
  3. Final RPN+FastRCNN models: OneDrive, DropBox, BaiduYun
