By Shaoqing Ren, Kaiming He, Ross Girshick, Jian Sun at Microsoft Research
Faster R-CNN is an object detection framework based on deep convolutional networks. It consists of a Region Proposal Network (RPN) and an Object Detection Network. The two networks are trained to share convolutional layers, which keeps testing fast.
Faster R-CNN was initially described in an arXiv tech report.
This repo contains a MATLAB re-implementation of Fast R-CNN. Details about Fast R-CNN are in: rbgirshick/fast-rcnn.
This code has been tested on Windows 7/8 64-bit, Windows Server 2012 R2, and Linux, with MATLAB 2014a.
Faster R-CNN is released under the MIT License (refer to the LICENSE file for details).
If you find Faster R-CNN useful in your research, please consider citing:
@article{ren15fasterrcnn,
Author = {Shaoqing Ren and Kaiming He and Ross Girshick and Jian Sun},
Title = {{Faster R-CNN}: Towards Real-Time Object Detection with Region Proposal Networks},
Journal = {arXiv preprint arXiv:1506.01497},
Year = {2015}
}
| | training data | test data | mAP | time/img |
|:------------------------- |:--------------------------------------:|:--------------------:|:-----:|:-----:|
| Faster RCNN, VGG-16 | VOC 2007 trainval | VOC 2007 test | 69.9% | 198ms |
| Faster RCNN, VGG-16 | VOC 2007 trainval + 2012 trainval | VOC 2007 test | 73.2% | 198ms |
| Faster RCNN, VGG-16 | VOC 2012 trainval | VOC 2012 test | 67.0% | 198ms |
| Faster RCNN, VGG-16 | VOC 2007 trainval&test + 2012 trainval | VOC 2012 test | 70.4% | 198ms |
Note: The mAP results are subject to random variations. We ran 5 independent trials with ZF net, and the mAPs are 59.9 (as in the paper), 60.4, 59.5, 60.1, and 59.5, with a mean of 59.88 and a standard deviation of 0.39.
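The summary statistics in the note above can be reproduced with a few lines of Python. This is only an illustrative check, not part of the repository; the variable names are made up for the example.

```python
import statistics

# mAP values (%) of the five independent ZF-net runs quoted in the note above.
zf_map_runs = [59.9, 60.4, 59.5, 60.1, 59.5]

mean_map = statistics.mean(zf_map_runs)
# statistics.stdev is the sample standard deviation (divides by n - 1).
std_map = statistics.stdev(zf_map_runs)

print(f"mean = {mean_map:.2f}, std = {std_map:.2f}")
# prints: mean = 59.88, std = 0.39
```

The quoted 0.39 matches the sample standard deviation; the population form (`statistics.pstdev`) would give about 0.35 for the same five numbers.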
- Requirements: software
- Requirements: hardware
- Preparation for Testing
- Testing Demo
- Preparation for Training
- Training
- Resources
- `Caffe` build for Faster R-CNN (included in this repository, see `external/caffe`)
    - If you are using Windows, you may download a compiled mex file by running `fetch_data/fetch_caffe_mex_windows_vs2013_cuda65.m`
    - If you are using Linux or you want to compile for Windows, please follow the instructions on our Caffe branch.
- MATLAB
GPU: Titan, Titan Black, Titan X, K20, K40, K80.
- Region Proposal Network (RPN)
    - 2GB GPU memory for ZF net
    - 5GB GPU memory for VGG-16 net
- Object Detection Network (Fast R-CNN)
    - 3GB GPU memory for ZF net
    - 8GB GPU memory for VGG-16 net
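As a rough aid, the memory figures above can be put into a small lookup to check whether a given card is likely to be large enough. This is a minimal sketch, not part of the repository: the names `GPU_MEMORY_GB`, `required_memory_gb`, and `fits` are hypothetical, and it assumes that the larger of the two per-network figures is the binding requirement.

```python
# Minimum GPU memory (GB) per network and backbone, copied from the list above.
GPU_MEMORY_GB = {
    "RPN": {"ZF": 2, "VGG-16": 5},
    "Fast R-CNN": {"ZF": 3, "VGG-16": 8},
}


def required_memory_gb(backbone):
    """Largest listed requirement for `backbone`.

    Assumption: the bigger of the RPN and Fast R-CNN figures is what
    the GPU has to satisfy.
    """
    return max(per_net[backbone] for per_net in GPU_MEMORY_GB.values())


def fits(backbone, available_gb):
    """True if `available_gb` meets the listed requirement for `backbone`."""
    return available_gb >= required_memory_gb(backbone)


for backbone in ("ZF", "VGG-16"):
    print(f"{backbone}: needs at least {required_memory_gb(backbone)} GB")
```

For example, `fits("ZF", 6)` is true under this assumption, while `fits("VGG-16", 6)` is false.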
- Run `fetch_data/fetch_caffe_mex_windows_vs2013_cuda65.m` to download a compiled Caffe mex (for Windows only).
- Run `faster_rcnn_build.m`
- Run `startup.m`
- Run `fetch_data/fetch_faster_rcnn_final_model.m` to download our trained models.
- Run `experiments/script_faster_rcnn_demo.m` to test a single demo image.
    - You will see timing information like that shown below. We get the following running time on K40 @ 875 MHz and Intel Xeon CPU E5-2650 v2 @ 2.60GHz for the demo images with VGG-16:

          001763.jpg (500x375): time 0.201s (resize+conv+proposal: 0.150s, nms+regionwise: 0.052s)
          004545.jpg (500x375): time 0.201s (resize+conv+proposal: 0.151s, nms+regionwise: 0.050s)
          000542.jpg (500x375): time 0.192s (resize+conv+proposal: 0.151s, nms+regionwise: 0.041s)
          000456.jpg (500x375): time 0.202s (resize+conv+proposal: 0.152s, nms+regionwise: 0.050s)
          001150.jpg (500x375): time 0.194s (resize+conv+proposal: 0.151s, nms+regionwise: 0.043s)
          mean time: 0.198s

      and with ZF net:

          001763.jpg (500x375): time 0.061s (resize+conv+proposal: 0.032s, nms+regionwise: 0.029s)
          004545.jpg (500x375): time 0.063s (resize+conv+proposal: 0.034s, nms+regionwise: 0.029s)
          000542.jpg (500x375): time 0.052s (resize+conv+proposal: 0.034s, nms+regionwise: 0.018s)
          000456.jpg (500x375): time 0.062s (resize+conv+proposal: 0.034s, nms+regionwise: 0.028s)
          001150.jpg (500x375): time 0.058s (resize+conv+proposal: 0.034s, nms+regionwise: 0.023s)
          mean time: 0.059s

    - The visual results might be different from those in the paper due to numerical variations.
    - Running time on other GPUs

      | GPU / mean time | VGG-16 | ZF |
      |:---------------:|:------:|:--:|
      | Titan Black | 174ms | 56ms |
      | Titan X | 151ms | 59ms |
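To compare your own demo run against the K40 figures above, the timing lines can be parsed and averaged. The sketch below is not part of the repository: it assumes the log format is exactly as printed above, and `LINE_RE` and `parse_timings` are hypothetical names introduced for the example.

```python
import re

# VGG-16 demo timings, copied from the log above.
vgg16_log = """\
001763.jpg (500x375): time 0.201s (resize+conv+proposal: 0.150s, nms+regionwise: 0.052s)
004545.jpg (500x375): time 0.201s (resize+conv+proposal: 0.151s, nms+regionwise: 0.050s)
000542.jpg (500x375): time 0.192s (resize+conv+proposal: 0.151s, nms+regionwise: 0.041s)
000456.jpg (500x375): time 0.202s (resize+conv+proposal: 0.152s, nms+regionwise: 0.050s)
001150.jpg (500x375): time 0.194s (resize+conv+proposal: 0.151s, nms+regionwise: 0.043s)
"""

# One per-image line: name, size, total time, then the two stage times.
LINE_RE = re.compile(
    r"^(?P<image>\S+) \(\d+x\d+\): time (?P<total>[\d.]+)s "
    r"\(resize\+conv\+proposal: (?P<proposal>[\d.]+)s, "
    r"nms\+regionwise: (?P<detect>[\d.]+)s\)$"
)


def parse_timings(log):
    """Return (image, total, proposal, detect) tuples for matching lines."""
    rows = []
    for line in log.splitlines():
        m = LINE_RE.match(line.strip())
        if m:  # skip anything that is not a per-image timing line
            rows.append((m["image"], float(m["total"]),
                         float(m["proposal"]), float(m["detect"])))
    return rows


rows = parse_timings(vgg16_log)
mean_total = sum(r[1] for r in rows) / len(rows)
mean_proposal = sum(r[2] for r in rows) / len(rows)

print(f"mean time: {mean_total:.3f}s")  # prints: mean time: 0.198s
print(f"mean resize+conv+proposal: {mean_proposal:.3f}s")
```

The computed mean agrees with the `mean time: 0.198s` line reported for VGG-16.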
- Run `fetch_data/fetch_model_ZF.m` to download an ImageNet-pre-trained ZF net.
- Run `fetch_data/fetch_model_VGG16.m` to download an ImageNet-pre-trained VGG-16 net.
- Download VOC 2007 and 2012 data to ./datasets
- Run `experiments/script_faster_rcnn_VOC2007_ZF.m` to train a model with ZF net. It runs four steps as follows:
    - Train RPN with conv layers tuned; compute RPN results on the train/test sets.
    - Train Fast R-CNN with conv layers tuned using step-1 RPN proposals; evaluate detection mAP.
    - Train RPN with conv layers fixed; compute RPN results on the train/test sets.
    - Train Fast R-CNN with conv layers fixed using step-3 RPN proposals; evaluate detection mAP.
    - Note: the entire training time is ~12 hours on K40.
- Run `experiments/script_faster_rcnn_VOC2007_VGG16.m` to train a model with VGG net.
    - Note: the entire training time is ~2 days on K40.
- Check other scripts in `./experiments` for more settings.
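The four training steps listed above for the ZF script follow a fixed dependency pattern: each Fast R-CNN step consumes proposals from the RPN step just before it. The sketch below only encodes that ordering as data and checks it. It is not the training script's actual code, and `Stage`, `SCHEDULE`, and `check_schedule` are hypothetical names used for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Stage:
    step: int
    network: str                   # "RPN" or "Fast R-CNN"
    conv_layers: str               # "tuned" or "fixed"
    proposals_from: Optional[int]  # RPN step supplying proposals, None for RPN
    output: str


# The four steps, transcribed from the list above.
SCHEDULE = [
    Stage(1, "RPN", "tuned", None, "RPN results on the train/test sets"),
    Stage(2, "Fast R-CNN", "tuned", 1, "detection mAP"),
    Stage(3, "RPN", "fixed", None, "RPN results on the train/test sets"),
    Stage(4, "Fast R-CNN", "fixed", 3, "detection mAP"),
]


def check_schedule(schedule):
    """Check that each Fast R-CNN stage uses proposals from an earlier RPN stage."""
    finished_rpn = set()
    for stage in schedule:
        if stage.network == "RPN":
            finished_rpn.add(stage.step)
        elif stage.proposals_from not in finished_rpn:
            raise ValueError(f"step {stage.step} needs proposals from "
                             f"step {stage.proposals_from}, which has not run")
    return True


check_schedule(SCHEDULE)
for s in SCHEDULE:
    source = f", proposals from step {s.proposals_from}" if s.proposals_from else ""
    print(f"step {s.step}: train {s.network} (conv layers {s.conv_layers}{source})")
```

Reordering the list so that a Fast R-CNN stage comes before its RPN stage makes `check_schedule` raise a `ValueError`, which mirrors why the steps must run in the listed order.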
If the automatic "fetch_data" fails, you may manually download resources from: