Feature Pyramid Network on caffe
This is an unofficial Caffe implementation of Feature Pyramid Networks for Object Detection: https://arxiv.org/abs/1612.03144
The FPN (ResNet50) end2end result below was obtained without OHEM, training on Pascal VOC 2007 + 2012 and testing on VOC 2007.
mAP@0.5 | aeroplane | bicycle | bird | boat | bottle | bus | car | cat | chair | cow |
---|---|---|---|---|---|---|---|---|---|---|
0.7833 | 0.8585 | 0.8001 | 0.7970 | 0.7174 | 0.6522 | 0.8668 | 0.8768 | 0.8929 | 0.5842 | 0.8658 |
diningtable | dog | horse | motorbike | person | pottedplant | sheep | sofa | train | tv |
---|---|---|---|---|---|---|---|---|---|
0.7022 | 0.8891 | 0.8680 | 0.7991 | 0.7944 | 0.5065 | 0.7896 | 0.7707 | 0.8697 | 0.7653 |
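As a quick sanity check on the two tables, the overall score in the first column should be the plain mean of the 20 per-class APs. A minimal sketch, with the numbers copied from the tables above (the variable names are only illustrative):

```python
# Per-class AP values copied from the two result tables above.
per_class_ap = {
    "aeroplane": 0.8585, "bicycle": 0.8001, "bird": 0.7970, "boat": 0.7174,
    "bottle": 0.6522, "bus": 0.8668, "car": 0.8768, "cat": 0.8929,
    "chair": 0.5842, "cow": 0.8658, "diningtable": 0.7022, "dog": 0.8891,
    "horse": 0.8680, "motorbike": 0.7991, "person": 0.7944,
    "pottedplant": 0.5065, "sheep": 0.7896, "sofa": 0.7707,
    "train": 0.8697, "tv": 0.7653,
}

# Mean AP is the unweighted average over the 20 VOC classes.
mean_ap = sum(per_class_ap.values()) / len(per_class_ap)
print(round(mean_ap, 4))  # 0.7833
```

The rounded mean matches the 0.7833 reported in the first column.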
The red and yellow parts (in the network figure) are shared parameters.
In the paper, the anchor setting is ratios: [0.5,1,2], scales: [8].
With this setting and P2~P6, the anchor sizes are [32,64,128,256,512]. This suits the COCO dataset, which has many small targets,
but the VOC dataset has many targets of size [128,256,512].
So we design the anchor setting as ratios: [0.5,1,2], scales: [8,16].
This is very important for the VOC dataset.
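The anchor sizes follow from multiplying each scale by the stride of the pyramid level. The sketch below illustrates that arithmetic only; it is not the repository's anchor-generation code. It assumes strides of 4, 8, 16, 32 and 64 for P2~P6 (as in the paper), and `STRIDES` and `anchor_sizes` are hypothetical names.

```python
# Assumed strides of P2..P6 relative to the input image (taken from the paper,
# not read from this repository's prototxt).
STRIDES = {"P2": 4, "P3": 8, "P4": 16, "P5": 32, "P6": 64}


def anchor_sizes(scales, strides=STRIDES):
    """Return the anchor side lengths (in pixels) for each pyramid level."""
    return {level: [stride * s for s in scales] for level, stride in strides.items()}


# Paper setting: one scale per level.
print(anchor_sizes([8]))
# {'P2': [32], 'P3': [64], 'P4': [128], 'P5': [256], 'P6': [512]}

# Setting used here for VOC: two scales per level, adding larger anchors.
print(anchor_sizes([8, 16]))
# {'P2': [32, 64], 'P3': [64, 128], 'P4': [128, 256], 'P5': [256, 512], 'P6': [512, 1024]}
```

With scales [8,16], every level from P3 upward gains an anchor in the 128 to 1024 range, which is where most VOC targets fall. Each size is then combined with the three ratios [0.5,1,2].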
Download the VOC07 and VOC12 datasets and ResNet50.caffemodel,
then rename the model to ResNet50.v2.caffemodel and copy it into place:
cp ResNet50.v2.caffemodel data/pretrained_model/
- OneDrive download: link
In my experiments, the code requires ~10G of GPU memory in training and ~6G in testing. You can choose a suitable image size, minibatch size and RCNN batch size for your GPUs.
cd caffe-fpn
mkdir build
cd build
cmake ..
make -j16 all
cd lib
make
./experiments/scripts/FP_Net_end2end.sh 1 FPN pascal_voc
./test.sh 1 FPN pascal_voc
- all tests passed
- evaluate object detection performance on voc
Lin, T. Y., Dollár, P., Girshick, R., He, K., Hariharan, B., & Belongie, S. (2016). Feature pyramid networks for object detection. arXiv preprint arXiv:1612.03144.