- This is a forked version of Faster RCNN. Please refer to the original README.md for more information.
- The code is an attempt to reproduce andrewliao11's work on the ImageNet dataset.
- The code first ran on an AWS g2.2xlarge instance (~2.8 s/iter) and was later configured to run locally on a desktop with a GTX 1070 (CUDA 8.0, cuDNNv5, ~0.45 s/iter).

It took some time to get it working on my desktop (GTX 1070, CUDA 8.0, cuDNNv5, gcc 5.4). Some key steps:
- Install the newest NVIDIA driver: I found this post rather useful.
- Install CUDA 8.0: run the installer with --override to suppress the gcc compiler-version error (see the official documentation):
./cuda_8.0.27_linux.run --override
- Install cuDNNv5: this post might help.
- Suppress the compiler error (again): simply a hack. Open the header
vi /usr/local/cuda/include/host_config.h
and comment out the following line (L115):
//#error -- unsupported GNU version! gcc versions later than 5.3 are not supported!
- Modify Caffe: #237
- Other issues: some of the required libraries, such as protobuf, are compiled with an older version of gcc when installed via apt-get install. This can cause errors later. To work around it, download the corresponding repo from GitHub and compile it from source.
I'm using the ILSVRC 2013 validation set (~2.7 GB; you can download the image data, annotations and devkit here).
My copy of the dataset is organized as follows (I saved it under ~/):
ILSVRC13
└─── LSVRC2013_DET_val
│ *.JPEG (e.g. ILSVRC2012_val_00000001.JPEG)
└─── data
│ meta_det.mat
└─── det_lists
│ *.txt (e.g. val.txt)
└─── Annotations
│ *.xml (e.g. ILSVRC2012_val_00000001.xml)
It is convenient to create a symbolic link so the code can refer to the data (by convention, assume $FRCN_ROOT is the root directory of the repo, e.g. blablabla/py-faster-rcnn):
cd $FRCN_ROOT
ln -s ~/ILSVRC13 ./data/ILSVRCdevkit2013
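Path mix-ups in this layout are easy to make, so a quick sanity check can save a failed run later. A minimal sketch, assuming the directory tree shown above (the `check_ilsvrc_layout` helper is hypothetical, not part of the repo):

```python
from pathlib import Path

# Expected sub-paths, taken from the directory tree above.
EXPECTED = [
    "LSVRC2013_DET_val",
    "data/meta_det.mat",
    "data/det_lists/val.txt",
    "Annotations",
]

def check_ilsvrc_layout(root):
    """Return the expected sub-paths that are missing under `root`."""
    root = Path(root)
    return [p for p in EXPECTED if not (root / p).exists()]

if __name__ == "__main__":
    missing = check_ilsvrc_layout(Path.home() / "ILSVRC13")
    print("missing:", missing or "none")
```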
Next, I wrote a small Python script (sklearn needed) to shuffle and split val.txt into val_train.txt and val_test.txt (test size 0.25), which reside in the same directory as val.txt.
cd $FRCN_ROOT
python ./tools/shuffle_split.py --des ./data/ILSVRC13/data/det_lists/val.txt
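The script itself is not shown in this write-up. The following is a minimal sketch of what it might look like: the author used sklearn's train_test_split, while this version uses only the stdlib `random` module for the same shuffle-and-split; the `--des` flag and output file names follow the command above.

```python
import argparse
import random
from pathlib import Path

def shuffle_split(lines, test_size=0.25, seed=0):
    """Shuffle `lines` and split them into (train, test) sublists."""
    rng = random.Random(seed)
    shuffled = lines[:]
    rng.shuffle(shuffled)
    n_test = int(len(shuffled) * test_size)
    return shuffled[n_test:], shuffled[:n_test]

if __name__ == "__main__":
    parser = argparse.ArgumentParser()
    parser.add_argument("--des", help="path to the val.txt to split")
    args, _ = parser.parse_known_args()
    if args.des:
        src = Path(args.des)
        lines = src.read_text().splitlines()
        train, test = shuffle_split(lines)
        # Write the two splits next to the original val.txt.
        (src.parent / "val_train.txt").write_text("\n".join(train) + "\n")
        (src.parent / "val_test.txt").write_text("\n".join(test) + "\n")
```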
Most of the changes are related to data import (creating the roidb for the RPN). Some deal with the prototxt files. Please refer to the corresponding file(s) for details.
- Add a new path for ILSVRC in faster_rcnn_end2end.sh
- Edit ./lib/datasets/factory.py to pass the correct arguments into the ilsvrc object
- Create a new class ilsvrc.py resembling pascal_voc.py:
  - __init__(): read the classes from meta_det.mat and index them (see this); change the image suffix to .JPEG; comment out 'use_diff' (no such thing in ImageNet annotations)
  - _get_default_path(): change this to your devkit path (the symbolic link)
  - _load_image_set_index(): change the path to your val_train.txt/val_test.txt
  - _load_ilsvrc_annotation() (renamed from _load_pascal_annotation()): point it to the Annotations folder; comment out the 'use_diff' part; change the pixel index (see this)
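On the pixel-index change: pascal_voc.py subtracts 1 from the box coordinates because PASCAL VOC annotations are 1-based, whereas ILSVRC coordinates are already 0-based, so the subtraction is dropped. A minimal sketch of the idea, assuming the usual `<object>/<bndbox>` annotation layout (`parse_bndboxes` is a hypothetical helper, not code from the repo):

```python
import xml.etree.ElementTree as ET

def parse_bndboxes(xml_text, one_based=False):
    """Parse bounding boxes from annotation XML.

    Set one_based=True for PASCAL VOC, whose coordinates start at 1
    (so 1 is subtracted); leave False for ILSVRC, which is 0-based.
    """
    offset = 1 if one_based else 0
    boxes = []
    for obj in ET.fromstring(xml_text).findall("object"):
        bb = obj.find("bndbox")
        boxes.append(tuple(
            int(bb.find(k).text) - offset
            for k in ("xmin", "ymin", "xmax", "ymax")
        ))
    return boxes
```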
Here I will use the original implementation of Faster RCNN to illustrate the changes.
- solver.prototxt: change this to the correct train.prototxt path
- train.prototxt and test.prototxt: refer to the files themselves for the changes
To run the end-to-end training:
cd $FRCN_ROOT
./experiments/scripts/faster_rcnn_end2end.sh 0 VGG16 ilsvrc --set RNG_SEED 42
After ~28,000 iterations on AWS, the training process stopped somehow (maybe I did something). The snapshot method implemented in Python has no way to restore the training state, so I had to start over. To prevent this from happening again, I decided to use Caffe's native snapshot mechanism.
The fix follows the solution to Issue #35; the files modified are:
$FRCN_ROOT/tools/train_net.py
$FRCN_ROOT/lib/fast_rcnn/train.py
$FRCN_ROOT/models/ilsvrc/VGG16/faster_rcnn_end2end/solver.prototxt
$FRCN_ROOT/experiments/scripts/faster_rcnn_end2end.sh
- Modification #7: comment out the --weights flag and uncomment this line if you want to restore a previous training state
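As a rough sketch of the native-snapshot route: the stock py-faster-rcnn solver sets snapshot: 0 and lets Python handle snapshotting, so re-enabling Caffe's own mechanism means setting the snapshot fields in solver.prototxt. The values below are illustrative assumptions, not the repo's actual settings; Caffe then writes a .caffemodel plus a .solverstate at each interval, and the .solverstate is what allows training to resume.

```
# solver.prototxt (illustrative fragment)
snapshot: 10000
snapshot_prefix: "vgg16_faster_rcnn_ilsvrc"
```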
Warning: don't use the .caffemodel saved directly by Caffe to test the results. Use the one written by snapshot() (the one stored in $FRCN_ROOT/output/faster_rcnn_end2end/ilsvrc_2013_det_val_train): the Python snapshot() un-normalizes the bounding-box regression weights before saving, which Caffe's native snapshot does not do.
To be continued...