This tool aims to accelerate the test-time computation and decrease number of parameters of deep CNNs.
Use accnn.py
to get a new model by specifying an original model and the speeding-up ratio.
You may provide a json to explicitly control the architecture of the new model, otherwise the rank-selection algorithm would be used to do it automatically and the configuration would be saved to file config.json
.
acc_conv.py
and acc_fc.py
would be involved automatically when using accnn.py
while acc_conv.py
and acc_fc.py
can also be used seperately.
###Speedup whole network
-
Speed up a model by 2 times and use
rank-selection
to determine ranks of each layer automaticallypython accnn.py -m MODEL-PREFIX --save-model new-vgg16 --ratio 2
-
Use your own configuration file without
rank-selection
python accnn.py -m MODEL-PREFIX --save-model new-model --config YOUR-CONFIG_JSON
###Speedup a single layer
-
Decompose a convolutional layer:
python acc_conv.py -m MODEL-PREFIX --layer LAYER-NAME --K NUM-FILTER --save-model new-model
-
Decompose a fullyconnected layer:
python acc_fc.py -m MODEL-PREFIX --layer LAYER-NAME --K NUM-HIDDEN --save-model new-model
-
uses
--help
to see more options
The experiments are carried on a single machine with four Nvidia Titan X GPUs. The top-5 accuracy is evaluated on ImageNet validation dataset.
Model | Top-5 accuracy | Theoretical speed up | CPU speed up | GPU speed up |
---|---|---|---|---|
model0 | 89.6% | 1x | 1x | 1x |
model1 | 88.6% | 2.4x | 2.2x | 1.1x |
model2 | 89.8% | 2.4x | 2.2x | 1.1x |
model3 | 87.5% | 3x | 2.6x | 1.2x |
model4 | 89.6% | 3x | 2.6x | 1.2x |
model0
is the original VGG16 model directly converted from Caffe Model Zoomodel1
is the accelerated model based onconfig.json
model2
is the same asmodel1
but is fine-tuned on ImageNet training dataset for 5 epochsmodel3
is the accelerated model based on rank-selection with 3 times speeding upmodel4
is the same asmodel3
but is fine-tuned on ImageNet training dataset for 5 epochs- The experiments in GPU are carried with cuDNN 4
-
This tool is verified on the VGG-16 model converted from Caffe by
caffe_converter
tool. -
accnn.py
tool only supports single input and output -
This tool mainly implements the algorithm of Cheng et al. [2] to decompose a convolutional layer to two convolutional layers both in spatial dimensions and across channels.
acc_conv.py
provides the function to replace a(N,d,d)
conv. layer by two(K,d,1)
and(N,1,d)
conv. layers. -
The idea of
rank-selection
tool is based on the related work of Zhang et al [1] that we could use the product of PCA energy to determine the rank for each layer.
[1] Zhang, Xiangyu, et al. "Efficient and accurate approximations of nonlinear convolutional networks." arXiv preprint arXiv:1411.4229 (2014).
[2] Tai, Cheng, et al. "Convolutional neural networks with low-rank regularization." arXiv preprint arXiv:1511.06067 (2015).