
Partial-Order-Pruning

Partial Order Pruning: for Best Speed/Accuracy Trade-off in Neural Architecture Search, CVPR 2019


Abstract

Achieving a good speed/accuracy trade-off on the target platform is very important when deploying deep neural networks. Most existing automatic architecture search approaches only pursue high performance and ignore this factor. In this work, we propose an algorithm, "Partial Order Pruning", which prunes the architecture search space under a partial order assumption, quickly lifts the boundary of the speed/accuracy trade-off on the target platform, and automatically searches for the architecture with the best speed/accuracy trade-off. Our algorithm explicitly takes profiled inference speed on the target platform into consideration. With the proposed algorithm, we present several "Dongfeng (东风)" networks that provide high accuracy and fast inference speed on various GPU platforms. By further searching the decoder architecture, our DF-Seg real-time segmentation models yield a state-of-the-art speed/accuracy trade-off on both embedded devices and high-end GPUs.
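
The sketch below illustrates the pruning idea only; it is not the paper's exact procedure, and all names (`Arch`, `should_prune`, the stage-wise depth/width encoding) are illustrative assumptions. Architectures are partially ordered by being element-wise no deeper and no wider; under the partial order assumption such a "smaller" architecture is no slower and no more accurate than a "larger" one, so a candidate can be discarded when its implied accuracy upper bound cannot lift the current speed/accuracy boundary at its profiled latency.

```python
# Illustrative sketch of partial order pruning (not the paper's exact procedure).
from dataclasses import dataclass

@dataclass
class Arch:
    depths: tuple            # blocks per stage
    widths: tuple            # channels per stage
    latency: float           # profiled inference latency on the target platform (ms)
    accuracy: float = 0.0    # known only after the architecture has been trained

def narrower_or_equal(a, b):
    """Partial order: a <= b iff a is element-wise no deeper and no wider than b.
    Partial order assumption: then acc(a) <= acc(b) and lat(a) <= lat(b)."""
    return (all(x <= y for x, y in zip(a.depths, b.depths)) and
            all(x <= y for x, y in zip(a.widths, b.widths)))

def boundary_accuracy_at(latency, trained):
    """Best accuracy reached so far by any trained model that is at least as fast."""
    return max((t.accuracy for t in trained if t.latency <= latency), default=0.0)

def should_prune(candidate, trained):
    """Prune a candidate whose accuracy upper bound cannot lift the boundary."""
    # Any trained architecture that is deeper/wider than the candidate upper-bounds
    # the candidate's accuracy under the partial order assumption.
    upper_bound = min((t.accuracy for t in trained
                       if narrower_or_equal(candidate, t)),
                      default=float("inf"))
    return upper_bound <= boundary_accuracy_at(candidate.latency, trained)

# Example: the candidate is slower than a trained model that is also deeper/wider,
# so its accuracy cannot exceed 0.74, which the boundary already reaches at 9.0 ms.
trained = [Arch((2, 2, 2), (64, 128, 256), latency=5.0, accuracy=0.70),
           Arch((3, 4, 6), (64, 128, 256), latency=9.0, accuracy=0.74)]
candidate = Arch((2, 3, 4), (64, 128, 256), latency=9.5)   # profiled, not yet trained
print(should_prune(candidate, trained))                    # True -> skip training it
```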


Experiments

1. We conduct backbone architecture search experiments on TX2:

| Model | ImageNet Val. Top-1 Acc. |
|---|---|
| DF1 (东风一) | 69.78% |
| DF2 (东风二) | 73.92% |
| DF2A (东风二甲) | 76.00% |

2. With our Dongfeng backbone networks, we conduct decoder architecture search experiments on 1080Ti and TX2:

| Model | Cityscapes mIoU (Val/Test), 1024x2048 | FPS (1080Ti / TensorRT-3.0.4), 1024x2048 | FPS (1080Ti / TensorRT-3.0.4), 1024x1024 | FPS (Titan X / Caffe), 1024x2048 |
|---|---|---|---|---|
| DFlite-Seg-d8 | 71.7 / - | 157.4 | 263.4 | 45.7 |
| DF1-Seg-d8 | 72.4 / 71.4 | 136.9 | 232.6 | 40.2 |
| DFlite-Seg | 73.4 / - | 118.4 | 202.5 | 33.8 |
| DF1-Seg | 74.1 / 73.0 | 106.1 | 182.1 | 30.7 |
| DF2-Seg1 | 75.9 / 74.8 | 67.2 | - | 20.5 |
| DF2-Seg2 | 76.9 / 75.3 | 56.3 | - | 17.7 |

3. Dongfeng models are designed for GPU platforms. We further conduct backbone and decoder architecture search experiments on the Snapdragon 845 CPU platform:

| Model | Cityscapes mIoU (Val/Test), 1024x2048 | FPS (Snapdragon 845 / NCNN), 640x384 |
|---|---|---|
| PL1A-Seg (segmentation network based on the 霹雳一甲 backbone) | 68.7 / 69.1 | 52.0 |

Many thanks to NCNN (https://github.com/Tencent/ncnn), a high-performance neural network inference framework optimized for mobile platforms.


Snapshots (model files):

You are welcome to use the "Dongfeng (东风)" model series. As the saying goes, 万事俱备，只欠东风: everything is ready, all that is missing is the east wind!

df1.caffemodel https://drive.google.com/open?id=1yA9DLSy3PEMQD3R92vKr6CEaOJ0NrOVm

df2.caffemodel https://drive.google.com/open?id=1K0QPFD6XtKnMsrrOLSarnk3C1zmhZqpC

df2a.caffemodel https://drive.google.com/open?id=1H5T-nz1D2DCLtma-alkR_CxGAHKewbql

df1seg.caffemodel https://drive.google.com/open?id=1v-UCb1VIHGtIR9eXiPgdUu9wkDpemNTW

df1seg_mergebn.caffemodel https://drive.google.com/open?id=17ZROC9dJAN8dxkpvTzHQRTS9soCwBOXJ

df2seg1.caffemodel https://drive.google.com/open?id=1mCdozRO4BxDV-NS6secKuvaTNWEFcv_u

df2seg1_mergebn.caffemodel https://drive.google.com/open?id=1RfdYtc7YzM5zYoANsRiB-XJMkvYfy_mv

df2seg2.caffemodel https://drive.google.com/open?id=1D7bgq7h9OUQVY4LYA-x0B2FY8pnVgz3o

df2seg2_mergebn.caffemodel https://drive.google.com/open?id=1FtqRSEN90ynTgMeGH3ee5DC8ubvSHZ3z

df-lite_seg_mergebn.caffemodel https://drive.google.com/open?id=1se9wAkZFyNGYInjrhtXTMIZy39ucMaDu
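
A released backbone can be loaded with pycaffe roughly as sketched below. The deploy prototxt file name, the "data"/"prob" blob names, and the 224x224 input size are assumptions, not taken from this listing; use the prototxt files that actually ship with the repository. The `_mergebn` snapshots presumably have their BatchNorm/Scale layers folded into the preceding convolutions for faster inference, so they pair with a correspondingly modified prototxt.

```python
# Minimal sketch: run the DF1 backbone on a dummy input with pycaffe.
# "df1_deploy.prototxt", the "data"/"prob" blob names, and the 224x224 input
# size are assumptions; adjust them to the files in this repository.
import numpy as np
import caffe

caffe.set_mode_gpu()                      # or caffe.set_mode_cpu()
net = caffe.Net('df1_deploy.prototxt',    # network definition (assumed file name)
                'df1.caffemodel',         # weights downloaded from the link above
                caffe.TEST)

net.blobs['data'].reshape(1, 3, 224, 224)                 # 1 image, 3 channels
net.blobs['data'].data[...] = np.random.rand(1, 3, 224, 224).astype(np.float32)

out = net.forward()                       # forward pass
print('predicted ImageNet class index:', out['prob'][0].argmax())
```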
