Add latency predictor function and doc (PaddlePaddle#905)
ZichaoGuo authored Oct 19, 2021
1 parent 2550fb4 commit c7ec858
Showing 7 changed files with 1,099 additions and 2 deletions.
63 changes: 63 additions & 0 deletions demo/analysis/latency_predictor.py
@@ -0,0 +1,63 @@
import os
import subprocess
import argparse

import paddle
from paddleslim.analysis import TableLatencyPredictor

from paddle.vision.models import mobilenet_v1, mobilenet_v2

opt_tool = 'opt_ubuntu'  # use on Linux
# opt_tool = 'opt_M1_mac'  # use on a Mac with an M1 chip
# opt_tool = 'opt_intel_mac'  # use on a Mac with an Intel chip

parser = argparse.ArgumentParser(description='latency predictor')
parser.add_argument('--model', type=str, help='which model to test.')
parser.add_argument('--data_type', type=str, default='fp32')

args = parser.parse_args()

if not os.path.exists(opt_tool):
    subprocess.call(
        f'wget https://paddle-slim-models.bj.bcebos.com/LatencyPredictor/{opt_tool}',
        shell=True)
    subprocess.call(f'chmod +x {opt_tool}', shell=True)


def get_latency(model, data_type):
    paddle.disable_static()  # the predictor expects a dygraph model
    predictor = TableLatencyPredictor(
        f'./{opt_tool}', hardware='845', threads=4, power_mode=3, batchsize=1)
    latency = predictor.predict_latency(
        model,
        input_shape=[1, 3, 224, 224],
        save_dir='./tmp_model',
        data_type=data_type,
        task_type='cls')
    print('{} latency: {} ms'.format(data_type, latency))

    subprocess.call('rm -rf ./tmp_model', shell=True)  # clean up intermediate files
    paddle.disable_static()  # ensure dygraph mode is active again
    return latency


if __name__ == '__main__':
    if args.model == 'mobilenet_v1':
        model = mobilenet_v1()
    elif args.model == 'mobilenet_v2':
        model = mobilenet_v2()
    else:
        assert False, 'model should be mobilenet_v1 or mobilenet_v2'

    latency = get_latency(model, args.data_type)

    # Regression check: expected values from the hardware='845' latency table.
    if args.model == 'mobilenet_v1' and args.data_type == 'fp32':
        assert latency == 41.92806607483133
    elif args.model == 'mobilenet_v1' and args.data_type == 'int8':
        assert latency == 36.64814722993898
    elif args.model == 'mobilenet_v2' and args.data_type == 'fp32':
        assert latency == 27.847896889217566
    elif args.model == 'mobilenet_v2' and args.data_type == 'int8':
        assert latency == 23.967800360138803
    else:
        assert False, 'model or data_type is wrong.'
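
Usage note: assuming a Linux environment (matching the default opt_ubuntu setting), the demo can be run as `python latency_predictor.py --model mobilenet_v1 --data_type fp32`, with mobilenet_v2 and int8 covering the other combinations.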
8 changes: 8 additions & 0 deletions docs/zh_cn/tutorials/analysis/dygraph/index.rst
@@ -0,0 +1,8 @@

Dynamic Graph
==============

.. toctree::
   :maxdepth: 1

   latency_predictor.md
35 changes: 35 additions & 0 deletions docs/zh_cn/tutorials/analysis/dygraph/latency_predictor.md
@@ -0,0 +1,35 @@
# LatencyPredictor Usage Tutorial

The main job of LatencyPredictor is to estimate the actual inference latency of a neural network on a specific hardware device from a supplied op-latency mapping table. It is built on Paddle-Lite and applies to models deployed with Paddle-Lite. The table is stored as key-value pairs: each key encodes a fused op in the model after Paddle-Lite graph optimization, and each value is the measured latency of that op on the target hardware.
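
To make the key-value structure concrete, here is a purely illustrative sketch; the actual keys are generated internally (cf. get_key_from_op, exported in this commit), and the strings below are hypothetical, not the real key format:
```
# Hypothetical latency table: key = fused-op signature, value = measured latency (ms).
latency_table = {
    'conv2d_in(1,3,224,224)_filter(32,3,3,3)_stride(2,2)': 0.85,
    'depthwise_conv2d_in(1,32,112,112)_filter(3,3)_stride(1,1)': 0.42,
}

def table_latency(op_keys):
    # Summing per-op entries conveys the idea; the library's own lookup logic may differ.
    return sum(latency_table[k] for k in op_keys)
```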

## Usage

1. Download or build the opt optimization tool
2. Construct a LatencyPredictor
3. Define a model and predict its latency

### 1. Download or build the opt optimization tool
1.1 Download a prebuilt opt tool that matches your runtime environment. Builds are currently provided for the Mac platform ([M1 chip](https://paddle-slim-models.bj.bcebos.com/LatencyPredictor/opt_M1_mac) / [Intel chip](https://paddle-slim-models.bj.bcebos.com/LatencyPredictor/opt_intel_mac)) and the [Ubuntu](https://paddle-slim-models.bj.bcebos.com/LatencyPredictor/opt_ubuntu) platform; a download sketch follows these steps.
1.2 You can also build the opt tool yourself from the Paddle-Lite source; see the Paddle-Lite [documentation](https://paddle-lite.readthedocs.io/zh/latest/user_guides/model_optimize_tool.html) for details. When building, Paddle-Lite's memory-reuse feature must be disabled, i.e. comment out [these lines](https://github.com/PaddlePaddle/Paddle-Lite/blob/d76f45be989d3e01cebf2ac18e047cfd37d52666/lite/core/optimizer/optimizer.cc#L266-L268).
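
For reference, the demo script added in this commit fetches the Ubuntu build automatically; a minimal sketch of the same logic:
```
import os
import subprocess

opt_tool = 'opt_ubuntu'  # the Linux build linked above; pick the binary for your platform
if not os.path.exists(opt_tool):
    subprocess.call(
        f'wget https://paddle-slim-models.bj.bcebos.com/LatencyPredictor/{opt_tool}',
        shell=True)
    subprocess.call(f'chmod +x {opt_tool}', shell=True)  # make the downloaded tool executable
```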

### 2. Construct a LatencyPredictor

Provide the opt tool path along with the chip and test settings; LatencyPredictor automatically downloads the latency table matching these parameters. In the example below, the chip is the 845, the number of test threads (threads) is 4, the power mode (power_mode) is 3, and the test batch size (batchsize) is 1.
```
import paddleslim
opt_path = {path to the opt tool}
predictor = paddleslim.TableLatencyPredictor(opt_path, hardware='845', threads=4, power_mode=3, batchsize=1)
```
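
For example, with the Ubuntu opt binary from step 1 in the working directory (the same configuration the demo script in this commit uses):
```
import paddleslim

# hardware='845' selects the 845 latency table, which is downloaded automatically
predictor = paddleslim.TableLatencyPredictor(
    './opt_ubuntu', hardware='845', threads=4, power_mode=3, batchsize=1)
```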

### 3. Define a model and predict

After defining the model, call the predict_latency function to predict its inference latency directly. Here, input_shape is the input size, save_dir is the save path for the intermediate pbmodel, data_type can be fp32 or int8, and task_type='cls' indicates a classification model.
```
import paddle
from paddle.vision.models import mobilenet_v1
model = mobilenet_v1()
latency = predictor.predict_latency(model, input_shape=[1,3,224,224], save_dir='./model', data_type='int8', task_type='cls')
print('predicted latency = {}ms'.format(latency))
```
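
A sketch covering both data types, mirroring the demo script in this commit (it assumes the predictor constructed in step 2):
```
import paddle
from paddle.vision.models import mobilenet_v2

paddle.disable_static()  # predict_latency expects a dygraph model
model = mobilenet_v2()
for data_type in ('fp32', 'int8'):
    latency = predictor.predict_latency(
        model, input_shape=[1, 3, 224, 224], save_dir='./model',
        data_type=data_type, task_type='cls')
    print('{} latency = {} ms'.format(data_type, latency))
```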
15 changes: 13 additions & 2 deletions paddleslim/analysis/__init__.py
@@ -14,8 +14,19 @@
from .flops import flops, dygraph_flops
from .model_size import model_size
from .latency import LatencyEvaluator, TableLatencyEvaluator
from .latency_predictor import LatencyPredictor, TableLatencyPredictor
from ._utils import get_key_from_op, save_cls_model, save_det_model, save_seg_model

__all__ = [
'flops', 'dygraph_flops', 'model_size', 'LatencyEvaluator',
'TableLatencyEvaluator'
    'flops',
    'dygraph_flops',
    'model_size',
    'LatencyEvaluator',
    'TableLatencyEvaluator',
    'LatencyPredictor',
    'TableLatencyPredictor',
    'get_key_from_op',
    'save_cls_model',
    'save_det_model',
    'save_seg_model',
]
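
With these exports in place, downstream code can import the predictor either from the top level (paddleslim.TableLatencyPredictor, as in the tutorial) or from the analysis subpackage, as the demo does: `from paddleslim.analysis import TableLatencyPredictor`.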