[Doc] Update multi_thread docs in tutorials (PaddlePaddle#886)
* Refactor PaddleSeg with preprocessor && postprocessor
* Fix bugs
* Delete redundant code
* Modify by comments
* Refactor according to comments
* Add batch evaluation
* Add single test script
* Add ppliteseg single test script && fix eval(raise) error
* Fix bug
* Fix evaluation segmentation.py batch predict
* Fix segmentation evaluation bug
* Fix evaluation segmentation bugs
* Update segmentation result docs
* Update old predict api and DisableNormalizeAndPermute
* Update resize segmentation label map with cv::INTER_NEAREST
* Add Model Clone function for PaddleClas && PaddleDet && PaddleSeg
* Add multi thread demo
* Add python model clone function
* Add multi thread python && C++ example
* Fix bug
* Update python && cpp multi_thread examples
* Add cpp && python directory
* Add README.md for examples
* Delete redundant code
* Create README_CN.md
* Rename README_CN.md to README.md
* Update README.md
* Update README.md

Co-authored-by: Jason <[email protected]>
1 parent 3164af6 · commit e4b1581 · 3 changed files with 136 additions and 109 deletions
[English](README.md) | 中文

# Multi-threaded and Multi-process Inference with FastDeploy Models

FastDeploy provides the following multi-threaded and multi-process examples for Python and C++ developers:

- [Python multi-threaded and multi-process inference examples](python)
- [C++ multi-threaded inference examples](cpp)
## Cloning Models for Multi-threaded Inference

Inference with a vision model involves three stages:
- The input image is preprocessed into the Tensors that are fed to the model Runtime: the preprocess stage.
- The model Runtime takes the input Tensors, runs inference, and produces the Runtime's output Tensors: the infer stage.
- The Runtime's output Tensors are postprocessed into the final structured results, such as DetectionResult or SegmentationResult: the postprocess stage.

For the preprocess, infer, and postprocess stages, FastDeploy abstracts three corresponding classes: Preprocessor, Runtime, and Postprocessor.

When calling a FastDeploy model from multiple threads for parallel inference, two questions need to be considered:
- Can the Preprocessor, Runtime, and Postprocessor classes each handle concurrent calls?
- Under multi-threaded concurrency, can memory and GPU memory usage be minimized?

FastDeploy performs multi-threaded inference by copying objects: each thread holds its own instances of the Preprocessor, Runtime, and Postprocessor. To reduce memory usage, the Runtime copy shares the model weights, so although multiple objects are created, only a single copy of the model weights and parameters exists in memory or GPU memory. This keeps the memory overhead of the extra objects low.
FastDeploy provides the following interfaces to clone a model (using PaddleClas as an example):

- Python: `PaddleClasModel.clone()`
- C++: `PaddleClasModel::Clone()`
### Python
```python
import cv2
import fastdeploy as fd

option = fd.RuntimeOption()
model = fd.vision.classification.PaddleClasModel(model_file,
                                                 params_file,
                                                 config_file,
                                                 runtime_option=option)
# model2 shares the model weights with model
model2 = model.clone()
im = cv2.imread(image)
res = model.predict(im)
```
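
A minimal sketch of how `clone()` can be combined with Python threads, assuming `model_file`, `params_file`, `config_file`, and `image` are defined as above:

```python
import threading

import cv2
import fastdeploy as fd

def worker(m, image_path):
    # each thread predicts with its own cloned model instance
    print(m.predict(cv2.imread(image_path)))

option = fd.RuntimeOption()
model = fd.vision.classification.PaddleClasModel(
    model_file, params_file, config_file, runtime_option=option)

# one clone per thread; all clones share a single copy of the model weights
threads = [threading.Thread(target=worker, args=(model.clone(), image))
           for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```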

### C++
```c++
auto model = fastdeploy::vision::classification::PaddleClasModel(model_file,
                                                                 params_file,
                                                                 config_file,
                                                                 option);
auto model2 = model.Clone();
auto im = cv::imread(image_file);
fastdeploy::vision::ClassifyResult res;
model.Predict(im, &res);
```

>> **Note**: Similar APIs for other models can be found in the [official C++ API docs](https://www.paddlepaddle.org.cn/fastdeploy-api-doc/cpp/html/index.html) and the [official Python API docs](https://www.paddlepaddle.org.cn/fastdeploy-api-doc/python/html/index.html).

## Python Multi-threading and Multi-processing

Because of a language-level constraint, the GIL, Python multi-threading cannot fully utilize the hardware in compute-bound scenarios. Python examples are therefore provided in both multi-process and multi-thread form; they compare as follows:

### Multi-process vs. Multi-thread Inference with FastDeploy Models

| | Resource usage | Compute-bound | I/O-bound | Inter-process/thread communication |
|:-------|:------|:----------|:----------|:----------|
| Multi-process | High | Fast | Fast | Slow |
| Multi-thread | Low | Slow | Fairly fast | Fast |

>> **Note**: The analysis above is fairly theoretical. In practice Python has optimized certain workloads (numpy-style computation, for example, can already run in parallel), gathering results from multiple processes involves inter-process communication, and it is often hard to tell whether a task is compute-bound or I/O-bound, so the final choice should always be based on benchmarking the actual task.
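
As a hedged sketch of the multi-process route (not taken from this repository's examples), each worker process builds its own model in an initializer, since model objects are generally not shareable across processes; the file paths assume the layout of the `ResNet50_vd_infer.tgz` archive used below:

```python
import multiprocessing as mp

import cv2
import fastdeploy as fd

model = None  # one model instance per worker process

def init_worker(model_file, params_file, config_file):
    # each process loads its own copy; weights are not shared across processes
    global model
    model = fd.vision.classification.PaddleClasModel(
        model_file, params_file, config_file,
        runtime_option=fd.RuntimeOption())

def infer(image_path):
    return str(model.predict(cv2.imread(image_path)))

if __name__ == "__main__":
    images = ["ILSVRC2012_val_00000010.jpeg"] * 4
    with mp.Pool(processes=2,
                 initializer=init_worker,
                 initargs=("ResNet50_vd_infer/inference.pdmodel",
                           "ResNet50_vd_infer/inference.pdiparams",
                           "ResNet50_vd_infer/inference_cls.yaml")) as pool:
        print(pool.map(infer, images))
```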

## C++ Multi-threading

Multi-threading in C++ combines low resource usage with high speed, which makes it the best choice for multi-threaded inference.

### C++ Memory Usage: Multi-threading with Clone vs. without Clone

- Hardware: Intel(R) Xeon(R) Gold 6271C CPU @ 2.60GHz
- Model: ResNet50_vd_infer
- Backend: OPENVINO inference engine on CPU

Memory usage when multiple models are initialized within a single process:

| Models | After model.Clone() | After Clone + model->Predict() | After init without Clone | After model->Predict() without Clone |
|:--- |:----- |:----- |:----- |:----- |
| 1 | 322 MB | 325 MB | 322 MB | 325 MB |
| 2 | 322 MB | 325 MB | 559 MB | 560 MB |
| 3 | 322 MB | 325 MB | 771 MB | 771 MB |

Memory usage for multi-threaded prediction:

| Threads | After model.Clone() | After Clone + model->Predict() | After init without Clone | After model->Predict() without Clone |
|:--- |:----- |:----- |:----- |:----- |
| 1 | 322 MB | 337 MB | 322 MB | 337 MB |
| 2 | 322 MB | 343 MB | 548 MB | 566 MB |
| 3 | 322 MB | 347 MB | 752 MB | 784 MB |
---

# PaddleClas Model Python Multi-thread/Multi-process Deployment Example

Before deployment, confirm the following two steps:

- 1. The hardware and software environment meets the requirements; see [FastDeploy Environment Requirements](../../../docs/cn/build_and_install/download_prebuilt_libraries.md)
- 2. The FastDeploy Python whl package is installed; see [FastDeploy Python Installation](../../../docs/cn/build_and_install/download_prebuilt_libraries.md)

This directory provides `multi_thread_process.py`, which quickly runs multi-thread/multi-process deployment of ResNet50_vd on CPU/GPU, including TensorRT-accelerated deployment on GPU. Run the following script to try it:
```bash
# Download the example code
git clone https://github.com/PaddlePaddle/FastDeploy.git
cd FastDeploy/tutorials/multi_thread/python

# Download the ResNet50_vd model files and a test image
wget https://bj.bcebos.com/paddlehub/fastdeploy/ResNet50_vd_infer.tgz
tar -xvf ResNet50_vd_infer.tgz
wget https://gitee.com/paddlepaddle/PaddleClas/raw/release/2.4/deploy/images/ImageNet/ILSVRC2012_val_00000010.jpeg

# Multi-threaded inference on CPU
python infer.py --model ResNet50_vd_infer --image_path ILSVRC2012_val_00000010.jpeg --device cpu --topk 1 --thread_num 1
# Multi-process inference on CPU
python infer.py --model ResNet50_vd_infer --image_path ILSVRC2012_val_00000010.jpeg --device cpu --topk 1 --use_multi_process True --process_num 1

# Multi-threaded inference on GPU
python infer.py --model ResNet50_vd_infer --image_path ILSVRC2012_val_00000010.jpeg --device gpu --topk 1 --thread_num 1
# Multi-process inference on GPU
python infer.py --model ResNet50_vd_infer --image_path ILSVRC2012_val_00000010.jpeg --device gpu --topk 1 --use_multi_process True --process_num 1

# Multi-threaded inference with TensorRT on GPU (note: the first TensorRT run serializes the model, which takes a while; please be patient)
python infer.py --model ResNet50_vd_infer --image_path ILSVRC2012_val_00000010.jpeg --device gpu --use_trt True --topk 1 --thread_num 1
# Multi-process inference with TensorRT on GPU (note: the first TensorRT run serializes the model, which takes a while; please be patient)
python infer.py --model ResNet50_vd_infer --image_path ILSVRC2012_val_00000010.jpeg --device gpu --use_trt True --topk 1 --use_multi_process True --process_num 1

# Multi-threaded inference on IPU (note: the first IPU run serializes the model, which takes a while; please be patient)
python infer.py --model ResNet50_vd_infer --image_path ILSVRC2012_val_00000010.jpeg --device ipu --topk 1 --thread_num 1
# Multi-process inference on IPU (note: the first IPU run serializes the model, which takes a while; please be patient)
python infer.py --model ResNet50_vd_infer --image_path ILSVRC2012_val_00000010.jpeg --device ipu --topk 1 --use_multi_process True --process_num 1
```
>> **Note**: `--image_path` can also be the path to a directory of images.

After the run completes, the result looks like this:
```bash
ClassifyResult(
label_ids: 153,
scores: 0.686229,
)
```

## PaddleClasModel Python Interface

```python
fd.vision.classification.PaddleClasModel(model_file, params_file, config_file, runtime_option=None, model_format=ModelFormat.PADDLE)
```

Loads and initializes a PaddleClas model, where model_file and params_file are the Paddle inference files exported from the trained model; see [Model Export](https://github.com/PaddlePaddle/PaddleClas/blob/release/2.4/docs/zh_CN/inference_deployment/export_model.md#2-%E5%88%86%E7%B1%BB%E6%A8%A1%E5%9E%8B%E5%AF%BC%E5%87%BA) for details.

**Parameters**

> * **model_file**(str): Path to the model file
> * **params_file**(str): Path to the parameters file
> * **config_file**(str): Path to the inference deployment configuration file
> * **runtime_option**(RuntimeOption): Backend inference configuration; None (the default) applies the default configuration
> * **model_format**(ModelFormat): Model format; Paddle format by default
### predict function

> ```python
> PaddleClasModel.predict(input_image, topk=1)
> ```
>
> Model prediction interface: takes an input image and directly returns the top-k classification results.
>
> **Parameters**
>
> > * **input_image**(np.ndarray): Input data in HWC, BGR format
> > * **topk**(int): Return the topk classification results with the highest predicted probabilities; 1 by default
>
> **Returns**
>
> > A `fastdeploy.vision.ClassifyResult` struct; see [Vision Model Prediction Results](../../../../../docs/api/vision_results/) for a description
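
As a small hedged sketch (not part of the original example), the returned result can be inspected field by field; the file paths assume the layout of the downloaded `ResNet50_vd_infer.tgz`, and the field names follow the `ClassifyResult` printout shown above:

```python
import cv2
import fastdeploy as fd

model = fd.vision.classification.PaddleClasModel(
    "ResNet50_vd_infer/inference.pdmodel",
    "ResNet50_vd_infer/inference.pdiparams",
    "ResNet50_vd_infer/inference_cls.yaml")

res = model.predict(cv2.imread("ILSVRC2012_val_00000010.jpeg"), topk=5)
# label_ids and scores are parallel lists, matching the ClassifyResult printout above
for label, score in zip(res.label_ids, res.scores):
    print(f"label={label} score={score:.6f}")
```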

## Other Documents

- [PaddleClas Model Introduction](..)
- [PaddleClas C++ Deployment](../cpp)
- [Model Prediction Results](../../../../../docs/api/vision_results/)
- [How to Switch the Model Inference Backend](../../../../../docs/cn/faq/how_to_change_backend.md)