GitHub - funykatebird/PaddleOCR: OCR toolkit based on PaddlePaddle （基于飞桨的OCR工具库，包含总模型仅8.6M的超轻量级中文OCR，同时支持多种文本检测、文本识别的训练算法。）

English | 简体中文

简介

PaddleOCR旨在打造一套丰富、领先、且实用的OCR工具库，助力使用者训练出更好的模型，并应用落地。

近期更新

2020.6.8 添加数据集，并保持持续更新
2020.6.5 支持 attetnion 模型导出 inference_model
2020.6.5 支持单独预测识别时，输出结果得分
2020.5.30 提供超轻量级中文OCR在线体验
2020.5.30 模型预测、训练支持Windows系统
more

特性

超轻量级中文OCR，总模型仅8.6M
- 单模型支持中英文数字组合识别、竖排文本识别、长文本识别
- 检测模型DB（4.1M）+识别模型CRNN（4.5M）
多种文本检测训练算法，EAST、DB
多种文本识别训练算法，Rosetta、CRNN、STAR-Net、RARE

支持的中文模型列表:

模型名称	模型简介	检测模型地址	识别模型地址
chinese_db_crnn_mobile	超轻量级中文OCR模型	inference模型 & 预训练模型	inference模型 & 预训练模型
chinese_db_crnn_server	通用中文OCR模型	inference模型 & 预训练模型	inference模型 & 预训练模型

超轻量级中文OCR在线体验地址：https://www.paddlepaddle.org.cn/hub/scene/ocr

也可以按如下教程快速体验超轻量级中文OCR和通用中文OCR模型。

超轻量级中文OCR以及通用中文OCR体验

上图是超轻量级中文OCR模型效果展示，更多效果图请见文末超轻量级中文OCR效果展示和通用中文OCR效果展示。

1.环境配置

请先参考快速安装配置PaddleOCR运行环境。

2.inference模型下载

windows 环境下如果没有安装wget,下载模型时可将链接复制到浏览器中下载，并解压放置在相应目录下

(1)超轻量级中文OCR模型下载

mkdir inference && cd inference
# 下载超轻量级中文OCR模型的检测模型并解压
wget https://paddleocr.bj.bcebos.com/ch_models/ch_det_mv3_db_infer.tar && tar xf ch_det_mv3_db_infer.tar
# 下载超轻量级中文OCR模型的识别模型并解压
wget https://paddleocr.bj.bcebos.com/ch_models/ch_rec_mv3_crnn_infer.tar && tar xf ch_rec_mv3_crnn_infer.tar
cd ..

(2)通用中文OCR模型下载

mkdir inference && cd inference
# 下载通用中文OCR模型的检测模型并解压
wget https://paddleocr.bj.bcebos.com/ch_models/ch_det_r50_vd_db_infer.tar && tar xf ch_det_r50_vd_db_infer.tar
# 下载通用中文OCR模型的识别模型并解压
wget https://paddleocr.bj.bcebos.com/ch_models/ch_rec_r34_vd_crnn_infer.tar && tar xf ch_rec_r34_vd_crnn_infer.tar
cd ..

3.单张图像或者图像集合预测

以下代码实现了文本检测、识别串联推理，在执行预测时，需要通过参数image_dir指定单张图像或者图像集合的路径、参数det_model_dir指定检测inference模型的路径和参数rec_model_dir指定识别inference模型的路径。可视化识别结果默认保存到 ./inference_results 文件夹里面。

# 预测image_dir指定的单张图像
python3 tools/infer/predict_system.py --image_dir="./doc/imgs/11.jpg" --det_model_dir="./inference/ch_det_mv3_db/"  --rec_model_dir="./inference/ch_rec_mv3_crnn/"

# 预测image_dir指定的图像集合
python3 tools/infer/predict_system.py --image_dir="./doc/imgs/" --det_model_dir="./inference/ch_det_mv3_db/"  --rec_model_dir="./inference/ch_rec_mv3_crnn/"

# 如果想使用CPU进行预测，需设置use_gpu参数为False
python3 tools/infer/predict_system.py --image_dir="./doc/imgs/11.jpg" --det_model_dir="./inference/ch_det_mv3_db/"  --rec_model_dir="./inference/ch_rec_mv3_crnn/" --use_gpu=False

通用中文OCR模型的体验可以按照上述步骤下载相应的模型，并且更新相关的参数，示例如下：

# 预测image_dir指定的单张图像
python3 tools/infer/predict_system.py --image_dir="./doc/imgs/11.jpg" --det_model_dir="./inference/ch_det_r50_vd_db/"  --rec_model_dir="./inference/ch_rec_r34_vd_crnn/"

更多的文本检测、识别串联推理使用方式请参考文档教程中基于预测引擎推理。

文档教程

文本检测算法

PaddleOCR开源的文本检测算法列表：

EAST(paper)
DB(paper)
SAST(paper)(百度自研, comming soon)

在ICDAR2015文本检测公开数据集上，算法效果如下：

模型	骨干网络	precision	recall	Hmean	下载链接
EAST	ResNet50_vd	88.18%	85.51%	86.82%	下载链接
EAST	MobileNetV3	81.67%	79.83%	80.74%	下载链接
DB	ResNet50_vd	83.79%	80.65%	82.19%	下载链接
DB	MobileNetV3	75.92%	73.18%	74.53%	下载链接

使用LSVT街景数据集共3w张数据，训练中文检测模型的相关配置和预训练文件如下：

模型	骨干网络	配置文件	预训练模型
超轻量中文模型	MobileNetV3	det_mv3_db.yml	下载链接
通用中文OCR模型	ResNet50_vd	det_r50_vd_db.yml	下载链接

注：上述DB模型的训练和评估，需设置后处理参数box_thresh=0.6，unclip_ratio=1.5，使用不同数据集、不同模型训练，可调整这两个参数进行优化

PaddleOCR文本检测算法的训练和使用请参考文档教程中文本检测模型训练/评估/预测。

文本识别算法

PaddleOCR开源的文本识别算法列表：

参考DTRB文字识别训练和评估流程，使用MJSynth和SynthText两个文字识别数据集训练，在IIIT, SVT, IC03, IC13, IC15, SVTP, CUTE数据集上进行评估，算法效果如下：

模型	骨干网络	Avg Accuracy	模型存储命名	下载链接
Rosetta	Resnet34_vd	80.24%	rec_r34_vd_none_none_ctc	下载链接
Rosetta	MobileNetV3	78.16%	rec_mv3_none_none_ctc	下载链接
CRNN	Resnet34_vd	82.20%	rec_r34_vd_none_bilstm_ctc	下载链接
CRNN	MobileNetV3	79.37%	rec_mv3_none_bilstm_ctc	下载链接
STAR-Net	Resnet34_vd	83.93%	rec_r34_vd_tps_bilstm_ctc	下载链接
STAR-Net	MobileNetV3	81.56%	rec_mv3_tps_bilstm_ctc	下载链接
RARE	Resnet34_vd	84.90%	rec_r34_vd_tps_bilstm_attn	下载链接
RARE	MobileNetV3	83.32%	rec_mv3_tps_bilstm_attn	下载链接

使用LSVT街景数据集根据真值将图crop出来30w数据，进行位置校准。此外基于LSVT语料生成500w合成数据训练中文模型，相关配置和预训练文件如下：

模型	骨干网络	配置文件	预训练模型
超轻量中文模型	MobileNetV3	rec_chinese_lite_train.yml	下载链接
通用中文OCR模型	Resnet34_vd	rec_chinese_common_train.yml	下载链接

PaddleOCR文本识别算法的训练和使用请参考文档教程中文本识别模型训练/评估/预测。

端到端OCR算法

End2End-PSL(百度自研, comming soon)

超轻量级中文OCR效果展示

通用中文OCR效果展示

FAQ

转换attention识别模型时报错：KeyError: 'predict'
问题已解，请更新到最新代码。
关于推理速度
图片中的文字较多时，预测时间会增，可以使用--rec_batch_num设置更小预测batch num，默认值为30，可以改为10或其他数值。
服务部署与移动端部署
预计6月中下旬会先后发布基于Serving的服务部署方案和基于Paddle Lite的移动端部署方案，欢迎持续关注。
自研算法发布时间
自研算法SAST、SRN、End2End-PSL都将在6-7月陆续发布，敬请期待。

more

欢迎加入PaddleOCR技术交流群

扫描二维码或者加微信：paddlehelp，备注OCR，小助手拉你进群～

参考文献

1. EAST:
@inproceedings{zhou2017east,
  title={EAST: an efficient and accurate scene text detector},
  author={Zhou, Xinyu and Yao, Cong and Wen, He and Wang, Yuzhi and Zhou, Shuchang and He, Weiran and Liang, Jiajun},
  booktitle={Proceedings of the IEEE conference on Computer Vision and Pattern Recognition},
  pages={5551--5560},
  year={2017}
}

2. DB:
@article{liao2019real,
  title={Real-time Scene Text Detection with Differentiable Binarization},
  author={Liao, Minghui and Wan, Zhaoyi and Yao, Cong and Chen, Kai and Bai, Xiang},
  journal={arXiv preprint arXiv:1911.08947},
  year={2019}
}

3. DTRB:
@inproceedings{baek2019wrong,
  title={What is wrong with scene text recognition model comparisons? dataset and model analysis},
  author={Baek, Jeonghun and Kim, Geewook and Lee, Junyeop and Park, Sungrae and Han, Dongyoon and Yun, Sangdoo and Oh, Seong Joon and Lee, Hwalsuk},
  booktitle={Proceedings of the IEEE International Conference on Computer Vision},
  pages={4715--4723},
  year={2019}
}

4. SAST:
@inproceedings{wang2019single,
  title={A Single-Shot Arbitrarily-Shaped Text Detector based on Context Attended Multi-Task Learning},
  author={Wang, Pengfei and Zhang, Chengquan and Qi, Fei and Huang, Zuming and En, Mengyi and Han, Junyu and Liu, Jingtuo and Ding, Errui and Shi, Guangming},
  booktitle={Proceedings of the 27th ACM International Conference on Multimedia},
  pages={1277--1285},
  year={2019}
}

5. SRN:
@article{yu2020towards,
  title={Towards Accurate Scene Text Recognition with Semantic Reasoning Networks},
  author={Yu, Deli and Li, Xuan and Zhang, Chengquan and Han, Junyu and Liu, Jingtuo and Ding, Errui},
  journal={arXiv preprint arXiv:2003.12294},
  year={2020}
}

6. end2end-psl:
@inproceedings{sun2019chinese,
  title={Chinese Street View Text: Large-scale Chinese Text Reading with Partially Supervised Learning},
  author={Sun, Yipeng and Liu, Jiaming and Liu, Wei and Han, Junyu and Ding, Errui and Liu, Jingtuo},
  booktitle={Proceedings of the IEEE International Conference on Computer Vision},
  pages={9086--9095},
  year={2019}
}

许可证书

本项目的发布受Apache 2.0 license许可认证。

贡献代码

我们非常欢迎你为PaddleOCR贡献代码，也十分感谢你的反馈。

非常感谢 Khanh Tran 贡献了英文文档。
非常感谢 zhangxin(Blog) 贡献新的可视化方式、添加.gitgnore、处理手动设置PYTHONPATH环境变量的问题

Name		Name	Last commit message	Last commit date
Latest commit History 420 Commits
configs		configs
doc		doc
ppocr		ppocr
tools		tools
.gitignore		.gitignore
.pre-commit-config.yaml		.pre-commit-config.yaml
.style.yapf		.style.yapf
LICENSE		LICENSE
README.md		README.md
README_en.md		README_en.md
requirments.txt		requirments.txt

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

简介

特性

支持的中文模型列表:

超轻量级中文OCR以及通用中文OCR体验

1.环境配置

2.inference模型下载

(1)超轻量级中文OCR模型下载

(2)通用中文OCR模型下载

3.单张图像或者图像集合预测

文档教程

文本检测算法

文本识别算法

端到端OCR算法

超轻量级中文OCR效果展示

通用中文OCR效果展示

FAQ

欢迎加入PaddleOCR技术交流群

参考文献

许可证书

贡献代码

About

Releases

Packages

Languages

License

funykatebird/PaddleOCR

Folders and files

Latest commit

History

Repository files navigation

简介

特性

支持的中文模型列表:

超轻量级中文OCR以及通用中文OCR体验

1.环境配置

2.inference模型下载

(1)超轻量级中文OCR模型下载

(2)通用中文OCR模型下载

3.单张图像或者图像集合预测

文档教程

文本检测算法

文本识别算法

端到端OCR算法

超轻量级中文OCR效果展示

通用中文OCR效果展示

FAQ

欢迎加入PaddleOCR技术交流群

参考文献

许可证书

贡献代码

About

Resources

License

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages