- Python3
- pyclipper
- Polygon2
- OpenCV
- TensorFlow 2.0+
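Before training, it can help to confirm that the packages above are importable. The sketch below is not part of this repository; the import names (`Polygon` for Polygon2, `cv2` for OpenCV, `tensorflow` for TensorFlow) are the usual ones but are assumptions about your environment, and the helper name `missing_packages` is hypothetical. It does not check the TensorFlow version.

```python
import importlib.util

# Display name -> assumed top-level import name (adjust if your install differs).
REQUIRED = {
    "pyclipper": "pyclipper",
    "Polygon2": "Polygon",
    "OpenCV": "cv2",
    "TensorFlow": "tensorflow",
}


def missing_packages(required=REQUIRED):
    """Return the display names whose modules cannot be found."""
    return [
        name
        for name, module in required.items()
        if importlib.util.find_spec(module) is None
    ]


if __name__ == "__main__":
    missing = missing_packages()
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All listed packages are importable.")
```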
PSENet-tf2.0: Progressive Scale Expansion Network (PSENet) is a text detector that handles arbitrarily shaped text in natural scenes well. Using this text segmentation model, we finished in the top 6 of the MTWI 2018 Text Detection Challenge.
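To illustrate what "progressive scale expansion" means, here is a simplified pure-Python sketch of the idea described in the paper: text instances are seeded from the smallest predicted kernel and grown breadth-first through successively larger kernels, with contested pixels going to whichever instance reaches them first. This is not the code used in this repository; the function names and the list-of-lists mask format are assumptions made for illustration.

```python
from collections import deque

NEIGHBOURS = ((1, 0), (-1, 0), (0, 1), (0, -1))  # 4-connectivity


def label_components(mask):
    """Label 4-connected regions of a binary grid; labels start at 1."""
    h, w = len(mask), len(mask[0])
    labels = [[0] * w for _ in range(h)]
    next_label = 0
    for y in range(h):
        for x in range(w):
            if mask[y][x] and labels[y][x] == 0:
                next_label += 1
                labels[y][x] = next_label
                queue = deque([(y, x)])
                while queue:
                    cy, cx = queue.popleft()
                    for dy, dx in NEIGHBOURS:
                        ny, nx = cy + dy, cx + dx
                        if (0 <= ny < h and 0 <= nx < w
                                and mask[ny][nx] and labels[ny][nx] == 0):
                            labels[ny][nx] = next_label
                            queue.append((ny, nx))
    return labels


def progressive_scale_expansion(kernels):
    """Expand instance labels through kernels ordered smallest to largest."""
    labels = label_components(kernels[0])
    h, w = len(labels), len(labels[0])
    for kernel in kernels[1:]:
        # Start the breadth-first search from every pixel labelled so far.
        queue = deque((y, x) for y in range(h) for x in range(w) if labels[y][x])
        while queue:
            cy, cx = queue.popleft()
            for dy, dx in NEIGHBOURS:
                ny, nx = cy + dy, cx + dx
                # Unlabelled pixels inside the larger kernel take the label
                # of the instance that reaches them first.
                if (0 <= ny < h and 0 <= nx < w
                        and kernel[ny][nx] and labels[ny][nx] == 0):
                    labels[ny][nx] = labels[cy][cx]
                    queue.append((ny, nx))
    return labels
```

For example, with a smallest kernel `[[0, 1, 0, 0, 0, 1, 0]]` and a largest kernel of all ones, the two seeds grow toward each other and split the row between them instead of merging into one instance.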
CUDA_VISIBLE_DEVICES=0 python train_ic15.py
CUDA_VISIBLE_DEVICES=0 python test_ic15.py --scale 1 --resume [path of model]
CUDA_VISIBLE_DEVICES=0 python train_id41k.py
CUDA_VISIBLE_DEVICES=0 python test_id41k.py --scale 1 --resume [path of model]
cd eval
sh eval_ic15.sh
sh eval_ctw1500.sh
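If you prefer to launch the scripts from Python rather than the shell, the following sketch builds the same environment and argument list. `CUDA_VISIBLE_DEVICES` restricts which GPUs the child process can see; the script names and flags are taken from the commands above, while the helper `build_run` is a hypothetical name introduced here.

```python
import os


def build_run(script, gpu_id=0, extra_args=()):
    """Return (env, argv) for running `script` on a single visible GPU."""
    # Copy the current environment and expose only the chosen GPU.
    env = dict(os.environ, CUDA_VISIBLE_DEVICES=str(gpu_id))
    argv = ["python", script, *extra_args]
    return env, argv
```

The result can be passed to `subprocess.run(argv, env=env, check=True)`, for example with `build_run("train_ic15.py")`.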
Method | Extra Data | Precision (%) | Recall (%) | F-measure (%) | FPS (1080Ti) | Model |
---|---|---|---|---|---|---|
PSENet-1s (ResNet50) | - | 81.49 | 79.68 | 80.57 | 1.6 | baiduyun(extract code: rxti); OneDrive |
PSENet-1s (ResNet50) | pretrain on IC17 MLT | 86.92 | 84.5 | 85.69 | 1.6 | baiduyun(extract code: aieo); OneDrive |
PSENet-4s (ResNet50) | pretrain on IC17 MLT | 86.1 | 83.77 | 84.92 | 3.8 | baiduyun(extract code: aieo); OneDrive |
Method | Extra Data | Precision (%) | Recall (%) | F-measure (%) | FPS (1080Ti) | Model |
---|---|---|---|---|---|---|
PSENet-1s (ResNet50) | - | 80.57 | 75.55 | 78.0 | 3.9 | baiduyun(extract code: ksv7); OneDrive |
PSENet-1s (ResNet50) | pretrain on IC17 MLT | 84.84 | 79.73 | 82.2 | 3.9 | baiduyun(extract code: z7ac); OneDrive |
PSENet-4s (ResNet50) | pretrain on IC17 MLT | 82.09 | 77.84 | 79.9 | 8.4 | baiduyun(extract code: z7ac); OneDrive |
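The F-measure column in these tables is consistent with the harmonic mean of precision and recall. As a quick sanity check when comparing your own evaluation output, a minimal sketch (the function name `f_measure` is an assumption, not part of the evaluation scripts):

```python
def f_measure(precision, recall):
    """Harmonic mean of precision and recall (same units as the inputs)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Values from the first table row above: precision 81.49, recall 79.68.
print(round(f_measure(81.49, 79.68), 2))
```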
ICDAR 2015 (training with ICDAR 2017 MLT)
Method | Precision (%) | Recall (%) | F-measure (%) |
---|---|---|---|
PSENet-4s (ResNet152) | 87.98 | 83.87 | 85.88 |
PSENet-2s (ResNet152) | 89.30 | 85.22 | 87.21 |
PSENet-1s (ResNet152) | 88.71 | 85.51 | 87.08 |
Method | Precision (%) | Recall (%) | F-measure (%) |
---|---|---|---|
PSENet-4s (ResNet152) | 75.98 | 67.56 | 71.52 |
PSENet-2s (ResNet152) | 76.97 | 68.35 | 72.40 |
PSENet-1s (ResNet152) | 77.01 | 68.40 | 72.45 |
Method | Precision (%) | Recall (%) | F-measure (%) |
---|---|---|---|
PSENet-4s (ResNet152) | 80.49 | 78.13 | 79.29 |
PSENet-2s (ResNet152) | 81.95 | 79.30 | 80.60 |
PSENet-1s (ResNet152) | 82.50 | 79.89 | 81.17 |
Method | Precision (%) | Recall (%) | F-measure (%) |
---|---|---|---|
PSENet-1s (ResNet152) | 82.8 | 70.0 | 76 |
Figure 3: The results on ICDAR 2015, ICDAR 2017 MLT and SCUT-CTW1500
[new version paper] https://arxiv.org/abs/1903.12473
[old version paper] https://arxiv.org/abs/1806.02559
[pytorch version (thanks @WenmuZhou)] https://github.com/WenmuZhou/PSENet.pytorch
[tensorflow1.x version (thanks @liuheng92)] https://github.com/liuheng92/tensorflow_PSENet
laizhihui @ lzh
@inproceedings{wang2019shape,
title={Shape Robust Text Detection With Progressive Scale Expansion Network},
author={Wang, Wenhai and Xie, Enze and Li, Xiang and Hou, Wenbo and Lu, Tong and Yu, Gang and Shao, Shuai},
booktitle={Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition},
pages={9336--9345},
year={2019}
}