Python 3 implementations of PSENet [1], PAN [2], and PAN++ [3] have been released at https://github.com/whai362/pan_pp.pytorch.
[1] W. Wang, E. Xie, X. Li, W. Hou, T. Lu, G. Yu, and S. Shao. Shape robust text detection with progressive scale expansion network. In Proc. IEEE Conf. Comp. Vis. Patt. Recogn., pages 9336–9345, 2019.
[2] W. Wang, E. Xie, X. Song, Y. Zang, W. Wang, T. Lu, G. Yu, and C. Shen. Efficient and accurate arbitrary-shaped text detection with pixel aggregation network. In Proc. IEEE Int. Conf. Comp. Vis., pages 8440–8449, 2019.
[3] Paper is in preparation.
- Python 2.7
- PyTorch v0.4.1+
- pyclipper
- Polygon2
- OpenCV 3.4 (for the C++ version of PSE)
- opencv-python 3.4
Progressive Scale Expansion Network (PSENet) is a text detector that can effectively detect arbitrary-shaped text in natural scenes.
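The idea behind progressive scale expansion, as described in [1], can be illustrated with a small pure-Python sketch. This is not the code shipped in this repository: the function names, the list-of-lists grid representation, and the use of 4-connectivity are assumptions made for illustration. The sketch labels connected components in the smallest kernel, then grows each label breadth-first into every larger kernel, with contested pixels going to whichever instance reaches them first.

```python
from collections import deque

NEIGHBORS = ((-1, 0), (1, 0), (0, -1), (0, 1))  # 4-connectivity (an assumption)


def label_components(mask):
    """Label 4-connected foreground regions of a binary grid with 1, 2, ..."""
    h, w = len(mask), len(mask[0])
    labels = [[0] * w for _ in range(h)]
    next_label = 0
    for sy in range(h):
        for sx in range(w):
            if not mask[sy][sx] or labels[sy][sx]:
                continue
            next_label += 1
            labels[sy][sx] = next_label
            queue = deque([(sy, sx)])
            while queue:
                y, x = queue.popleft()
                for dy, dx in NEIGHBORS:
                    ny, nx = y + dy, x + dx
                    if (0 <= ny < h and 0 <= nx < w
                            and mask[ny][nx] and not labels[ny][nx]):
                        labels[ny][nx] = next_label
                        queue.append((ny, nx))
    return labels


def progressive_scale_expansion(kernels):
    """Expand instance labels from the smallest kernel to the largest.

    kernels: list of binary grids of equal size, ordered smallest to largest.
    """
    labels = label_components(kernels[0])
    h, w = len(labels), len(labels[0])
    for kernel in kernels[1:]:
        # Start the breadth-first search from every pixel that already has a label.
        queue = deque((y, x) for y in range(h) for x in range(w) if labels[y][x])
        while queue:
            y, x = queue.popleft()
            for dy, dx in NEIGHBORS:
                ny, nx = y + dy, x + dx
                if (0 <= ny < h and 0 <= nx < w
                        and kernel[ny][nx] and not labels[ny][nx]):
                    labels[ny][nx] = labels[y][x]  # first come, first served
                    queue.append((ny, nx))
    return labels


# Toy example: two instances that are separate in the small kernel
# but merge into a single blob in the large kernel.
small = [[0, 1, 0, 0, 0, 1, 0]]
large = [[1, 1, 1, 1, 1, 1, 1]]
result = progressive_scale_expansion([small, large])
print(result)  # [[1, 1, 1, 1, 2, 2, 2]]
```

In the toy example the two instances stay separated even though the largest kernel is a single connected region, which is the property that lets this kind of expansion split closely adjacent text lines.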
CUDA_VISIBLE_DEVICES=0,1,2,3 python train_ic15.py
CUDA_VISIBLE_DEVICES=0 python test_ic15.py --scale 1 --resume [path of model]
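If the test command needs to be launched from another Python script, something along these lines may work. It is only a sketch: `build_test_command` is a hypothetical helper, and the checkpoint path is a placeholder, not a file provided by this repository. The script name and the `--scale` and `--resume` flags are taken from the command above.

```python
import os


def build_test_command(checkpoint_path, scale=1, gpu_ids="0"):
    """Return (env, argv) mirroring the shell command for test_ic15.py."""
    env = dict(os.environ)
    # Restrict the visible GPUs, as the CUDA_VISIBLE_DEVICES prefix does in the shell.
    env["CUDA_VISIBLE_DEVICES"] = gpu_ids
    argv = ["python", "test_ic15.py",
            "--scale", str(scale),
            "--resume", checkpoint_path]
    return env, argv


# Placeholder checkpoint path (hypothetical).
env, argv = build_test_command("checkpoints/example.pth.tar")
print(" ".join(argv))
# To actually run it from the repository root:
#   import subprocess
#   subprocess.call(argv, env=env)
```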
cd eval
sh eval_ic15.sh
sh eval_ctw1500.sh
Method | Extra Data | Precision (%) | Recall (%) | F-measure (%) | FPS (1080Ti) | Model |
---|---|---|---|---|---|---|
PSENet-1s (ResNet50) | - | 81.49 | 79.68 | 80.57 | 1.6 | baiduyun(extract code: rxti); OneDrive |
PSENet-1s (ResNet50) | pretrain on IC17 MLT | 86.92 | 84.5 | 85.69 | 1.6 | baiduyun(extract code: aieo); OneDrive |
PSENet-4s (ResNet50) | pretrain on IC17 MLT | 86.1 | 83.77 | 84.92 | 3.8 | baiduyun(extract code: aieo); OneDrive |
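As a quick sanity check on how the three metric columns relate, the F-measure is the harmonic mean of precision and recall, F = 2PR / (P + R). The short sketch below (the function name `f_measure` is chosen here for illustration) recomputes it for the three rows above and compares against the reported values.

```python
def f_measure(precision, recall):
    """Harmonic mean of precision and recall (both in percent)."""
    if precision + recall == 0:
        return 0.0
    return 2.0 * precision * recall / (precision + recall)


# (precision, recall, reported F-measure) copied from the table above.
rows = [
    (81.49, 79.68, 80.57),
    (86.92, 84.5, 85.69),
    (86.1, 83.77, 84.92),
]

for p, r, reported in rows:
    print("computed {:.2f}, reported {:.2f}".format(f_measure(p, r), reported))
```

The recomputed values agree with the reported ones to within rounding.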
Method | Extra Data | Precision (%) | Recall (%) | F-measure (%) | FPS (1080Ti) | Model |
---|---|---|---|---|---|---|
PSENet-1s (ResNet50) | - | 80.57 | 75.55 | 78.0 | 3.9 | baiduyun(extract code: ksv7); OneDrive |
PSENet-1s (ResNet50) | pretrain on IC17 MLT | 84.84 | 79.73 | 82.2 | 3.9 | baiduyun(extract code: z7ac); OneDrive |
PSENet-4s (ResNet50) | pretrain on IC17 MLT | 82.09 | 77.84 | 79.9 | 8.4 | baiduyun(extract code: z7ac); OneDrive |
ICDAR 2015 (training with ICDAR 2017 MLT)
Method | Precision (%) | Recall (%) | F-measure (%) |
---|---|---|---|
PSENet-4s (ResNet152) | 87.98 | 83.87 | 85.88 |
PSENet-2s (ResNet152) | 89.30 | 85.22 | 87.21 |
PSENet-1s (ResNet152) | 88.71 | 85.51 | 87.08 |
Method | Precision (%) | Recall (%) | F-measure (%) |
---|---|---|---|
PSENet-4s (ResNet152) | 75.98 | 67.56 | 71.52 |
PSENet-2s (ResNet152) | 76.97 | 68.35 | 72.40 |
PSENet-1s (ResNet152) | 77.01 | 68.40 | 72.45 |
Method | Precision (%) | Recall (%) | F-measure (%) |
---|---|---|---|
PSENet-4s (ResNet152) | 80.49 | 78.13 | 79.29 |
PSENet-2s (ResNet152) | 81.95 | 79.30 | 80.60 |
PSENet-1s (ResNet152) | 82.50 | 79.89 | 81.17 |
Method | Precision (%) | Recall (%) | F-measure (%) |
---|---|---|---|
PSENet-1s (ResNet152) | 78.5 | 72.1 | 75.2 |
Figure 3: The results on ICDAR 2015, ICDAR 2017 MLT and SCUT-CTW1500
[new version paper] https://arxiv.org/abs/1903.12473
[old version paper] https://arxiv.org/abs/1806.02559
[TensorFlow version (thanks @liuheng92)] https://github.com/liuheng92/tensorflow_PSENet
@inproceedings{wang2019shape,
title={Shape Robust Text Detection With Progressive Scale Expansion Network},
author={Wang, Wenhai and Xie, Enze and Li, Xiang and Hou, Wenbo and Lu, Tong and Yu, Gang and Shao, Shuai},
booktitle={Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition},
pages={9336--9345},
year={2019}
}