This is a PyTorch-based implementation of the paper ContourNet (CVPR 2020). ContourNet is a contour-based text detector that represents a text region with a set of contour points. This repository is built on the PyTorch maskrcnn.
- Release code
- Document for Installation
- Trained models
- Document for testing and training
- Evaluation
- Experiment on more datasets
- Re-organize and clean the parameters
2020/5/6 We uploaded the models to Drive.
2020/6/11 We updated the experiments for CTW-1500 and added further details on some training settings.
2020/12/1 We finished 8th in the Xunfei Competition with this repository using only single-model testing.
2021/7/1 We won 3rd place in the [2021 ICDAR Competition](https://icdar2021.poli.br/) using a variant of ContourNet.
We recommend using Anaconda (BaiduYun, password: 1y3v, or Drive) to manage your libraries.
conda create --name ContourNet
conda activate ContourNet
conda install ipython
pip install ninja yacs cython matplotlib tqdm scipy shapely networkx pandas
conda install pytorch=1.0 torchvision=0.2 cudatoolkit=9.0 -c pytorch
conda install -c menpo opencv
export INSTALL_DIR=$PWD
cd $INSTALL_DIR
git clone https://github.com/cocodataset/cocoapi.git
cd cocoapi/PythonAPI
python setup.py build_ext install
cd $INSTALL_DIR
git clone https://github.com/wangyuxin87/ContourNet.git
cd ContourNet
python setup.py build develop
We use only official training images to train our model.
Dataset | Model | recall | precision | F-measure |
---|---|---|---|---|
ic15 | Paper | 86.1 | 87.6 | 86.9 |
ic15 | This implementation | 84.0 | 90.1 | 87.0 |
CTW-1500 | Paper | 84.1 | 83.7 | 83.9 |
CTW-1500 | This implementation | 84.0 | 85.7 | 84.8 |
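As a quick sanity check on the table, the sketch below recomputes the F-measure as the harmonic mean of recall and precision. It assumes the table uses this standard definition, which the text above does not state explicitly. Because the inputs are already rounded to one decimal, a recomputed value can differ from the reported one in the last digit.

```python
def f_measure(recall: float, precision: float) -> float:
    """Harmonic mean of recall and precision (both given in percent)."""
    if recall + precision == 0:
        return 0.0
    return 2 * recall * precision / (recall + precision)


# (recall, precision, reported F-measure) copied from the table above.
rows = {
    "ic15 / Paper": (86.1, 87.6, 86.9),
    "ic15 / This implementation": (84.0, 90.1, 87.0),
    "CTW-1500 / Paper": (84.1, 83.7, 83.9),
    "CTW-1500 / This implementation": (84.0, 85.7, 84.8),
}

for name, (recall, precision, reported) in rows.items():
    computed = f_measure(recall, precision)
    print(f"{name}: computed {computed:.2f}, reported {reported}")
```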
Prepare the data following the COCO format, or download our IC15 dataset from BAIDU (password: ect5) or Google Drive, and unzip it in datasets/.
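If you prepare your own data, the following is a minimal sketch of what a COCO-style annotation for one text polygon could look like. The field names follow the general COCO detection format; the helper `polygon_to_coco_annotation`, the file name `img_1.jpg`, and the category name `text` are illustrative assumptions, and the exact fields this repository's loader reads may differ.

```python
import json


def polygon_to_coco_annotation(ann_id, image_id, polygon, category_id=1):
    """Build one COCO-style annotation dict from a polygon [(x, y), ...]."""
    xs = [x for x, _ in polygon]
    ys = [y for _, y in polygon]
    x_min, y_min = min(xs), min(ys)
    width, height = max(xs) - x_min, max(ys) - y_min

    # Polygon area via the shoelace formula.
    area = 0.0
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        area += x1 * y2 - x2 * y1
    area = abs(area) / 2.0

    return {
        "id": ann_id,
        "image_id": image_id,
        "category_id": category_id,
        # COCO stores a polygon as a flat list [x1, y1, x2, y2, ...].
        "segmentation": [[coord for point in polygon for coord in point]],
        # COCO boxes are [x, y, width, height].
        "bbox": [x_min, y_min, width, height],
        "area": area,
        "iscrowd": 0,
    }


# Hypothetical single-image dataset with one rectangular text region.
dataset = {
    "images": [{"id": 1, "file_name": "img_1.jpg", "width": 1280, "height": 720}],
    "categories": [{"id": 1, "name": "text"}],
    "annotations": [
        polygon_to_coco_annotation(
            1, 1, [(100, 50), (300, 50), (300, 120), (100, 120)]
        )
    ],
}

print(json.dumps(dataset["annotations"][0]))
```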
You need to modify maskrcnn_benchmark/config/paths_catalog.py to point to the location where your dataset is stored.
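The sketch below illustrates the kind of entry such a catalog typically holds: a dataset name mapped to an image directory and an annotation file, both relative to a data root. It is a simplified stand-in, not the contents of paths_catalog.py; the entry name `ic15_train` and both paths are hypothetical, so check the actual file for the names it uses.

```python
import os


class DatasetCatalog:
    """Simplified stand-in for a dataset path catalog (illustrative only)."""

    DATA_DIR = "datasets"
    DATASETS = {
        # Hypothetical entry: replace the name and paths with your own.
        "ic15_train": {
            "img_dir": "ic15/train_images",
            "ann_file": "ic15/annotations/train.json",
        },
    }

    @staticmethod
    def get(name):
        """Resolve the image root and annotation file for a dataset name."""
        entry = DatasetCatalog.DATASETS[name]
        return {
            "root": os.path.join(DatasetCatalog.DATA_DIR, entry["img_dir"]),
            "ann_file": os.path.join(DatasetCatalog.DATA_DIR, entry["ann_file"]),
        }


paths = DatasetCatalog.get("ic15_train")
print(paths["root"])
print(paths["ann_file"])
```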
Download the ResNet50 model from BAIDU (password: edt8) or Drive and put it in ContourNet/.
Put the folder in output/.
Set the resolution to 1200x2000 in maskrcnn_benchmark/data/transforms/transforms.py (lines 50 to 52). You can skip this step when you train your own model, which seems to obtain better results. Then run
bash test_contour.sh
Put bo.json in ic15_evaluate/, then run
cd ic15_evaluate
conda deactivate
pip2 install polygon2
conda install zip
python2 eval_ic15
As mentioned in our paper, we use only official training images to train our model; data augmentation includes random crop, rotation, etc. There are two strategies for initializing the backbone parameters:
1) Use the ResNet50 model (ImageNet) from BAIDU (password: edt8) or Drive. It is provided by Yuliang and is ONLY an ImageNet model with a few iterations on ic15 training data for a stable initialization.
2) Use a model pre-trained only on ImageNet (modify WEIGHT to catalog://ImageNetPretrained/MSRA/R-50 in config/ic15/r50_baseline.yaml).
In this repository, we use the first strategy to train the model on this dataset.
Run
bash train_contour.sh
Change ROTATE_PROB_TRAIN to 0.3 and ROTATE_DEGREE to 10 in config/ic15/r50_baseline.yaml (the corresponding modification also needs to be made in maskrcnn_benchmark/data/transforms/transforms.py, lines 312 to 317), then fine-tune the model for 10,500 more steps (the learning rate starts at 2.5e-4 and is multiplied by 0.1 at steps 5k and 10k).
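The fine-tuning schedule described above can be sketched as a plain step-decay function. This is a minimal illustration, not the repository's scheduler: the function name `lr_at_step` is hypothetical, any warmup phase is ignored, and it assumes the decay takes effect exactly at the milestone step.

```python
from bisect import bisect_right


def lr_at_step(step, base_lr=2.5e-4, milestones=(5000, 10000), gamma=0.1):
    """Step-decay learning rate: multiply by gamma at each milestone passed."""
    # bisect_right counts how many milestones are <= step.
    return base_lr * gamma ** bisect_right(milestones, step)


# Print the learning rate around each decay point of the 10,500-step run.
for step in (0, 4999, 5000, 9999, 10000, 10499):
    print(step, lr_at_step(step))
```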
Prepare the data following the COCO format, or download our CTW dataset from Baidu (jeec) or Drive, and unzip it in output/.
You need to modify maskrcnn_benchmark/config/paths_catalog.py to point to the location where your dataset is stored.
Put the folder in output/.
Then run
bash test_contour.sh
Run
cd ctw_eval
python eval_ctw1500.py
Run
bash train_contour.sh
- We use a different reconstruction algorithm to rebuild text regions from contour points for curved text. You can reproduce the approach used in the paper by modifying the hyper-parameter in the Alpha-Shape algorithm (some tricks should also be added). Furthermore, a more robust reconstruction algorithm may obtain better results.
- The detection results are not accurate when a proposal contains more than one text instance, because strong responses are obtained in the contour regions of both texts.
- Some morphological algorithms can make the contour line smoother.
- More tricks like deformable_conv, deformable_pooling in the box_head, etc. can further improve the detection results.
If you find our method useful for your research, please cite
@inproceedings{wang2020contournet,
title={ContourNet: Taking a Further Step toward Accurate Arbitrary-shaped Scene Text Detection},
author={Wang, Yuxin and Xie, Hongtao and Zha, Zheng-Jun and Xing, Mengting and Fu, Zilong and Zhang, Yongdong},
booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
pages={11753--11762},
year={2020}
}
Suggestions and discussions are very welcome. Please contact the authors by email at [email protected]