I recently wanted to build a Chinese OCR system based on Google's Attention OCR and found that, to train your own model, you need to generate tfrecord files in FSNS format yourself. The official Google documentation does not explain this part in detail and only points to a Stack Overflow link, which is itself unclear and contains some mistakes. So I wrote code to generate FSNS-format tfrecords from JPG or PNG images. The FSNS format is described in this paper
There are three steps:
The first step is to create a dictionary that covers the characters in your own text labels, such as dic.txt.
The second step is to place the pictures and label text you want to convert under /data.
The third step is to generate your tfrecord by running either
python generate_tfrecord_JPG.py
or
python generate_tfrecord_PNG.py
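
To make the first step more concrete, here is a minimal sketch of how a character dictionary could be built and how a label could be turned into the padded and unpadded id lists that FSNS-style records store alongside the image. Everything named here is an assumption for illustration: the functions build_charset, write_charset and encode_label, the "<nul>" padding symbol at id 0, and the "id<TAB>character" line layout. The scripts in this repository may expect a different dic.txt layout, so check them before reusing this.

```python
from typing import Dict, Iterable, List, Tuple

# Hypothetical padding symbol; assumed to be stored at id 0.
NULL_TOKEN = "<nul>"


def build_charset(labels: Iterable[str]) -> Dict[str, int]:
    """Give every distinct character an integer id, reserving 0 for padding."""
    chars = sorted({ch for text in labels for ch in text})
    charset = {NULL_TOKEN: 0}
    for idx, ch in enumerate(chars, start=1):
        charset[ch] = idx
    return charset


def write_charset(charset: Dict[str, int], path: str) -> None:
    """Write one 'id<TAB>character' line per entry (assumed layout)."""
    with open(path, "w", encoding="utf-8") as f:
        for ch, idx in sorted(charset.items(), key=lambda kv: kv[1]):
            f.write(f"{idx}\t{ch}\n")


def encode_label(
    text: str, charset: Dict[str, int], max_len: int
) -> Tuple[List[int], List[int]]:
    """Return (padded ids, unpadded ids) for one text label."""
    unpadded = [charset[ch] for ch in text]  # KeyError if a character is missing
    if len(unpadded) > max_len:
        raise ValueError(f"label longer than max_len={max_len}: {text!r}")
    padded = unpadded + [charset[NULL_TOKEN]] * (max_len - len(unpadded))
    return padded, unpadded
```

For example, calling build_charset(["你好", "好的"]) and then encode_label("你好", charset, max_len=5) gives a padded list of length 5 and an unpadded list of length 2. The fixed-length list is needed because every record must carry the same number of label ids, while the unpadded list keeps the real text length.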