This software implements transductive learning for unsupervised handwritten character recognition (HCR). It includes:
- A projection-based unsupervised line segmentation algorithm
- Synthetic text generation and data augmentation for HCR training
- CRNN implementation for HCR
- Implementations of three transductive learning methods for HCR: CycleGAN, DANN, and VAT
The repository also includes a new test set containing 167 transcribed images from the *bKa’ gdams gsung ’bum* collection.
The software was tested on this collection and shows promising results.
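For intuition, projection-based line segmentation sums the ink in each image row and cuts at empty valleys between text bands. Below is a minimal sketch on a toy binary image (pure Python; the function name and threshold are illustrative, not the repository's API):

```python
def segment_lines(binary_image, threshold=0):
    """Split a binary image (list of rows of 0/1 ink pixels) into
    line bands using a horizontal projection profile."""
    # Horizontal projection: total ink per row.
    profile = [sum(row) for row in binary_image]
    lines, start = [], None
    for y, ink in enumerate(profile):
        if ink > threshold and start is None:
            start = y                      # entering a text band
        elif ink <= threshold and start is not None:
            lines.append((start, y))       # leaving a text band
            start = None
    if start is not None:
        lines.append((start, len(profile)))
    return lines

# Toy image: two "text lines" separated by a blank row.
img = [
    [0, 1, 1, 0],
    [1, 1, 1, 1],
    [0, 0, 0, 0],
    [1, 0, 1, 1],
]
print(segment_lines(img))  # [(0, 2), (3, 4)]
```

Real manuscript pages need binarization and smoothing of the profile first, but the cut-at-valleys idea is the same.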
The software has only been tested on Ubuntu 16.04 (x64). A CUDA-enabled GPU is required. Tested with CUDA 8.0 and cuDNN 7.0.5.
- Install Tibetan fonts:
mkdir ~/.fonts
cp extra/Fonts/* ~/.fonts
sudo fc-cache -fv
- Change language settings to a UTF-8 locale so text files are read correctly:
sudo update-locale LANG=en_US.UTF-8
- Install prerequisites:
sudo apt-get install cairo-dock
sudo apt-get install pango1.0-tests
sudo apt-get install gtk2.0
sudo add-apt-repository ppa:glasen/freetype2
sudo apt update && sudo apt install freetype2-demos
- Compile the C++ text rendering program:
cd extra/TextRender/c_code
make
cd ../bin
chmod u+x main
- Check that both installation and compilation worked correctly:
- To test the font installation, run:
bin/main_show_fonts | grep Shangshung
bin/main_show_fonts | grep Qomolangma
You should see the following font families:
- Shangshung Sgoba-KhraChung
- Shangshung Sgoba-KhraChen
- Qomolangma-Betsu
- To test the compilation, run:
bash test.sh
cd <base project dir>
conda create -n tibetan_hcr python=3.6
source activate tibetan_hcr
pip install -r requirements.txt
cd Src/utils/warp-ctc
mkdir build; cd build
cmake ..
make
cd ../pytorch_binding
python setup.py install
cd ../..
export CFLAGS='-Wall -Wextra -std=c99 -I/usr/local/cuda-8.0/include'
git clone --recursive https://github.com/parlance/ctcdecode.git
cd ctcdecode
pip install .
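warp-ctc provides the CTC loss used for training, and ctcdecode provides beam-search decoding at test time. For intuition, the simplest (greedy) CTC decoding just collapses repeated per-frame labels and drops blanks; a minimal sketch (label ids and the blank index are illustrative):

```python
def ctc_greedy_decode(frame_labels, blank=0):
    """Collapse a per-frame argmax sequence into a label sequence:
    merge consecutive repeats, then remove blank tokens."""
    out, prev = [], None
    for label in frame_labels:
        if label != prev and label != blank:
            out.append(label)
        prev = label
    return out

# Frames: blank, 2, 2, blank, 1, 1 -> decoded labels [2, 1]
print(ctc_greedy_decode([0, 2, 2, 0, 1, 1]))  # [2, 1]
```

Beam search keeps several candidate prefixes per frame instead of only the argmax, which is what ctcdecode adds.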
The following image should be created in Src/data/extra/TextRender/test:
There are three parts to data preparation:
- Using unsupervised line segmentation to split the test images into lines
- Rendering synthetic multi-line images and separating them into lines with the same segmentation algorithm
- Creating a character lexicon from both the training and testing datasets

We provide two ways to obtain the training and testing data:
- Downloading prepared data
- Instructions and code to prepare the data yourself
- Get prepared synthetic data:
- Download Prepared Data (google drive link) to Data/Synthetic
- untar file:
tar -xzvf synth_prepared.tar.gz
- Get prepared test data-set:
- Download Prepared Data (google drive link) to Data/Test
- untar file:
tar -xzvf test_prepared.tar.gz
- Get prepared character lexicon:
- Download character lexicon file (google drive link) to Data
- Segment test images to lines & create dataset file:
- Download test images (google drive link) to Data/Test
- Untar file:
tar -xzvf test_original.tar.gz
- Segment images into lines and create a dataset file containing (line image, text) tuples:
cd Src/data_preperation
python 1_prepare_orig_images.py
- Create synthetic images and dataset file:
- Download texts to synthesize train images (google drive link) to Data/Synthetic
- Create synthetic images and a dataset file containing (line image, text) tuples:
cd Src/data_preperation
python 2_prepare_synth_images.py
- Create character lexicon for both synthetic and original data:
cd Src/data_preperation
python 3_create_class_dict.py
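The lexicon step maps every character that occurs in the transcriptions to an integer class id. The idea can be sketched as follows (pure Python; the function name and id scheme are illustrative, not the script's actual interface):

```python
def build_char_lexicon(transcriptions):
    """Map every character seen in the transcriptions to an integer
    class id; id 0 is reserved here for the CTC blank."""
    chars = sorted(set("".join(transcriptions)))
    return {ch: i + 1 for i, ch in enumerate(chars)}

lexicon = build_char_lexicon(["abc", "cad"])
print(lexicon)  # {'a': 1, 'b': 2, 'c': 3, 'd': 4}
```

Building the lexicon over both synthetic and original data ensures the network's output layer covers every character it can encounter at test time.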
Now that you have the dataset prepared and all the prerequisites installed, you can run CRNN training and testing. To do so, go to base_project_dir/Src.
- Transductive VAT
python train.py --do-test-vat True --vat-epsilon 0.5 --vat-xi 1e-6 --vat-sign True --vat-ratio 10. \
--output-dir '../Output/transductive_vat' --do-lr-step True
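VAT penalizes the KL divergence between the model's prediction on a test image and its prediction on a slightly perturbed copy; --vat-epsilon controls the perturbation radius. A toy sketch of that consistency loss on a linear softmax model (pure Python; it uses a random unit direction instead of the gradient-based power iteration of full VAT, and all names are illustrative):

```python
import math
import random

def softmax(z):
    m = max(z)
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]

def kl(p, q):
    # KL divergence between two discrete distributions.
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def vat_loss(weights, x, epsilon=0.5):
    """Consistency loss for a toy linear softmax model: perturb the
    input along a random unit direction at radius epsilon and measure
    how much the predicted distribution moves."""
    logits = lambda inp: [sum(w * v for w, v in zip(row, inp)) for row in weights]
    p_clean = softmax(logits(x))
    r = [random.gauss(0, 1) for _ in x]
    norm = math.sqrt(sum(v * v for v in r)) or 1.0
    r_unit = [v / norm for v in r]      # unit perturbation direction
    x_adv = [v + epsilon * d for v, d in zip(x, r_unit)]
    return kl(p_clean, softmax(logits(x_adv)))
```

Full VAT additionally searches for the *worst* direction via a power-iteration probe of size --vat-xi; minimizing this loss on unlabeled test lines is what makes the training transductive.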
- Transductive Adversarial Domain Adaptation
python train.py --do-ada-lr True --ada-after-rnn True --ada-before-rnn True --rnn-hidden-size 256 \
--do-ema True --do-beam-search True --ada-ratio 10. --output-dir '../Output/dann_cnn_rnn'
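DANN attaches a domain classifier to the CNN and/or RNN features (--ada-before-rnn / --ada-after-rnn) through a gradient reversal layer: identity on the forward pass, gradient negation on the backward pass, so the features are pushed to become indistinguishable between synthetic and real domains. Conceptually (a sketch, not the repository's implementation):

```python
def grl_forward(x):
    """Gradient reversal layer: plain identity in the forward pass."""
    return x

def grl_backward(grad, lam=1.0):
    """Multiply incoming gradients by -lambda in the backward pass, so
    the feature extractor learns to FOOL the domain classifier."""
    return [-lam * g for g in grad]

print(grl_forward([1.0, 2.0]))              # [1.0, 2.0]
print(grl_backward([0.5, -2.0], lam=10.0))  # [-5.0, 20.0]
```

In an autograd framework this is implemented as a custom op whose backward method negates and scales the gradient.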
- Network consensus self-supervision
python train_multinet.py --max-iter 60000 --do-test-ensemble True --test-ensemble-ratio 10. \
--test-ensemble-thresh 2. --output-dir '../Output/multinet_self_supervision'
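Network consensus self-supervision trains several networks and keeps a test transcription as a pseudo-label only when enough networks agree on it (cf. --test-ensemble-thresh above). A minimal sketch of the filtering step (pure Python; names are illustrative):

```python
from collections import Counter

def consensus_pseudo_labels(predictions, min_agree=2):
    """Keep a transcription as a pseudo-label only when at least
    min_agree networks produced the same string for that image."""
    kept = {}
    for image_id, texts in predictions.items():
        text, votes = Counter(texts).most_common(1)[0]
        if votes >= min_agree:
            kept[image_id] = text
    return kept

# Three networks' outputs per test line image.
preds = {"img1": ["ka", "ka", "kha"], "img2": ["ga", "nga", "ka"]}
print(consensus_pseudo_labels(preds))  # {'img1': 'ka'}
```

The retained pseudo-labels are then fed back as training targets for the ensemble members.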
To train CycleGAN on Tibetan data, do the following:
- Run the following from the console:
mkdir (base_dir)/CycleData
cd (base_dir)/Src/CycleGan
conda create -n cycle_tibetan python=3.5
source activate cycle_tibetan
pip install -r requirements.txt
- Split the synthetic train data into 90% train and 10% validation images and put them in (base_dir)/CycleData/trainA and (base_dir)/CycleData/valA, respectively.
- Split the original data into 90% train and 10% validation images and put them in (base_dir)/CycleData/trainB and (base_dir)/CycleData/valB, respectively.
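The 90/10 splits above can be scripted; a minimal sketch (Python standard library only; directory names follow the layout above, the helper itself is illustrative):

```python
import os
import random
import shutil

def split_train_val(src_dir, train_dir, val_dir, val_ratio=0.1, seed=0):
    """Shuffle the images in src_dir deterministically, then move
    ~val_ratio of them to val_dir and the rest to train_dir."""
    files = sorted(os.listdir(src_dir))
    random.Random(seed).shuffle(files)
    n_val = max(1, int(len(files) * val_ratio))
    os.makedirs(train_dir, exist_ok=True)
    os.makedirs(val_dir, exist_ok=True)
    for i, name in enumerate(files):
        dest = val_dir if i < n_val else train_dir
        shutil.move(os.path.join(src_dir, name), os.path.join(dest, name))

# Example: split_train_val("CycleData/all_synth", "CycleData/trainA", "CycleData/valA")
```

A fixed seed keeps the split reproducible across runs.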
- Train CycleGAN by running the following line from (base_dir)/Src/CycleGan:
python train.py --dataroot ../../CycleData --name TibetanCycle --model cycle_gan --no_dropout --resize_or_crop resize_and_crop --fineSize 64 --loadSize 98 --lambda_identity 2 --checkpoints_dir ../../CycleModel
- To run CycleGAN inference on synthetic data:
python test.py --dataroot ../../CycleData \
--name TibetanCycle --model cycle_gan --phase test --no_dropout \
--which_epoch 50000 --results_dir ../../CycleResults/Synth \
--checkpoints_dir ../../CycleModel \
--resize_or_crop only_resize \
--loadSize 64 --how_many 600000 --show_by B --only_one
Now you can use the CycleGAN-mapped synthetic data for training.
- To run CycleGAN inference on test data:
python test.py --dataroot ../../CycleData \
--name TibetanCycle --model cycle_gan --phase test --no_dropout \
--which_epoch 50000 --results_dir ../../CycleResults/Orig \
--checkpoints_dir ../../CycleModel \
--resize_or_crop only_resize \
--loadSize 64 --how_many 600000 --which_direction B2A --only_one
Now you can run CRNN inference on the test data after CycleGAN mapping. Note that the CycleGAN implementation in this directory is a copy of the pytorch CycleGAN repository with slight programmatic changes to allow for easier text inference.
If you would like to run tests on pretrained models, you can download the following models:
- CRNN model trained using transductive VAT (google drive link)
- CRNN model trained using adversarial domain adaptation (google drive link)

Please download the models to the directory (cur_project)/PreTrainedModels (you should first create the directory).
To run a model on test images after mapping by CycleGAN, you can download the transformed data from: google drive link
To run a test with a pretrained single-network model, simply run:
cd (base project dir)/Src
python test.py --snapshot (path to snapshot file) --data-path (path to dataset file) --base-data-dir (path to images dir)
To run a test with a pretrained multi-network model, run:
cd (base project dir)/Src
python test_multinet.py --snapshot (path to snapshot file) --data-path (path to dataset file) --base-data-dir (path to images dir)
This project is licensed under the MIT License - see the LICENSE.md file for details.
Please cite the following paper if you use this code or these models in your research:
"Transductive Learning for Reading Handwritten Tibetan Manuscripts"