
Transductive Learning for Reading Handwritten Tibetan Manuscripts

This software implements transductive learning for unsupervised handwritten character recognition. It includes:

  • A projection-based unsupervised line segmentation algorithm
  • Synthetic text generation and data augmentation for HCR training
  • CRNN implementation for HCR
  • Implementations of three transductive learning methods for HCR: CycleGAN, DANN and VAT

The repository also includes a new test set of 167 transcribed images from the bKa’ gdams gsung ’bum collection.
The software was evaluated on this collection and shows promising results.
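The projection-based line segmentation works, roughly, by summing ink along each pixel row of a page image and cutting the page at the low-ink valleys between text rows. Below is a minimal sketch of that idea; it is not the repository's implementation, and the function and parameter names are illustrative only.

import numpy as np

def segment_lines(page, ink_thresh=0.5, min_height=5):
    """Split a grayscale page (values in [0, 1], 1 = ink) into line images
    using a horizontal projection profile. Illustrative sketch only."""
    ink = (page > ink_thresh).astype(np.uint8)
    profile = ink.sum(axis=1)                      # ink pixels per image row
    has_text = profile > 0.05 * profile.max()      # rows considered part of a text line
    lines, start = [], None
    for y, flag in enumerate(has_text):
        if flag and start is None:
            start = y                              # a text line starts here
        elif not flag and start is not None:
            if y - start >= min_height:
                lines.append(page[start:y])        # crop one line image
            start = None
    if start is not None:
        lines.append(page[start:])
    return lines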

Prerequisites

The software has only been tested on Ubuntu 16.04 (x64). A CUDA-enabled GPU is required. Tested with CUDA 8.0 and cuDNN 7.0.5.

Installation

Install text rendering software

  1. Install Tibetan fonts:
mkdir ~/.fonts
cp extra/Fonts/* ~/.fonts
sudo fc-cache -fv
  2. Change language settings to allow ASCII text reading:
sudo update-locale LANG=en_US.UTF-8
  3. Install prerequisites:
sudo apt-get install cairo-dock
sudo apt-get install pango1.0-tests
sudo apt-get install gtk2.0
sudo add-apt-repository ppa:glasen/freetype2
sudo apt update && sudo apt install freetype2-demos
  4. Compile the C++ text rendering program:
cd extra/TextRender/c_code
make
cd ../bin
chmod u+x main
  5. Check that both the installation and the compilation worked correctly:
  • to test the font installation run:
bin/main_show_fonts | grep Shangshung
bin/main_show_fonts | grep Qomolangma

You should see these four font families:

  • Shangshung Sgoba-KhraChung
  • Shangshung Sgoba-KhraChen
  • Qomolangma-Betsu
  • Qomolangma-Betsu
  • to test compilation run:
bash test.sh

Create Python environment and install dependencies

cd <base project dir>
conda create -n tibetan_hcr python=3.6
source activate tibetan_hcr
pip install -r requirements.txt
cd Src/utils/warp-ctc
mkdir build; cd build
cmake ..
make
cd ../pytorch_binding
python setup.py install
cd ../..
export CFLAGS='-Wall -Wextra -std=c99 -I/usr/local/cuda-8.0/include'
git clone --recursive https://github.com/parlance/ctcdecode.git
cd ctcdecode
pip install .

A rendered test image should be created in Src/data/extra/TextRender/test.
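The warp-ctc and ctcdecode packages installed above provide the CTC loss and beam-search decoding used with the CRNN. As a rough, self-contained illustration of how a CTC loss is applied to per-timestep CRNN outputs (this sketch uses PyTorch's built-in nn.CTCLoss rather than the repository's warp-ctc binding, and all shapes are made up):

import torch
import torch.nn as nn

# Toy shapes: T time steps from the CRNN, N batch size, C character classes (class 0 = CTC blank).
T, N, C = 48, 4, 80
log_probs = torch.randn(T, N, C, requires_grad=True).log_softmax(2)   # stand-in for CRNN outputs
targets = torch.randint(1, C, (N, 20), dtype=torch.long)              # padded label sequences
input_lengths = torch.full((N,), T, dtype=torch.long)
target_lengths = torch.randint(10, 21, (N,), dtype=torch.long)

ctc = nn.CTCLoss(blank=0, zero_infinity=True)
loss = ctc(log_probs, targets, input_lengths, target_lengths)
loss.backward()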

Data Preparation

There are three parts to data preparation:

  1. Using unsupervised line segmentation to separate the test images into lines
  2. Rendering synthetic multi-line images and separating them into lines using line segmentation
  3. Creating a character lexicon from both the training and testing datasets

We provide two ways to get the training and testing data:

  • Downloading prepared data
  • Instructions and code to prepare the data

Downloading Prepared Validation and Synthesized Train Data

  1. Get the prepared synthetic data:
    tar -xzvf synth_prepared.tar.gz
  2. Get the prepared test dataset:
    tar -xzvf test_prepared.tar.gz
  3. Get the prepared character lexicon:

Preparing Test and Train Data

  1. Segment test images into lines & create a dataset file:
    tar -xzvf test_original.tar.gz
    • Segment the images into lines and create a dataset file containing line image to text tuples:
    cd Src/data_preperation
    python 1_prepare_orig_images.py
  2. Create synthetic images and dataset file:
    • Download texts to synthesize train images (google drive link) to Data/Synthetic
    • Create synthetic images and a dataset file containing line image to text tuples:
    cd Src/data_preperation
    python 2_prepare_synth_images.py
  3. Create character lexicon for both synthetic and original data:
cd Src/data_preperation
python 3_create_class_dict.py
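Conceptually, the lexicon step collects every character that appears in the synthetic and original transcriptions and maps each to an integer class id (with one id reserved for the CTC blank). A minimal sketch is below; the dataset file format and names here are assumptions, not necessarily what 3_create_class_dict.py produces.

import json

def build_class_dict(dataset_files, out_path="class_dict.json"):
    """Collect all characters from tab-separated (image path, transcription) dataset
    files and assign each a class id. Sketch only; the real script may differ."""
    chars = set()
    for path in dataset_files:
        with open(path, encoding="utf-8") as f:
            for line in f:
                _, text = line.rstrip("\n").split("\t", 1)   # image path, transcription
                chars.update(text)
    class_dict = {c: i + 1 for i, c in enumerate(sorted(chars))}  # id 0 reserved for the CTC blank
    with open(out_path, "w", encoding="utf-8") as f:
        json.dump(class_dict, f, ensure_ascii=False, indent=2)
    return class_dict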

Training

Now that you have the dataset prepared and all the prerequisites installed, you can run CRNN training and testing. To do so, go to base_project_dir/Src.

Training Options

  • Transductive VAT
python train.py --do-test-vat True --vat-epsilon 0.5 --vat-xi 1e-6 --vat-sign True --vat-ratio 10. \
--output-dir '../Output/transductive_vat' --do-lr-step True
  • Transductive Adversarial Domain Adaptation
python train.py --ada-after-rnn True --ada-before-rnn True --rnn-hidden-size 256 --do-ada-lr True \
--do-ema True --do-beam-search True --ada-ratio 10. --output-dir '../Output/dann_cnn_rnn' 
  • Network consensus self-supervision
python train_multinet.py --max-iter 60000 --do-test-ensemble True --test-ensemble-ratio 10. \
--test-ensemble-thresh 2. --output-dir '../Output/multinet_self_supervision' 
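For orientation, the transductive VAT option above adds a consistency loss on the unlabeled test images: the network's prediction on an image should not change when the image is perturbed in the direction that changes the prediction most. A generic sketch of such a loss follows (it implements the standard VAT formulation, not necessarily the repository's exact code; the xi and epsilon names mirror the --vat-xi and --vat-epsilon flags).

import torch
import torch.nn.functional as F

def vat_loss(model, x, xi=1e-6, epsilon=0.5, n_power=1):
    """KL divergence between predictions on x and on x plus a virtual adversarial
    perturbation. Generic VAT sketch, not the repository's exact implementation."""
    with torch.no_grad():
        pred = F.softmax(model(x), dim=-1)                 # reference prediction, held fixed
    d = torch.randn_like(x)                                # random initial direction
    for _ in range(n_power):                               # power iteration for the adversarial direction
        d = xi * F.normalize(d.flatten(1), dim=1).view_as(x)
        d.requires_grad_(True)
        pred_hat = F.log_softmax(model(x + d), dim=-1)
        adv_dist = F.kl_div(pred_hat, pred, reduction="batchmean")
        d = torch.autograd.grad(adv_dist, d)[0]
    r_adv = epsilon * F.normalize(d.flatten(1), dim=1).view_as(x)
    pred_hat = F.log_softmax(model(x + r_adv), dim=-1)
    return F.kl_div(pred_hat, pred, reduction="batchmean")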

Training CycleGAN on Tibetan Data

To train CycleGAN on Tibetan data, do the following:

  • run the following from console
mkdir (base_dir)/CycleData
cd (base_dir)/Src/CycleGan
conda create -n cycle_tibetan python=3.5
source activate cycle_tibetan
pip install -r requirements.txt
python train.py --dataroot /media/data2/sivankeret/cycle_tibetan --name tibetan_cycle_identity_2 --model cycle_gan --no_dropout --resize_or_crop resize_and_crop --fineSize 64 --loadSize 98 --lambda_identity 2 --checkpoints_dir /media/data2/sivankeret/Runs/cycle_gan
  • Split the synthetic train data into 90% train and 10% validation images and put them in the directories (base_dir)/CycleData/trainA and (base_dir)/CycleData/valA accordingly (see the helper sketch at the end of this section).
  • Split the original data into 90% train and 10% validation images and put them in the directories (base_dir)/CycleData/trainB and (base_dir)/CycleData/valB accordingly.
  • Train CycleGan by running the following line from (base_dir)/Src/CycleGan:
python train.py --dataroot ../../CycleData --name TibetanCycle --model cycle_gan --no_dropout --resize_or_crop resize_and_crop --fineSize 64 --loadSize 98 --lambda_identity 2 --checkpoints_dir ../../CycleModel
  • To run CycleGAN inference on the synthetic data, do:
python test.py --dataroot ../../CycleData \
--name TibetanCycle --model cycle_gan --phase test --no_dropout \
--which_epoch 50000 --results_dir ../../CycleResults/Synth \
--checkpoints_dir ../../CycleModel \
--resize_or_crop only_resize \
--loadSize 64 --how_many 600000 --show_by B --only_one

Now you can use synthetic data mapped by CycleGan for training.

  • To run CycleGAN inference on the test data, do:
python test.py --dataroot ../../CycleData \
--name TibetanCycle --model cycle_gan --phase test --no_dropout \
--which_epoch 50000 --results_dir ../../CycleResults/Orig \
--checkpoints_dir ../../CycleModel \
--resize_or_crop only_resize \
--loadSize 64 --how_many 600000 --which_direction B2A --only_one

Now you can run the CRNN on the test data after CycleGAN mapping. Note that the CycleGAN implementation in this directory is a copy of pytorch cycle gan with slight changes; the changes are programmatic and allow for easier inference on text data.
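For the 90%/10% splits described above, a small helper like the following could be used (the flat directory of .png line images and the example source paths are assumptions; CycleGAN only needs the images copied into trainA/valA and trainB/valB):

import random
import shutil
from pathlib import Path

def split_images(src_dir, train_dir, val_dir, val_ratio=0.1, seed=0):
    """Copy images from src_dir into train/val directories with a 90%/10% split."""
    images = sorted(Path(src_dir).glob("*.png"))       # assumes .png line images
    random.Random(seed).shuffle(images)
    n_val = int(len(images) * val_ratio)
    for d in (train_dir, val_dir):
        Path(d).mkdir(parents=True, exist_ok=True)
    for img in images[:n_val]:
        shutil.copy(img, val_dir)                      # 10% -> validation dir
    for img in images[n_val:]:
        shutil.copy(img, train_dir)                    # 90% -> train dir

# Example (paths are illustrative):
# split_images("../../Data/Synthetic/lines", "../../CycleData/trainA", "../../CycleData/valA")
# split_images("../../Data/Original/lines", "../../CycleData/trainB", "../../CycleData/valB")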

Testing

Downloading Pretrained Models

If you would like to run tests on pretrained models, you can download the following models:

  • CRNN model trained using transductive VAT: google drive link
  • CRNN model trained using adversarial domain adaptation: google drive link

Please download the models to the directory (cur_project)/PreTrainedModels (you should first create the directory).

Downloading Test Dataset Images After CycleGAN Mapping

To run the model on test images after CycleGAN mapping, you can download the transformed data from: google drive link

Running Test

To run the test on a pretrained single-network model, simply run:

cd (base project dir)/Src
python test.py --snapshot (path to snapshot file) --data-path (path to dataset file) --base-data-dir (path to images dir)

To run the test on a pretrained multi-network model, run:

cd (base project dir)/Src
python test_multinet.py --snapshot (path to snapshot file) --data-path (path to dataset file) --base-data-dir (path to images dir)

License

This project is licensed under the MIT License - see the LICENSE.md file for details.
Please cite the following paper if you use the code/model in your research: "Transductive Learning for Reading Handwritten Tibetan Manuscripts".
