jinho-choi123/Handwriting2LaTeX

LaTeX Generator using the PaLI model

Please read this paper to fully understand this project. Key Paper

Description

A tool that converts handwriting (InkML format) into LaTeX.

Data preparation

MathWriting
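As a rough illustration of what the InkML input looks like, the sketch below reads traces and annotations with the Python standard library. It assumes each trace is a comma-separated list of "x y t" points and that the ground truth is stored in an annotation of type "label"; the function name `parse_inkml` and the inline sample are hypothetical and not taken from this repository.

```python
import xml.etree.ElementTree as ET

# InkML elements live in the W3C InkML namespace.
INKML_NS = {"ink": "http://www.w3.org/2003/InkML"}


def parse_inkml(text):
    """Return (strokes, annotations) parsed from an InkML document string."""
    root = ET.fromstring(text)
    # Map annotation type -> text, e.g. {"label": "x^2"}.
    annotations = {
        a.get("type"): (a.text or "").strip()
        for a in root.findall("ink:annotation", INKML_NS)
    }
    strokes = []
    for trace in root.findall("ink:trace", INKML_NS):
        points = []
        # Assumed layout: points separated by commas, fields by whitespace.
        for chunk in (trace.text or "").split(","):
            values = chunk.split()
            if values:
                points.append(tuple(float(v) for v in values))
        strokes.append(points)
    return strokes, annotations


# Hypothetical sample, for illustration only.
SAMPLE = """<ink xmlns="http://www.w3.org/2003/InkML">
  <annotation type="label">x^2</annotation>
  <trace>0 0 0, 1 1 10, 2 2 20</trace>
  <trace>2 0 30, 0 2 40</trace>
</ink>"""

strokes, annotations = parse_inkml(SAMPLE)
print(len(strokes), annotations["label"])
```

To read a downloaded sample, pass the file contents instead of `SAMPLE`, for example `parse_inkml(open(path, encoding="utf-8").read())`.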

Model Architecture

PaLI Model

The PaLI model takes two inputs: an image and a token sequence.

For the image, we render the InkML strokes into a tensor of shape (3, IMG_SIZE, IMG_SIZE). The color represents the writing speed. For more information, please read the paper.
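The rendering step could be sketched as follows. This is a minimal NumPy illustration, not the project's renderer: it assumes strokes are lists of (x, y, t) points, and the channel layout (ink mask, normalized speed, normalized time) is an assumption made here for illustration; the paper defines the actual encoding. The names `render_strokes` and `samples_per_segment` and the value of `IMG_SIZE` are hypothetical.

```python
import numpy as np

IMG_SIZE = 64  # hypothetical value; the README does not state the real one


def render_strokes(strokes, img_size=IMG_SIZE, samples_per_segment=16):
    """Rasterize strokes of (x, y, t) points into a (3, img_size, img_size) array.

    Assumed channels: 0 = ink mask, 1 = normalized speed, 2 = normalized time.
    """
    image = np.zeros((3, img_size, img_size), dtype=np.float32)
    pts = np.concatenate([np.asarray(s, dtype=np.float64) for s in strokes])
    lo = pts[:, :2].min(axis=0)
    # One scale for both axes keeps the aspect ratio of the ink.
    span = max((pts[:, :2].max(axis=0) - lo).max(), 1e-9)
    t0 = pts[:, 2].min()
    t_span = max(pts[:, 2].max() - t0, 1e-9)

    # Collect line segments with their pen speed (distance / elapsed time).
    segments = []
    for s in strokes:
        s = np.asarray(s, dtype=np.float64)
        for a, b in zip(s[:-1], s[1:]):
            dt = max(b[2] - a[2], 1e-9)
            speed = np.hypot(*(b[:2] - a[:2])) / dt
            segments.append((a, b, speed))
    max_speed = max((seg[2] for seg in segments), default=0.0) or 1.0

    # Draw each segment by sampling points along it.
    for a, b, speed in segments:
        for w in np.linspace(0.0, 1.0, samples_per_segment):
            p = a + w * (b - a)
            col, row = ((p[:2] - lo) / span * (img_size - 1)).round().astype(int)
            image[0, row, col] = 1.0
            image[1, row, col] = speed / max_speed
            image[2, row, col] = (p[2] - t0) / t_span
    return image


# Two horizontal strokes; the second is drawn ten times faster than the first.
strokes = [[(0, 0, 0), (10, 0, 100)], [(0, 10, 200), (10, 10, 210)]]
image = render_strokes(strokes)
print(image.shape)
```

In this example the slower top stroke gets a speed value of 0.1 and the faster bottom stroke a value of 1.0 in channel 1.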

For the token sequence, we preprocess the ink points to reduce the sequence length and normalize the data.
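One plausible form of this preprocessing is time-based resampling followed by coordinate normalization, sketched below with NumPy. The resampling interval, the choice to scale coordinates into the unit square, and the function names `resample_stroke` and `normalize_strokes` are assumptions for illustration; the repository's actual preprocessing may differ.

```python
import numpy as np


def resample_stroke(stroke, time_step=20.0):
    """Resample one stroke of (x, y, t) points at a fixed time interval.

    Assumes timestamps increase along the stroke. Fewer points per stroke
    means a shorter token sequence.
    """
    s = np.asarray(stroke, dtype=np.float64)
    t = s[:, 2]
    if len(s) < 2 or t[-1] <= t[0]:
        return s[:1].copy()
    new_t = np.arange(t[0], t[-1] + 1e-9, time_step)
    x = np.interp(new_t, t, s[:, 0])
    y = np.interp(new_t, t, s[:, 1])
    return np.stack([x, y, new_t], axis=1)


def normalize_strokes(strokes):
    """Shift and scale coordinates into [0, 1] and start time at zero."""
    pts = np.concatenate(strokes)
    lo = pts[:, :2].min(axis=0)
    # A single scale factor preserves the aspect ratio.
    span = max((pts[:, :2].max(axis=0) - lo).max(), 1e-9)
    t0 = pts[:, 2].min()
    out = []
    for s in strokes:
        s = s.copy()
        s[:, :2] = (s[:, :2] - lo) / span
        s[:, 2] -= t0
        out.append(s)
    return out


# One stroke sampled every millisecond for 100 ms (101 points).
raw = [[(float(i), 2.0 * i, float(i)) for i in range(101)]]
result = normalize_strokes([resample_stroke(s, time_step=20.0) for s in raw])
print(len(raw[0]), "->", len(result[0]))
```

Note that the last raw point is dropped whenever the stroke duration is not a multiple of `time_step`.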

For the visual component, we use the ViT implementation from Hugging Face.

For the language component, we use the mT5 implementation from Hugging Face.
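The way the two components could fit together is sketched below with the `transformers` library. This is an assumption-heavy illustration of a PaLI-style wiring, not this repository's model code: it uses tiny randomly initialized configs so that nothing is downloaded, and the linear projection, the `forward` helper, and all sizes are hypothetical. A real setup would presumably load pretrained weights with `from_pretrained`.

```python
import torch
from transformers import MT5Config, MT5ForConditionalGeneration, ViTConfig, ViTModel

IMG_SIZE = 32  # hypothetical value, kept small for the sketch

# Tiny randomly initialized models (hypothetical sizes, no pretrained weights).
vit = ViTModel(ViTConfig(
    image_size=IMG_SIZE, patch_size=8, num_channels=3, hidden_size=32,
    num_hidden_layers=1, num_attention_heads=2, intermediate_size=64,
))
mt5 = MT5ForConditionalGeneration(MT5Config(
    vocab_size=128, d_model=48, d_kv=8, d_ff=64,
    num_layers=1, num_decoder_layers=1, num_heads=2,
))
# Assumed glue layer: map ViT features to the mT5 embedding width.
project = torch.nn.Linear(32, 48)


def forward(pixel_values, input_ids, labels):
    """Concatenate visual tokens with text token embeddings, then run mT5."""
    visual = project(vit(pixel_values=pixel_values).last_hidden_state)
    text = mt5.get_input_embeddings()(input_ids)
    fused = torch.cat([visual, text], dim=1)  # (batch, visual + text, d_model)
    return mt5(inputs_embeds=fused, labels=labels)


pixel_values = torch.rand(2, 3, IMG_SIZE, IMG_SIZE)  # rendered ink images
input_ids = torch.randint(1, 128, (2, 7))            # ink token sequence
labels = torch.randint(1, 128, (2, 5))               # LaTeX target tokens
out = forward(pixel_values, input_ids, labels)
print(out.logits.shape)
```

The encoder sees the image patches and the ink tokens as one sequence, and the decoder is trained to emit the LaTeX tokens given in `labels`.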

Getting Started - Installing dependencies

Please use uv to manage the Python environment.

# check that your Python version is 3.11.10
$ python --version
Python 3.11.10

# generate venv and activate
$ uv venv
$ source .venv/bin/activate

# install requirements
$ uv pip install -r requirements.txt

Getting Started - Dataset installation

Use the data-install.sh script. Downloading and decompressing the dataset takes about 5 to 10 minutes.

$ ./data-install.sh

# check the installed data in data/mathwriting-2024 directory
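A quick way to check the installed data is to count the InkML files per top-level subdirectory. The sketch below assumes the samples use the `.inkml` extension and sit in subdirectories of data/mathwriting-2024; the helper name `count_inkml_files` is hypothetical.

```python
from pathlib import Path


def count_inkml_files(root):
    """Count *.inkml files under root, grouped by top-level subdirectory."""
    root = Path(root)
    if not root.is_dir():
        return {}
    counts = {}
    for path in root.rglob("*.inkml"):
        group = path.relative_to(root).parts[0]
        counts[group] = counts.get(group, 0) + 1
    return counts


# An empty result means the dataset directory is missing or has no samples.
print(count_inkml_files("data/mathwriting-2024"))
```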
