Stars
OpenOCR: A general OCR system built for accuracy and efficiency. It supports 24 scene text recognition methods trained from scratch on large-scale real datasets, and the latest methods will continue to be added.
A comprehensive collection of IQA papers
ACM Multimedia 2023: DocDiff: Document Enhancement via Residual Diffusion Models. Also contains 1597 red seals in Chinese scenes, along with their corresponding binary masks.
A comprehensive list of awesome document image rectification papers.
Efficient, scalable and enterprise-grade CPU/GPU inference server for 🤗 Hugging Face transformer models 🚀
Implementation of handwriting generation using recurrent neural networks in TensorFlow. Based on Alex Graves's paper (https://arxiv.org/abs/1308.0850).
A synthetic data generator for text recognition
This repository contains a collection of papers on document image processing methods, including appearance enhancement, shadow removal, dewarping, deblurring, and binarization.
This repository contains demos I made with the Transformers library by HuggingFace.
Augmentation pipeline for rendering synthetic paper printing, faxing, scanning and copy machine processes
Generates CGI sample receipts for use in receipt scanning CV automated tests
Official Implementation of OCR-free Document Understanding Transformer (Donut) and Synthetic Document Generator (SynthDoG), ECCV 2022
Code for the AAAI 2023 paper “Alignment-Enriched Tuning for Patch-Level Pre-trained Document Image Models”
Pre-Training with Whole Word Masking for Chinese BERT (Chinese BERT-wwm series models)
The official code for “Deep Unrestricted Document Image Rectification”, TMM, 2023.
The official code for “DocTr: Document Image Transformer for Geometric Unwarping and Illumination Correction”, ACM MM, Oral Paper, 2021.
Synthesizes distorted document images and control points.
This is a PyTorch implementation of DocUNet: Document Image Unwarping via A Stacked U-Net
Code for the paper "DewarpNet: Single-Image Document Unwarping With Stacked 3D and 2D Regression Networks" (ICCV '19)
A selectional auto-encoder approach for document image binarization
Unofficial implementation of DocMAE (WIP): Document Image Rectification via Self-supervised Representation Learning
Table Transformer (TATR) is a deep learning model for extracting tables from unstructured documents (PDFs and images). This is also the official repository for the PubTables-1M dataset and GriTS ev…
DocEnTr: An end-to-end document image enhancement transformer - ICPR 2022
Implementation of DALL-E 2, OpenAI's updated text-to-image synthesis neural network, in PyTorch