OleehyO

🐏

LLM | MLSys

26 followers · 34 following

Stars

ocr

19 repositories

PaddlePaddle / PaddleOCR

Awesome multilingual OCR toolkits based on PaddlePaddle (practical ultra lightweight OCR system, support 80+ languages recognition, provide data annotation and synthesis tools, support training and…

Python 47,016 8,040 Updated Mar 6, 2025

tesseract-ocr / tesseract

Tesseract Open Source OCR Engine (main repository)

C++ 65,072 9,720 Updated Feb 12, 2025

lukas-blecher / LaTeX-OCR

pix2tex: Using a ViT to convert images of equations into LaTeX code.

Python 13,672 1,090 Updated Jan 18, 2025

facebookresearch / nougat

Implementation of Nougat Neural Optical Understanding for Academic Documents

Python 9,300 601 Updated Feb 21, 2025

sparkfish / augraphy

Augmentation pipeline for rendering synthetic paper printing, faxing, scanning and copy machine processes

Python 390 48 Updated Feb 15, 2025

zacharywhitley / awesome-ocr

930 113 Updated Sep 14, 2024

Belval / TextRecognitionDataGenerator

A synthetic data generator for text recognition

Python 3,415 998 Updated Jul 18, 2024

clovaai / synthtiger

Official Implementation of SynthTIGER (Synthetic Text Image Generator), ICDAR 2021

Python 509 103 Updated Jun 14, 2024

sjvasquez / handwriting-synthesis

Handwriting Synthesis with RNNs ✏️

Python 4,449 614 Updated Jan 11, 2024

WenmuZhou / OCR_DataSet

Collects and organizes OCR-related datasets and unifies their annotation format for use in experiments

Python 897 194 Updated Nov 28, 2023

doc-analysis / DocBank

DocBank: A Benchmark Dataset for Document Layout Analysis

Python 598 72 Updated Aug 12, 2024

microsoft / ArxivFormula

This repo is used to release the ArxivFormula dataset.

Python 24 2 Updated Nov 12, 2024

opendatalab / UniMERNet

UniMERNet: A Universal Network for Real-World Mathematical Expression Recognition

Python 276 27 Updated Dec 26, 2024

opendatalab / PDF-Extract-Kit

A Comprehensive Toolkit for High-Quality PDF Content Extraction

Python 6,939 469 Updated Jan 3, 2025

ZZZHANG-jx / Recommendations-Document-Image-Processing

This repository contains a paper collection of the methods for document image processing, including appearance enhancement, deshadowing, dewarping, deblurring, binarization and so on.

218 12 Updated Feb 12, 2025

opendatalab / MinerU

A high-quality tool for converting PDF to Markdown and JSON. A one-stop, open-source, high-quality data extraction tool that converts PDF to Markdown and JSON formats.

Python 27,416 2,112 Updated Mar 4, 2025

Ucas-HaoranWei / GOT-OCR2.0

Official code implementation of General OCR Theory: Towards OCR-2.0 via a Unified End-to-end Model

Python 7,077 622 Updated Feb 10, 2025

VikParuchuri / marker

Convert PDF to markdown + JSON quickly with high accuracy

Python 21,769 1,341 Updated Mar 4, 2025

WGUNDERWOOD / tex-fmt

An extremely fast LaTeX formatter written in Rust

Rust 424 26 Updated Feb 10, 2025
