Stars
Synthetic Document Generator for document cleanup and annotation-free layout analysis
A real-time CPU VLM at 500M. Surpasses Moondream2 and SmolVLM. Train from scratch with ease.
Qwen2.5-VL is the multimodal large language model series developed by the Qwen team, Alibaba Cloud.
PaddlePaddle Developer Community
UniToken is an auto-regressive generation model that combines discrete and continuous representations to process visual inputs, making it easy to integrate both visual understanding and image gener…
Witness the aha moment of VLM with less than $3.
I trained detection and recognition models using MMOCR, then integrated them with an SER model trained using HuggingFace
pix2tex: Using a ViT to convert images of equations into LaTeX code.
Python tool for converting files and office documents to Markdown.
Official PyTorch Implementation of "Scalable Diffusion Models with Transformers"
All-in-One Development Tool based on PaddlePaddle (PaddlePaddle low-code development tool)
A Comprehensive Benchmark for Document Parsing and Evaluation
Ascend PyTorch adapter (torch_npu). Mirror of https://gitee.com/ascend/pytorch
mPLUG-DocOwl: Modularized Multimodal Large Language Model for Document Understanding
🎨 Math Formula OCR Pro: supports handwritten and printed formulas in Chinese and English, plus simple symbolic reasoning (data structure based on the LaTeX abstract syntax tree).
Demonstrates how to use a vision encoder-decoder model
The official code for the CVPR 2024 paper: Multi-modal In-Context Learning Makes an Ego-evolving Scene Text Recognizer
📄 Awesome multi-language OCR toolkits based on ONNXRuntime, OpenVINO, PaddlePaddle and PyTorch.
A high-throughput and memory-efficient inference and serving engine for LLMs
DocGenome: An Open Large-scale Scientific Document Benchmark for Training and Testing Multi-modal Large Models
Implementation of Nougat Neural Optical Understanding for Academic Documents
End-to-End Object Detection with Transformers
A Comprehensive Toolkit for High-Quality PDF Content Extraction
A high-quality tool for converting PDF to Markdown and JSON. A one-stop, open-source, high-quality data extraction tool.