Python 9 Updated Jan 13, 2025

Synthetic Document Generator for document cleanup and annotation-free layout analysis

Python 2 2 Updated Nov 21, 2023

A CPU Realtime VLM in 500M. Surpassed Moondream2 and SmolVLM. Training from scratch with ease.

Python 132 14 Updated Mar 2, 2025

Qwen2.5-VL is the multimodal large language model series developed by Qwen team, Alibaba Cloud.

Jupyter Notebook 8,317 583 Updated Mar 4, 2025

PaddlePaddle Developer Community

Jupyter Notebook 98 280 Updated Mar 3, 2025

UniToken is an auto-regressive generation model that combines discrete and continuous representations to process visual inputs, making it easy to integrate both visual understanding and image gener…

Python 30 2 Updated Feb 21, 2025

Witness the aha moment of VLM with less than $3.

Python 3,003 240 Updated Mar 1, 2025
Python 3 1 Updated Jan 24, 2025

I trained detection and recognition models using MMOCR, then integrated them with an SER model trained using HuggingFace

2 Updated Jan 28, 2024

pix2tex: Using a ViT to convert images of equations into LaTeX code.

Python 13,645 1,089 Updated Jan 18, 2025

Get your documents ready for gen AI

Python 23,233 1,343 Updated Mar 3, 2025

Python tool for converting files and office documents to Markdown.

HTML 39,427 1,824 Updated Mar 3, 2025

Official PyTorch Implementation of "Scalable Diffusion Models with Transformers"

Python 6,896 617 Updated May 31, 2024

All-in-One Development Tool based on PaddlePaddle (PaddlePaddle low-code development tool)

Python 5,171 992 Updated Mar 4, 2025

A Comprehensive Benchmark for Document Parsing and Evaluation

Python 266 23 Updated Feb 25, 2025

OCR toolbox from Davar-Lab

Python 745 156 Updated Nov 16, 2023

Ascend PyTorch adapter (torch_npu). Mirror of https://gitee.com/ascend/pytorch

Python 305 18 Updated Mar 4, 2025

mPLUG-DocOwl: Modularized Multimodal Large Language Model for Document Understanding

Python 2,126 127 Updated Dec 24, 2024

🎨 Math Formula OCR Pro: an enhanced math formula recognizer that supports handwritten and printed formulas in Chinese and English, Chinese-mixed formulas, and simple symbolic derivation (data structure based on the LaTeX abstract syntax tree).

Jupyter Notebook 1,186 238 Updated Jun 11, 2024

Demonstrates how to use a vision encoder-decoder model

Python 2 Updated Dec 6, 2024

The official code for the CVPR 2024 paper: Multi-modal In-Context Learning Makes an Ego-evolving Scene Text Recognizer

Python 49 4 Updated Jun 14, 2024

📄 Awesome OCR toolkits for multiple programming languages, based on ONNXRuntime, OpenVINO, PaddlePaddle and PyTorch.

Python 3,619 406 Updated Mar 4, 2025

A high-throughput and memory-efficient inference and serving engine for LLMs

Python 40,143 6,012 Updated Mar 4, 2025

DocGenome: An Open Large-scale Scientific Document Benchmark for Training and Testing Multi-modal Large Models

Jupyter Notebook 128 5 Updated Jan 13, 2025

Implementation of Nougat Neural Optical Understanding for Academic Documents

Python 9,295 601 Updated Feb 21, 2025

End-to-End Object Detection with Transformers

Python 14,046 2,525 Updated Mar 12, 2024

A Comprehensive Toolkit for High-Quality PDF Content Extraction

Python 6,910 468 Updated Jan 3, 2025

A high-quality tool for converting PDF to Markdown and JSON. A one-stop, open-source, high-quality data extraction tool.

Python 27,213 2,099 Updated Mar 3, 2025