[MM'2024] PEneo, an effective algorithm for key-value pair extraction from form-like documents, designed for real-world applications.
Hierarchical Transformers for Knowledge Graph Embeddings (EMNLP 2021)
[Neurocomputing 2023] Relational Graph Transformer for Knowledge Graph Representation
Official Implementation of OCR-free Document Understanding Transformer (Donut) and Synthetic Document Generator (SynthDoG), ECCV 2022
Build and publish crates with pyo3, cffi and uniffi bindings as well as Rust binaries as Python packages
Visual Object Tagging Tool: An Electron app for building end-to-end object detection models from images and videos.
Framework-agnostic sliced/tiled inference + interactive UI + error analysis plots
This repository contains the code and implementation details of the CascadeTabNet paper "CascadeTabNet: An approach for end to end table detection and structure recognition from image-based documents"
Fullstack app framework for web, desktop, mobile, and more.
SigNoz is an open-source observability platform native to OpenTelemetry with logs, traces and metrics in a single application. An open-source alternative to DataDog, NewRelic, etc. 🔥 🖥. 👉 Open sour…
[ACL-IJCNLP 2021] Automated Concatenation of Embeddings for Structured Prediction
CloudNativePG is a comprehensive platform designed to seamlessly manage PostgreSQL databases within Kubernetes environments, covering the entire operational lifecycle from initial deployment to ong…
A robust message queue system for Rust applications, designed as a Rust alternative to Celery.
This is an unofficial implementation of the EMNLP 2023 paper: Reading Order Matters: Information Extraction from Visually-rich Documents by Token Path Prediction
This is the official repository of the EMNLP 2023 paper Reading Order Matters: Information Extraction from Visually-rich Documents by Token Path Prediction.
This is the official implementation of the EMNLP 2024 paper: Modeling Layout Reading Order as Ordering Relations for Visually-rich Document Understanding.
A high-performance, zero-overhead, extensible Python compiler with built-in NumPy support
Trained Detectron2 object detection models for document layout analysis based on the PubLayNet dataset
A token-based rate limiter based on the leaky bucket algorithm.
Build smaller, faster, and more secure desktop and mobile applications with a web frontend.
A Pyright fork with various type checking improvements, improved VS Code support and Pylance features built into the language server