Stars
PyMuPDF is a high-performance Python library for data extraction, analysis, conversion & manipulation of PDF (and other) documents.
Janus-Series: Unified Multimodal Understanding and Generation Models
PDF scientific paper translation with preserved formats: AI-based full-text bilingual translation of PDF documents that fully preserves the layout, supporting services such as Google/DeepL/Ollama/OpenAI, available via CLI/GUI/Docker/Zotero
Olive: Simplify ML Model Finetuning, Conversion, Quantization, and Optimization for CPUs, GPUs and NPUs.
A collection of LLM papers, blogs, and projects, with a focus on OpenAI o1 🍓 and reasoning techniques.
SANA: Efficient High-Resolution Image Synthesis with Linear Diffusion Transformer
🚀🚀 "Large model": train a small 26M-parameter GPT entirely from scratch in just 2 hours!
🤗 smolagents: a barebones library for agents. Agents write python code to call tools and orchestrate other agents.
🚀 Accelerate inference and training of 🤗 Transformers, Diffusers, TIMM and Sentence Transformers with easy-to-use hardware optimization tools
Conversion between Traditional and Simplified Chinese
DocLayout-YOLO: Enhancing Document Layout Analysis through Diverse Synthetic Data and Global-to-Local Adaptive Perception
Agno is a lightweight framework for building multi-modal Agents
Python tool for converting files and office documents to Markdown.
TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficie…
Official implementation of Character Region Awareness for Text Detection (CRAFT)
[CVPR 2022] Aesthetic Text Logo Synthesis via Content-aware Layout Inferring
Official code implementation of General OCR Theory: Towards OCR-2.0 via a Unified End-to-end Model
Utilities intended for use with Llama models.
Detect and extract tables to markdown and csv
A faster LayoutReader model based on LayoutLMv3 that sorts OCR bboxes into reading order.
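😘 Makes you "love" GitHub: fixes broken images and slow loading when accessing it. (No installation required)
Python web-scraping tutorial series: learn Python scraping from zero to one, including browser packet capture and mobile app packet capture (e.g. fiddler, mitmproxy); use of the modules that scraping involves, such as requests, beautifulSoup, selenium, appium, scrapy; as well as IP proxies, CAPTCHA recognition, using MySQL and MongoDB databases from Python, multithreaded and multiprocess scrapers, reverse engineering of CSS-based anti-scraping encryption, JS reverse engineering for scraping,…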