Stars
This repository is the official implementation of Disentangling Writer and Character Styles for Handwriting Generation (CVPR 2023)
OpenOCR: A general OCR system built for accuracy and efficiency. It supports 24 Scene Text Recognition methods trained from scratch on large-scale real datasets, and the latest methods will continue to be added.
Effortless data labeling with AI support from Segment Anything and other awesome models.
mPLUG-DocOwl: Modularized Multimodal Large Language Model for Document Understanding
Official Implementations for Paper - MagicQuill: An Intelligent Interactive Image Editing System
[CVPR 2024] PIA, your Personalized Image Animator. Animate your images with text prompts, combined with DreamBooth, to produce stunning videos.
PDF scientific paper translation with preserved formats: AI-based full-text bilingual translation of PDF documents that fully preserves the layout, supports services such as Google/DeepL/Ollama/OpenAI, and provides CLI/GUI/Docker.
DIGImend graphics tablet drivers for the Linux kernel
The official repo for “DocScanner: Robust Document Image Rectification with Progressive Learning”.
A paper list of some recent Transformer-based CV works.
Xplorer, a customizable, modern file manager
🇫🇷 Oh my tmux! My self-contained, pretty & versatile tmux configuration made with ❤️
The book every data scientist needs on their desk.
AAAI 2024 Papers: Explore a comprehensive collection of innovative research papers presented at one of the premier artificial intelligence conferences. Seamlessly integrate code implementations for…
(TPAMI 2024) A Survey on Open Vocabulary Learning
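《李宏毅深度学习教程》 (Hung-yi Lee's Deep Learning Tutorial; recommended by Professor Hung-yi Lee 👍, also known as the "Apple Book" 🍎). PDF download: https://github.com/datawhalechina/leedl-tutorial/releases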
With one command, create a natural-sounding audiobook from a variety of input formats (epub, mobi, txt, PDF, HTML and more!)
A Multimodal Language Agent Framework for Smart Devices and More
A one-stop, open-source, high-quality data extraction tool that converts PDF to Markdown and JSON.
You like pytorch? You like micrograd? You love tinygrad! ❤️
Collection of AWESOME vision-language models for vision tasks
Official code implementation of General OCR Theory: Towards OCR-2.0 via a Unified End-to-end Model
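Continuation of Clash Verge - A Clash Meta GUI based on Tauri (Windows, macOS, Linux)
A collection of AIGC-interview/CV-interview/LLMs-interview questions and answers, which also includes new ideas, new problems, new resources, and new projects from work and research.
GUI for marking bounding boxes of objects in images for training YOLO neural networks
A curated collection of open-source Chinese large language models, focusing on smaller models that can be privately deployed and are cheaper to train, including base models, vertical-domain fine-tuning and applications, datasets, and tutorials.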
OpenMMLab Detection Toolbox and Benchmark
Jittor is a high-performance deep learning framework based on JIT compiling and meta-operators.