Stars
A feature-rich command-line audio/video downloader
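Deep learning interview handbook (covering mathematics, machine learning, deep learning, computer vision, natural language processing, SLAM, and related areas)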
A collection of various awesome lists for hackers, pentesters and security researchers
Chart-to-Text: Generating Natural Language Explanations for Charts by Adapting the Transformer Model
InternLM-XComposer2.5-OmniLive: A Comprehensive Multimodal System for Long-term Streaming Video and Audio Interactions
Golang PDF library for creating and processing PDF files (pure go)
🎨 Diagram as Code for prototyping cloud system architectures
BISHENG is an open LLM devops platform for next generation Enterprise AI applications. Powerful and comprehensive features include: GenAI workflow, RAG, Agent, Unified model management, Evaluation,…
FaceChain is a deep-learning toolchain for generating your Digital-Twin.
Making large AI models cheaper, faster and more accessible
Builds an Agent on top of the InternLM chat 7B base model that can call MMYOLO tools to complete visual tasks on images
Defect-GLM: A Large Visual-Language Model for Industrial Defect Monitoring | The first open-source large-scale visual-language model for industrial defect monitoring
PyMuPDF is a high performance Python library for data extraction, analysis, conversion & manipulation of PDF (and other) documents.
Collaboration with wangxupeng (https://github.com/wangxupeng)
docTR (Document Text Recognition) - a seamless, high-performing & accessible library for OCR-related tasks powered by Deep Learning.
A synthetic data generator for text recognition
A Unified Toolkit for Deep Learning Based Document Image Analysis
Detect and extract the table outlines from a photographed paper form, then recognize the text content inside each outline.
Improved DB & RARE intelligent table segmentation and recognition system