Stars
A collection of notebooks/recipes showcasing some fun and effective ways of using Claude.
✨ The Next Gen Airtable Alternative: No-Code Postgres
Python ETL framework for stream processing, real-time analytics, LLM pipelines, and RAG.
Debug, evaluate, and monitor your LLM applications, RAG systems, and agentic workflows with comprehensive tracing, automated evaluations, and production-ready dashboards.
MiniCPM-o 2.6: A GPT-4o Level MLLM for Vision, Speech and Multimodal Live Streaming on Your Phone
Data Engineering Zoomcamp is a free nine-week course that covers the fundamentals of data engineering.
A curated list of modern Generative Artificial Intelligence projects and services
Ready-to-run cloud templates for RAG, AI pipelines, and enterprise search with live data. 🐳Docker-friendly.⚡Always in sync with Sharepoint, Google Drive, S3, Kafka, PostgreSQL, real-time data APIs,…
📚 Freely available programming books
python测试开发库 中文版(持续更新)及书籍下载。公众号:pythontesting
省市区县乡镇三级或四级城市数据,带拼音标注、坐标、行政区域边界范围;2025年01月14日最新采集,提供csv格式文件,支持在线转成多级联动js代码、通用json格式,提供软件转成shp、geojson、sql、导入数据库;带浏览器里面运行的js采集源码,综合了中华人民共和国民政部、国家统计局、高德地图、腾讯地图行政区划数据
Wiseflow is an agile information mining tool that extracts concise messages from various sources such as websites, WeChat official accounts, social platforms, etc. It automatically categorizes and …
Just the facts -- web page content extraction
PDF scientific paper translation with preserved formats - 基于 AI 完整保留排版的 PDF 文档全文双语翻译,支持 Google/DeepL/Ollama/OpenAI 等服务,提供 CLI/GUI/Docker/Zotero
A realtime serving engine for Data-Intensive Generative AI Applications
Agent Framework / shim to use Pydantic with LLMs
Turns Data and AI algorithms into production-ready web applications in no time.
MinerU是一款开源的高质量PDF解析工具,基于深度学习技术,可自动提取PDF文档中的文字、表格、图片、公式等内容,并提供丰富的分析、统计、搜索等功能。 本项目为其提供一个简化版本的WebUI,方便用户上传PDF文件,并实时展示提取结果。
基于MinerU的桌面应用程序,MinerU是一款开源的高质量PDF解析工具,基于深度学习技术,可自动提取PDF文档中的文字、表格、图片、公式等内容,并提供丰富的分析、统计、搜索等功能。 本项目为其提供一个简化版本的WebUI,方便用户上传PDF文件,并实时展示提取结果。
Distributed web crawler admin platform for spiders management regardless of languages and frameworks. 分布式爬虫管理平台,支持任何语言和框架
python爬虫教程系列、从0到1学习python爬虫,包括浏览器抓包,手机APP抓包,如 fiddler、mitmproxy,各种爬虫涉及的模块的使用,如:requests、beautifulSoup、selenium、appium、scrapy等,以及IP代理,验证码识别,Mysql,MongoDB数据库的python使用,多线程多进程爬虫的使用,css 爬虫加密逆向破解,JS爬虫逆向,…
Scrapoxy is a super proxies manager that orchestrates all your proxies into one place, rather than spreading management across multiple scrapers. It manages IP rotation and fingerprinting, and smar…
🔥🔥超过1000本的计算机经典书籍、个人笔记资料以及本人在各平台发表文章中所涉及的资源等。书籍资源包括C/C++、Java、Python、Go语言、数据结构与算法、操作系统、后端架构、计算机系统知识、数据库、计算机网络、设计模式、前端、汇编以及校招社招各种面经~
Based on RapidOCR, extract the PDF content.
👑 Easy-to-use and powerful NLP and LLM library with 🤗 Awesome model zoo, supporting wide-range of NLP tasks from research to industrial applications, including 🗂Text Classification, 🔍 Neural Search…