Starred repositories
Corrects document distortion, blur, shadows, and similar defects using an ONNX model for simple, lightweight deployment. Will continue to follow the latest and best document rectification approaches and models.
aider is AI pair programming in your terminal
🤗 smolagents: a barebones library for agents. Agents write python code to call tools and orchestrate other agents.
A high-quality, one-stop, open-source data extraction tool for converting PDF to Markdown and JSON.
A simple screen-parsing tool for building pure-vision-based GUI agents
Unofficial Bitwarden compatible server written in Rust, formerly known as bitwarden_rs
Gorilla: Training and Evaluating LLMs for Function Calls (Tool Calls)
Open Source framework for voice and multimodal conversational AI
A collection of prompts, system prompts and LLM instructions
A fast, accurate, lightweight Python library for generating state-of-the-art embeddings
Speech-to-text, text-to-speech, speaker diarization, and VAD using next-gen Kaldi with onnxruntime, without an Internet connection. Supports embedded systems, Android, iOS, HarmonyOS, Raspberry Pi, RISC…
Vertex AI (GCP) Claude Proxy via Cloudflare Workers
A set of beautifully-designed, accessible components and a code distribution platform. Works with your favorite frameworks. Open Source. Open Code.
🪢 Open source LLM engineering platform: LLM Observability, metrics, evals, prompt management, playground, datasets. Integrates with OpenTelemetry, Langchain, OpenAI SDK, LiteLLM, and more. 🍊YC W23
⏩ Create, share, and use custom AI code assistants with our open-source IDE extensions and hub of models, rules, prompts, docs, and other building blocks
A generative speech model for daily dialogue.
RAGFlow is an open-source RAG (Retrieval-Augmented Generation) engine based on deep document understanding.
Full stack, modern web application template. Using FastAPI, React, SQLModel, PostgreSQL, Docker, GitHub Actions, automatic HTTPS and more.
MiniCPM-o 2.6: A GPT-4o Level MLLM for Vision, Speech and Multimodal Live Streaming on Your Phone
FastAPI plugin to enable SSO to most common providers (such as Facebook login, Google login and login via Microsoft Office 365 Account)
Get up and running with Llama 3.3, DeepSeek-R1, Phi-4, Gemma 2, and other large language models.
#1 Locally hosted web application that allows you to perform various operations on PDF files
LlamaIndex is the leading framework for building LLM-powered agents over your data.
A trivial programmatic Llama 3 jailbreak. Sorry Zuck!
A simple and fast backend API, based on Hono, that can search for relevant content on the internet using keywords and convert it into a format suitable for LLM processing. Supports deployment on Cl…