Starred repositories
Official code implementation of General OCR Theory: Towards OCR-2.0 via a Unified End-to-end Model
Official code implementation of Slow Perception: Let's Perceive Geometric Figures Step-by-step
This repository includes the official implementation of OpenScholar: Synthesizing Scientific Literature with Retrieval-augmented LMs.
RAGFlow is an open-source RAG (Retrieval-Augmented Generation) engine based on deep document understanding.
💬 Ready-to-use, flexible RAG Chatbot. A knowledge-base question-answering system built on large models and RAG.
Official repo for "Mini-Gemini: Mining the Potential of Multi-modality Vision Language Models"
Emote Portrait Alive: Generating Expressive Portrait Videos with Audio2Video Diffusion Model under Weak Conditions
21 Lessons, Get Started Building with Generative AI 🔗 https://microsoft.github.io/generative-ai-for-beginners/
A sound cloning tool with a web interface, using your voice or any sound to record audio
Amphion (/æmˈfaɪən/) is a toolkit for Audio, Music, and Speech Generation. Its purpose is to support reproducible research and help junior researchers and engineers get started in the field of audi…
[AI Agent Application Development Framework] - 🚀 Build AI-agent-native applications with very little code 💬 Easily interact with AI agents in code using structured data and chained-call syntax 🧩 Enhance …
FlagAI (Fast LArge-scale General AI models) is a fast, easy-to-use and extensible toolkit for large-scale models.
Drop in a screenshot and convert it to clean code (HTML/Tailwind/React/Vue)
VideoCrafter2: Overcoming Data Limitations for High-Quality Video Diffusion Models
State-of-the-Art Deep Learning scripts organized by models - easy to train and deploy with reproducible accuracy and performance on enterprise-grade infrastructure.
A series of large language models trained from scratch by developers @01-ai
ChatGLM3 series: Open Bilingual Chat LLMs
🎙️🤖 Create, Customize and Talk to your AI Character/Companion in Realtime (All in One Codebase!). Have a natural seamless conversation with AI everywhere (mobile, web and terminal) using LLM OpenAI …
A state-of-the-art open visual language model | multimodal pre-trained model
Create Customized Software using Natural Language Idea (through LLM-powered Multi-Agent Collaboration)
📷 EasyPhoto | Your Smart AI Photo Generator.
FaceChain is a deep-learning toolchain for generating your Digital-Twin.
🌟 The Multi-Agent Framework: First AI Software Company, Towards Natural Language Programming
Official Code for DragGAN (SIGGRAPH 2023)
Private chat with local GPT with documents, images, video, etc. 100% private, Apache 2.0. Supports Ollama, Mixtral, llama.cpp, and more. Demo: https://gpt.h2o.ai/ https://gpt-docs.h2o.ai/
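OpenAI API management and distribution system supporting Azure, Anthropic Claude, Google PaLM 2 & Gemini, Zhipu ChatGLM, Baidu ERNIE Bot (Wenxin Yiyan), iFlytek Spark, Alibaba Tongyi Qianwen, 360 Zhinao, and Tencent Hunyuan. It can be used to manage and redistribute keys, ships as a single executable with a prebuilt Docker image, and deploys in one click, ready to use out of the box. OpenAI key management & redistributi…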
FastGPT is a knowledge-based platform built on LLMs, offering a comprehensive suite of out-of-the-box capabilities such as data processing, RAG retrieval, and visual AI workflow orchestration, le…