Stars
Clean, minimal, accessible reproduction of DeepSeek R1-Zero
DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model
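Python3 package for Chinese/English OCR, with paddleocr-v4 onnx model (~14 MB). Inference is based on the ppocr-v4-onnx model, enabling accurate, millisecond-level OCR prediction on CPU and reaching open-source SOTA for Chinese/English OCR in general scenarios.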
We present StableAnimator, the first end-to-end ID-preserving video diffusion framework, which synthesizes high-quality videos without any post-processing, conditioned on a reference image and a se…
High-resolution models for human tasks.
Everything about ComfyUI, including workflow sharing, resource sharing, knowledge sharing, tutorial sharing, and more.
A one-stop, open-source, high-quality data extraction tool that converts PDF to Markdown and JSON.
Official Implementations for Paper - MagicQuill: An Intelligent Interactive Image Editing System
Industry-leading face manipulation platform
Comprehensive Gradio WebUI for audio processing, powered by Whisper engines (Whisper, Faster-Whisper, Whisper-Timestamped). Features Voice Changer (RVC), zero-shot Voice Cloning (E2, F5-TTS), YouTub…
Official repository of In-Context LoRA for Diffusion Transformers
ComfyUI-Manager is an extension designed to enhance the usability of ComfyUI. It offers management functions to install, remove, disable, and enable various custom nodes of ComfyUI. Furthermore, th…
RAGFlow is an open-source RAG (Retrieval-Augmented Generation) engine based on deep document understanding.
Official inference repo for FLUX.1 models
Amphion (/æmˈfaɪən/) is a toolkit for Audio, Music, and Speech Generation. Its purpose is to support reproducible research and help junior researchers and engineers get started in the field of audi…
Instant voice cloning by MIT and MyShell. Audio foundation model.
OCR, layout analysis, reading order, table recognition in 90+ languages
Official code implementation of General OCR Theory: Towards OCR-2.0 via a Unified End-to-end Model
Multilingual large voice generation model with full-stack inference, training, and deployment capabilities.
Replace OpenAI GPT with another LLM in your app by changing a single line of code. Xinference gives you the freedom to use any LLM you need. With Xinference, you're empowered to run inference with …
Samples for TensorRT/DeepStream for Tesla & Jetson
A natural language interface for computers