    Repositories list

    • Uses AI to generate multi-chapter, long-form novels, automatically carrying context and foreshadowing across chapters
      Python
      GNU Affero General Public License v3.0
      145 · 0 · 0 · 0 · Updated Feb 16, 2025
    • A Flexible Framework for Experiencing Cutting-edge LLM Inference Optimizations
      Python
      Apache License 2.0
      699 · 0 · 0 · 0 · Updated Feb 12, 2025
    • esp-sr
      Public
      Speech recognition
      C
      Other
      115 · 0 · 0 · 0 · Updated Feb 12, 2025
    • goku
      Public
      Video Generation Foundation Models: https://saiyan-world.github.io/goku/
      Python
      234 · 0 · 0 · 0 · Updated Feb 11, 2025
    • RAG Web UI is an intelligent dialogue system based on RAG (Retrieval-Augmented Generation) technology.
      TypeScript
      Apache License 2.0
      170 · 0 · 0 · 0 · Updated Feb 11, 2025
    • InspireMusic: A Unified Framework for Music, Song, Audio Generation.
      Python
      Apache License 2.0
      71 · 0 · 0 · 0 · Updated Feb 10, 2025
    • Give Cursor Agent an AI Team and Advanced Skills
      TypeScript
      MIT License
      108 · 0 · 0 · 0 · Updated Feb 10, 2025
    • Sonic
      Public
      Official implementation of "Sonic: Shifting Focus to Global Audio Perception in Portrait Animation"
      Python
      Other
      143 · 0 · 0 · 0 · Updated Feb 10, 2025
    • FireRedASR is a family of open-source industrial-grade ASR models supporting Mandarin, Chinese dialects and English, achieving a new SOTA on public Mandarin ASR benchmarks, while also offering outstanding singing lyrics recognition capability.
      Python
      Apache License 2.0
      37 · 0 · 0 · 0 · Updated Feb 5, 2025
    • (ICLR 2025) One-Prompt-One-Story: Free-Lunch Consistent Text-to-Image Generation Using a Single Prompt
      Python
      MIT License
      19 · 0 · 0 · 0 · Updated Feb 5, 2025
    • Jupyter Notebook
      MIT License
      299 · 0 · 0 · 0 · Updated Feb 3, 2025
    • R1-V
      Public
      Witness the aha moment of VLM with less than $3.
      Python
      212 · 0 · 0 · 0 · Updated Feb 3, 2025
    • ai-gradio
      Public
      A Python package that makes it easy for developers to create AI apps powered by various AI providers.
      Python
      169 · 0 · 0 · 0 · Updated Feb 1, 2025
    • MILS
      Public
      Code release for "LLMs can see and hear without any training"
      Python
      Other
      16 · 0 · 0 · 0 · Updated Jan 31, 2025
    • RAGEN
      Public
      RAGEN is the first open-source reproduction of DeepSeek-R1 for training agentic models via reinforcement learning.
      Python
      Apache License 2.0
      64 · 0 · 0 · 0 · Updated Jan 30, 2025
    • YuE
      Public
      YuE: Open Full-song Music Generation Foundation Model, something similar to Suno.ai but open
      Python
      425 · 0 · 0 · 0 · Updated Jan 30, 2025
    • Demo showing how to use the OpenAI Realtime API to navigate a 3D scene via tool calling
      TypeScript
      MIT License
      48 · 0 · 0 · 0 · Updated Jan 29, 2025
    • Python
      GNU General Public License v3.0
      8 · 0 · 0 · 0 · Updated Jan 29, 2025
    • PengChengStarling is specifically designed for developing multilingual ASR models based on the icefall project, supporting a complete ASR pipeline that includes data processing, model training, inference, fine-tuning, and deployment.
      Python
      14 · 0 · 0 · 0 · Updated Jan 27, 2025
    • MIT License
      10k · 0 · 0 · 0 · Updated Jan 26, 2025
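    • bailing
      Public
      Bailing (百聆) is a GPT-4o-like voice conversation bot built from ASR + LLM + TTS, with latency as low as 800 ms; it runs on low-spec hardware and supports interruption
      Python
      MIT License
      107 · 0 · 0 · 0 · Updated Jan 19, 2025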
    • TTS with kokoro and onnx runtime
      Python
      MIT License
      150 · 0 · 0 · 0 · Updated Jan 15, 2025
    • WrenAI
      Public
      🤖 Open-source AI Agent that empowers data-driven teams to chat with their data to generate Text-to-SQL, charts, spreadsheets, reports, and BI. 📈📊📋🧑‍💻
      TypeScript
      GNU Affero General Public License v3.0
      558 · 0 · 0 · 0 · Updated Jan 10, 2025
    • Leverage the OpenAI Realtime API (12-17-2024) with this Next.js 15 starter template featuring shadcn/ui components, tool-calling & localization. Use starter to build Voice AI apps with WebRTC.
      TypeScript
      MIT License
      47 · 0 · 0 · 0 · Updated Jan 10, 2025
    • Build your own AI friend
      C
      MIT License
      996 · 0 · 0 · 0 · Updated Jan 6, 2025
    • VITA
      Public
      ✨✨VITA-1.5: Towards GPT-4o Level Real-Time Vision and Speech Interaction
      Python
      Other
      158 · 0 · 0 · 0 · Updated Jan 6, 2025
    • Speech recognition module for Python, supporting several engines and APIs, online and offline.
      Python
      BSD 3-Clause "New" or "Revised" License
      2.4k · 0 · 0 · 0 · Updated Jan 4, 2025
    • Collection of awesome LLM apps with RAG using OpenAI, Anthropic, Gemini and opensource models.
      Python
      Apache License 2.0
      1.9k · 0 · 0 · 0 · Updated Jan 3, 2025
    • TangoFlux
      Public
      TangoFlux: Super Fast and Faithful Text to Audio Generation with Flow Matching
      Jupyter Notebook
      MIT License
      58 · 0 · 0 · 0 · Updated Jan 2, 2025
    • World's First Large-scale High-quality Robotic Manipulation Benchmark
      Jupyter Notebook
      MIT License
      91 · 0 · 0 · 0 · Updated Dec 30, 2024