Stars
Speech-to-text, text-to-speech, speaker diarization, and VAD using next-gen Kaldi with onnxruntime, without an Internet connection. Supports embedded systems, Android, iOS, HarmonyOS, Raspberry Pi, RISC…
AI ContentCraft is an all-in-one content creation suite that helps creators generate stories, podcast scripts, and multimedia content using AI-powered text generation, speech synthesis, and image g…
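百聆 (Bailing) is a GPT-4o-style voice conversation bot built from ASR + LLM + TTS, with latency as low as 800 ms. It runs on low-spec hardware and supports interruption.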
A robust, efficient, low-latency speech-to-text library with advanced voice activity detection, wake word activation and instant transcription.
A Repository for Single- and Multi-modal Speaker Verification, Speaker Recognition and Speaker Diarization
TEN Agent is a conversational AI powered by the TEN, integrating Gemini 2.0 Live, OpenAI Realtime, RTC, and more. It delivers real-time capabilities to see, hear, and speak, while being fully compa…
The official Python library for the OpenAI API
Link Android and PC easily! An all-in-one phone connection assistant!
An SDK for using the Realtime API with microcontrollers like the ESP32
Arduino Audio Tools (a powerful Audio library not only for Arduino)
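stock: fetches stock data, computes stock indicators and chip distribution, recognizes stock patterns, and supports comprehensive stock screening, screening strategies, backtesting, and automated trading, on both PC and mobile devices.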
A Lightweight Face Recognition and Facial Attribute Analysis (Age, Gender, Emotion and Race) Library for Python
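Firefly, the easiest-to-use, high-performance WireGuard VPN server, a plus version of wg-easy. Simple, lightweight, high-performance WireGuard server software, widely applicable to cross-site networking, remote work, NAT traversal, and similar scenarios.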
AirLLM 70B inference with single 4GB GPU
A high-performance, zero-overhead, extensible Python compiler using LLVM
Official code for "F5-TTS: A Fairytaler that Fakes Fluent and Faithful Speech with Flow Matching"
Virtual whiteboard for sketching hand-drawn like diagrams
A file server that supports static serving, uploading, searching, access control, WebDAV...
library & platform to build, distribute, monetize ai apps that have the full context (like rewind, granola, etc.), open source, 100% local, developer friendly. 24/7 screen, mic, keyboard recording …