Stars
Visualizing Cell Structures with Minecraft
Integrate LLMs into any pipeline - fit/predict pattern, JSON-driven flows, and built-in concurrency support.
Build datasets using natural language
A simple, hackable text-to-speech system in PyTorch and MLX
Talk to any LLM with hands-free voice interaction, voice interruption, and a Live2D talking face, running locally across platforms
NeMo text processing for ASR and TTS
This is an on-CPU real-time conversational system for two-way speech communication with AI models, utilizing a continuous streaming architecture for fluid conversations with immediate responses and…
Controllable and fast Text-to-Speech for over 7000 languages!
Leverage the OpenAI Realtime API (12-17-2024) with this Next.js 15 starter template featuring shadcn/ui components, tool-calling & localization. Use starter to build Voice AI apps with WebRTC.
Free UI components I use for building Expo Router apps
Fully open reproduction of DeepSeek-R1
Simple text to phones converter for multiple languages
Speech-to-text, text-to-speech, speaker diarization, and VAD using next-gen Kaldi with onnxruntime without Internet connection. Support embedded systems, Android, iOS, HarmonyOS, Raspberry Pi, RISC…
Dockerized FastAPI wrapper for Kokoro-82M text-to-speech model w/CPU ONNX and NVIDIA GPU PyTorch support, handling, and auto-stitching
LLM based autonomous agent that conducts deep local and web research on any topic and generates a long report with citations.
The most powerful and modular diffusion model GUI, API, and backend with a graph/nodes interface.
An open source real-time AI inference engine for seamless scaling
Pre-trained subword embeddings in 275 languages, based on Byte-Pair Encoding (BPE)
MLX Omni Server is a local inference server powered by Apple's MLX framework, specifically designed for Apple Silicon (M-series) chips. It implements OpenAI-compatible API endpoints, enabling seaml…
A Fundamental End-to-End Speech Recognition Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Recognition, Voice Activity Detection, Text Post-processing etc.
Everything about the SmolLM2 and SmolVLM family of models