- Toulouse
- http://hmongouachon.com
Stars
A natural language interface for computers
Interact with your documents using the power of GPT, 100% privately, no data leaks
A generative speech model for daily dialogue.
Make websites accessible for AI agents
Instant voice cloning by MIT and MyShell. Audio foundation model.
aider is AI pair programming in your terminal
Official code for "F5-TTS: A Fairytaler that Fakes Fluent and Faithful Speech with Flow Matching"
Moshi is a speech-text foundation model and full-duplex spoken dialogue framework. It uses Mimi, a state-of-the-art streaming neural audio codec.
A robust, efficient, low-latency speech-to-text library with advanced voice activity detection, wake word activation and instant transcription.
StyleTTS 2: Towards Human-Level Text-to-Speech through Style Diffusion and Adversarial Training with Large Speech Language Models
Inference and training library for high-quality TTS models.
Open Source framework for voice and multimodal conversational AI
Speech To Speech: an effort for an open-sourced and modular GPT4-o
Your agent in your terminal, equipped with local tools: writes code, uses the terminal, browses the web, vision.
Talk to any LLM with hands-free voice interaction, voice interruption, and Live2D talking face running locally across platforms
Follow along with my AI Agents Masterclass videos! All of the code I create and use in this series on YouTube will be here for you to use and even build on top of!
A modular voice assistant application for experimenting with state-of-the-art transcription, response generation, and text-to-speech models. Supports OpenAI, Groq, ElevenLabs, CartesiaAI, and Deepg…
Desktop AI Assistant powered by o1, o3-mini, GPT-4, GPT-4 Vision, Gemini, Claude, Llama 3, DeepSeek, Bielik, DALL-E, chat, vision, voice control, image generation and analysis, agents, command exec…
Local voice chatbot for engaging conversations, powered by Ollama, Hugging Face Transformers, and Coqui TTS Toolkit
AlwaysReddy is an LLM voice assistant that is always just a hotkey away.
An OpenAI API compatible text to speech server using Coqui AI's xtts_v2 and/or piper tts as the backend.
Sharing early versions of Ada, a personal AI Assistant built on OpenAI's Realtime API
Free, high-quality text-to-speech API endpoint to replace OpenAI, Azure, or ElevenLabs