Stars
Joint speech-language model - respond directly to audio!
Meditron is a suite of open-source medical Large Language Models (LLMs).
Outbound Phone GPT is a sophisticated prototype for a context-aware agent designed to autonomously handle outbound phone calls.
A modified version of SalesGPT with the addition of TTS, STT, and Twilio to make calls. A context-aware AI Sales Agent to automate sales outreach.
Context-aware AI Sales Agent to automate sales outreach.
Framework for orchestrating role-playing, autonomous AI agents. By fostering collaborative intelligence, CrewAI empowers agents to work together seamlessly, tackling complex tasks.
Generative AI phone call toolkit using Twilio Media Streams.
Live transcription in Next.js by Deepgram
A high-throughput and memory-efficient inference and serving engine for LLMs
Collection of notebook guides created by the Brev.dev team!
Automatic Speech Recognition with Speaker Diarization based on OpenAI Whisper
A demo showing near-real-time streaming between Twilio and ElevenLabs
A Python package to build AI-powered real-time audio applications
Instant voice cloning by MIT and MyShell. Audio foundation model.
Live-Transcription (STT) with Whisper PoC
Stable Diffusion web UI
A Gradio web UI for Large Language Models with support for multiple inference backends.
Zero-Shot Speech Editing and Text-to-Speech in the Wild
🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production
Turn PDFs and EPUBs into audiobooks, subtitles or videos into dubbed videos (including translation), and more. For free. Pandrator uses local models, notably XTTS, including voice-cloning (instant,…