Stars
Multilingual Voice Understanding Model
Real-time interactive streaming digital human
Official implementation of "MoMask: Generative Masked Modeling of 3D Human Motions (CVPR2024)"
ChatGLM-6B: An Open Bilingual Dialogue Language Model
The API server version of the SadTalker project. Runs in Docker, 10 times faster than the original!
[CVPR 2023] SadTalker: Learning Realistic 3D Motion Coefficients for Stylized Audio-Driven Single Image Talking Face Animation
This repository contains the code for "A Lip Sync Expert Is All You Need for Speech to Lip Generation In the Wild", published at ACM Multimedia 2020. For an HD commercial model, please try out Sync Labs
VirtualWife is a virtual digital human project that supports Bilibili live streaming as well as OpenAI and Ollama
Self-hosted, high-quality voice recognition for de-googled Android using Whisper. Like Siri or OK Google.
Whispering Tiger - OpenAI's Whisper (and other models) with OSC and WebSocket support, allowing live transcription / translation in VRChat and overlays in most streaming applications
[ICCV'23] Efficient Region-Aware Neural Radiance Fields for High-Fidelity Talking Portrait Synthesis
High-performance GPGPU inference of OpenAI's Whisper automatic speech recognition (ASR) model
Real-time transcription with OpenAI Whisper.
WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization)
A TensorFlow implementation of DeepMind's WaveNet paper
😝 TensorFlowTTS: Real-Time State-of-the-art Speech Synthesis for TensorFlow 2 (supports English, French, Korean, Chinese, and German, and is easy to adapt to other languages)
Easy-to-use Speech Toolkit including Self-Supervised Learning model, SOTA/Streaming ASR with punctuation, Streaming TTS with text frontend, Speaker Verification System, End-to-End Speech Translatio…
Editor for VOICEVOX, a free-to-use, medium-quality text-to-speech software
Robust Speech Recognition via Large-Scale Weak Supervision
Automatic Speech Recognition (ASR) and Text-To-Speech (TTS) engine. Chinese and English speech recognition, multi-role speech synthesis, multilingual support, high accuracy
Voice Conversion by CycleGAN (voice cloning / voice conversion): CycleGAN-VC2
Chinese speech recognition; Mandarin Automatic Speech Recognition
🚀 AI voice cloning: Clone a voice in 5 seconds to generate arbitrary speech in real time