Stars
A sample market maker to demonstrate usage of the python-okx SDK
Hallo: Hierarchical Audio-Driven Visual Synthesis for Portrait Image Animation
Easily train a good VC model with voice data <= 10 mins!
AniPortrait: Audio-Driven Synthesis of Photorealistic Portrait Animation
Emote Portrait Alive: Generating Expressive Portrait Videos with Audio2Video Diffusion Model under Weak Conditions
Industry leading face manipulation platform
Fay is an open-source digital human framework integrating language models and digital characters. It offers retail, assistant, and agent versions for diverse applications like virtual shopping guid…
[CVPR 2023] SadTalker: Learning Realistic 3D Motion Coefficients for Stylized Audio-Driven Single Image Talking Face Animation
1 min voice data can also be used to train a good TTS model! (few shot voice cloning)
YourTTS: Towards Zero-Shot Multi-Speaker TTS and Zero-Shot Voice Conversion for everyone
This repository contains the code for "A Lip Sync Expert Is All You Need for Speech to Lip Generation In the Wild", published at ACM Multimedia 2020. For the HD commercial model, please try out Sync Labs
Easy-to-use Speech Toolkit including Self-Supervised Learning model, SOTA/Streaming ASR with punctuation, Streaming TTS with text frontend, Speaker Verification System, End-to-End Speech Translatio…
kaldi-asr/kaldi is the official location of the Kaldi project.
speech-aligner is a tool that generates phoneme-level time alignment between human speech and its transcription
The official Python API for ElevenLabs Text to Speech.
The official Python API for Revocalize AI voice synthesizer platform.
EmotiVoice 😊: a Multi-Voice and Prompt-Controlled TTS Engine
Singing voice conversion based on Whisper, with LoRA for singing voice cloning
An open source implementation of Microsoft's VALL-E X zero-shot TTS model. A demo is available at https://plachtaa.github.io/vallex/
Core Engine of Singing Voice Conversion & Singing Voice Clone
😝 TensorFlowTTS: Real-Time State-of-the-art Speech Synthesis for Tensorflow 2 (supports English, French, Korean, Chinese, and German, and is easy to adapt to other languages)
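Continues training on the Biaobei (标贝) data and improves the original FastSpeech2 model by introducing prosody representation and prosody prediction modules, making Chinese pronunciation more vivid and rhythmic
🚀 AI voice cloning: clone a voice in 5 seconds to generate arbitrary speech in real time
ChatGPT Plus / OpenAI API sharing solution.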
C0untFloyd / bark-gui
Forked from suno-ai/bark
🔊 Text-Prompted Generative Audio Model with Gradio