Easy-to-use Speech Toolkit including Self-Supervised Learning model, SOTA/Streaming ASR with punctuation, Streaming TTS with text frontend, Speaker Verification System, End-to-End Speech Translatio…

Python 11,333 1,865 Updated Dec 31, 2024

Rudrabha / Wav2Lip

This repository contains the code for "A Lip Sync Expert Is All You Need for Speech to Lip Generation In the Wild", published at ACM Multimedia 2020. For the HD commercial model, try Sync Labs.

Python 11,062 2,341 Updated Nov 26, 2024

ibab / tensorflow-wavenet

A TensorFlow implementation of DeepMind's WaveNet paper

Python 5,422 1,293 Updated Jul 12, 2023

lipku / LiveTalking

Real-time interactive streaming digital human

Python 4,226 616 Updated Dec 29, 2024

FunAudioLLM / SenseVoice

Multilingual Voice Understanding Model

Python 3,873 344 Updated Nov 29, 2024

TensorSpeech / TensorFlowTTS

😝 TensorFlowTTS: Real-time, state-of-the-art speech synthesis for TensorFlow 2 (supports English, French, Korean, Chinese, and German; easy to adapt to other languages)

Python 3,868 816 Updated Jul 5, 2024

zhaipro / easy12306

Automatic recognition of 12306 CAPTCHAs using machine learning algorithms

Python 2,895 738 Updated Mar 4, 2021

nickliqian / cnn_captcha

Recognize CAPTCHAs with a CNN in TensorFlow. This project targets character-based image CAPTCHAs and implements a convolutional neural network in TensorFlow to recognize them.

Python 2,795 784 Updated Dec 8, 2022

davabase / whisper_real_time

Real-time transcription with OpenAI Whisper.

Python 2,474 416 Updated Jun 1, 2024

r9y9 / wavenet_vocoder

WaveNet vocoder

Python 2,333 500 Updated Jul 29, 2023

yakami129 / VirtualWife

VirtualWife is a virtual digital human project that supports Bilibili live streaming and works with OpenAI and Ollama

Python 2,127 324 Updated Oct 27, 2024

nobody132 / masr

Mandarin Automatic Speech Recognition

Python 1,895 482 Updated Jul 25, 2024

Fictionarry / ER-NeRF

[ICCV'23] Efficient Region-Aware Neural Radiance Fields for High-Fidelity Talking Portrait Synthesis

Python 1,104 140 Updated Jul 12, 2024

EricGuo5513 / momask-codes

Official implementation of "MoMask: Generative Masked Modeling of 3D Human Motions (CVPR2024)"

Python 899 72 Updated Sep 13, 2024

jackaduma / CycleGAN-VC2

Voice Conversion by CycleGAN (voice cloning / voice conversion): CycleGAN-VC2

Python 543 108 Updated Jun 10, 2023

moxiegushi / zhihu

Zhihu crawler (with automatic CAPTCHA recognition)

Python 535 147 Updated Jul 15, 2018

shibing624 / parrots

Automatic Speech Recognition (ASR) and Text-To-Speech (TTS) engine. Chinese and English speech recognition, multi-speaker speech synthesis, multilingual support, high accuracy

Python 483 89 Updated Dec 4, 2024

Sharrnah / whispering

Whispering Tiger - OpenAI's whisper (and other models) with OSC and Websocket support, allowing live transcription / translation in VRChat and overlays in most streaming applications

Python 405 31 Updated Dec 24, 2024

xdcesc / my_ch_speech_recognition

Speech recognition with Python

Python 144 541 Updated Feb 16, 2022

kenwaytis / faster-SadTalker-API

The API server version of the SadTalker project. Runs in Docker, 10 times faster than the original!

Python 127 22 Updated Aug 2, 2023

rpdrewes / whisper-websocket-server

Self-hosted, high-quality voice recognition for de-googled Android using whisper. Like Siri or OK Google.

Python 55 7 Updated Dec 30, 2023

wanggang1987 / fast_sadtalker

Python 12 3 Updated Sep 4, 2023

mengguanzhou

Stars

openai / whisper

THUDM / ChatGLM-6B

babysor / MockingBird

m-bain / whisperX

OpenTalker / SadTalker

PaddlePaddle / PaddleSpeech