27 stars written in Python

Robust Speech Recognition via Large-Scale Weak Supervision

Python 73,599 8,798 Updated Dec 1, 2024

ChatGLM-6B: An Open Bilingual Dialogue Language Model

Python 40,924 5,237 Updated Jun 27, 2024

🚀 AI voice cloning: Clone a voice in 5 seconds to generate arbitrary speech in real-time

Python 35,569 5,218 Updated Nov 15, 2024

WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization)

Python 13,113 1,412 Updated Jan 1, 2025

[CVPR 2023] SadTalker: Learning Realistic 3D Motion Coefficients for Stylized Audio-Driven Single Image Talking Face Animation

Python 12,149 2,261 Updated Jun 26, 2024

Easy-to-use Speech Toolkit including Self-Supervised Learning model, SOTA/Streaming ASR with punctuation, Streaming TTS with text frontend, Speaker Verification System, End-to-End Speech Translatio…

Python 11,333 1,865 Updated Dec 31, 2024

This repository contains the code for "A Lip Sync Expert Is All You Need for Speech to Lip Generation In the Wild", published at ACM Multimedia 2020. For an HD commercial model, please try out Sync Labs

Python 11,062 2,341 Updated Nov 26, 2024

A TensorFlow implementation of DeepMind's WaveNet paper

Python 5,422 1,293 Updated Jul 12, 2023

Real time interactive streaming digital human

Python 4,226 616 Updated Dec 29, 2024

Multilingual Voice Understanding Model

Python 3,873 344 Updated Nov 29, 2024

😝 TensorFlowTTS: Real-Time State-of-the-art Speech Synthesis for TensorFlow 2 (supports English, French, Korean, Chinese, and German, and is easy to adapt to other languages)

Python 3,868 816 Updated Jul 5, 2024

Automatic recognition of 12306 CAPTCHAs using machine learning algorithms

Python 2,895 738 Updated Mar 4, 2021

Recognize CAPTCHAs with a CNN in TensorFlow. This project targets character-based image CAPTCHAs and implements a convolutional neural network in TensorFlow to recognize them.

Python 2,795 784 Updated Dec 8, 2022

Real time transcription with OpenAI Whisper.

Python 2,474 416 Updated Jun 1, 2024

WaveNet vocoder

Python 2,333 500 Updated Jul 29, 2023

VirtualWife is a virtual digital human project that supports Bilibili live streaming as well as OpenAI and Ollama

Python 2,127 324 Updated Oct 27, 2024

Mandarin Automatic Speech Recognition

Python 1,895 482 Updated Jul 25, 2024

[ICCV'23] Efficient Region-Aware Neural Radiance Fields for High-Fidelity Talking Portrait Synthesis

Python 1,104 140 Updated Jul 12, 2024

Official implementation of "MoMask: Generative Masked Modeling of 3D Human Motions (CVPR2024)"

Python 899 72 Updated Sep 13, 2024

Voice Conversion by CycleGAN (voice cloning / voice conversion): CycleGAN-VC2

Python 543 108 Updated Jun 10, 2023

Zhihu crawler (automatic CAPTCHA recognition)

Python 535 147 Updated Jul 15, 2018

Automatic Speech Recognition (ASR) and Text-To-Speech (TTS) engine. Chinese and English speech recognition, multi-speaker speech synthesis, multilingual support, high accuracy

Python 483 89 Updated Dec 4, 2024

Whispering Tiger - OpenAI's whisper (and other models) with OSC and Websocket support. Allowing live transcription / translation in VRChat and Overlays in most Streaming Applications

Python 405 31 Updated Dec 24, 2024

Speech recognition with Python

Python 144 541 Updated Feb 16, 2022

The API server version of the SadTalker project. Runs in Docker, 10 times faster than the original!

Python 127 22 Updated Aug 2, 2023

Self-hosted, high-quality voice recognition for de-googled Android using Whisper. Like Siri or OK Google.

Python 55 7 Updated Dec 30, 2023