Stars

Audio/TTS

28 repositories

🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production

Python 38,311 4,800 Updated Aug 16, 2024

VITS: Conditional Variational Autoencoder with Adversarial Learning for End-to-End Text-to-Speech

Python 7,205 1,314 Updated Dec 6, 2023

An implementation of Microsoft's "FastSpeech 2: Fast and High-Quality End-to-End Text to Speech"

Python 1,967 558 Updated Oct 27, 2023

PyTorch implementation of VALL-E (Zero-Shot Text-to-Speech). Reproduced demo: https://lifeiteng.github.io/valle/index.html

Python 2,101 321 Updated Nov 14, 2023

🔊 Text-Prompted Generative Audio Model

Jupyter Notebook 37,162 4,390 Updated Aug 19, 2024

Implementation of Natural Speech 2, Zero-shot Speech and Singing Synthesizer, in Pytorch

Python 1,314 104 Updated Sep 24, 2023

An open-source implementation of Microsoft's VALL-E X zero-shot TTS model. A demo is available at https://plachtaa.github.io/vallex/

Python 7,820 776 Updated Feb 11, 2024

StyleTTS 2: Towards Human-Level Text-to-Speech through Style Diffusion and Adversarial Training with Large Speech Language Models

Python 5,530 511 Updated Aug 10, 2024

Amphion (/æmˈfaɪən/) is a toolkit for Audio, Music, and Speech Generation. Its purpose is to support reproducible research and help junior researchers and engineers get started in the field of audi…

Python 8,655 673 Updated Mar 3, 2025

Foundational Models for State-of-the-Art Speech and Text Translation

Jupyter Notebook 11,378 1,120 Updated Nov 14, 2024

vits2 backbone with multilingual-bert

Python 8,305 1,171 Updated Mar 10, 2025

The official implementation of HierSpeech++

Python 1,213 148 Updated Feb 20, 2024

Audiocraft is a library for audio processing and generation with deep learning. It features the state-of-the-art EnCodec audio compressor / tokenizer, along with MusicGen, a simple and controllable…

Jupyter Notebook 21,624 2,265 Updated Jan 15, 2025

Just 1 minute of voice data can be used to train a good TTS model! (few-shot voice cloning)

Python 42,061 4,689 Updated Mar 5, 2025

Foundational model for human-like, expressive TTS

Python 4,060 677 Updated Jul 30, 2024

Zero-Shot Speech Editing and Text-to-Speech in the Wild

Jupyter Notebook 8,173 782 Updated Jun 24, 2024

Inference and training library for high-quality TTS models.

Python 5,103 538 Updated Dec 10, 2024

SOTA Open Source TTS

Python 19,873 1,541 Updated Mar 3, 2025

A generative speech model for daily dialogue.

Python 35,003 3,780 Updated Feb 18, 2025

DiffSinger: Singing Voice Synthesis via Shallow Diffusion Mechanism (SVS & TTS); AAAI 2022; Official code

Python 4,409 729 Updated May 2, 2023

High-quality multilingual text-to-speech library by MyShell.ai. Supports English, Spanish, French, Chinese, Japanese, and Korean.

Python 5,729 764 Updated Dec 24, 2024

Multi-lingual large voice generation model, providing inference, training and deployment full-stack ability.

Python 11,736 1,164 Updated Mar 10, 2025

Instant voice cloning by MIT and MyShell. Audio foundation model.

Python 31,274 3,147 Updated Jan 7, 2025

TTS models for Arabic (Tacotron2, FastPitch)

Jupyter Notebook 107 28 Updated Nov 5, 2024

Deep learning for Arabic text vocalization (automatic diacritization of Arabic text)

Python 341 44 Updated Mar 25, 2023

A book about Text-to-Speech (TTS) in Chinese.

TeX 596 80 Updated Apr 19, 2022

Official code for "F5-TTS: A Fairytaler that Fakes Fluent and Faithful Speech with Flow Matching"

Python 10,210 1,397 Updated Feb 24, 2025