Stars
Multi-Speaker PyTorch FastSpeech2: Fast and High-Quality End-to-End Text to Speech ✊
PyTorch Implementation of Non-autoregressive Expressive (emotional, conversational) TTS based on FastSpeech2, supporting English, Korean, and your own languages.
An unofficial PyTorch implementation of the audio LM VALL-E
NANSY++: Unified Voice Synthesis with Neural Analysis and Synthesis
RelGAN: Multi-Domain Image-to-Image Translation via Relative Attributes
Python Implementation of Visual Relative Attributes for Image Classification and Zero Shot Learning
Implementation of Vision Transformer, a simple way to achieve SOTA in vision classification with only a single transformer encoder, in PyTorch
Implementation of Voicebox, a new SOTA text-to-speech network from Meta AI, in PyTorch
[ICASSP 2024] 🍵 Matcha-TTS: A fast TTS architecture with conditional flow matching
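Continued training on the Biaobei (标贝) dataset, with improvements to the original FastSpeech2 model: prosody representation and prosody prediction modules were introduced to make Chinese pronunciation more vivid and rhythmic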
A simple tool for easily using Montreal Forced Aligner. Also provides alignments (TextGrid) retrieved from ESD.
An unofficial PyTorch implementation of Mix-Phoneme-Bert
AudioLDM: Generate speech, sound effects, music and beyond, with text.
Implementation of Natural Speech 2, Zero-shot Speech and Singing Synthesizer, in PyTorch