Stars
An implementation of Microsoft's "FastSpeech 2: Fast and High-Quality End-to-End Text to Speech"
Zero-Shot Speech Editing and Text-to-Speech in the Wild
Text to speech alignment using CTC forced alignment
Code for ACL 2024 main conference paper "Can We Achieve High-quality Direct Speech-to-Speech Translation Without Parallel Speech Data?".
Vector-quantized PPG-based voice conversion
An open-source accent conversion model based on the Real-Time Voice Cloning repository
Unsupervised Speech Decomposition Via Triple Information Bottleneck
A diffusion model trained to generate musical pieces in the MIDI format. It introduces a frequency-based pitch encoding, a multi-instrument-oriented model and a lightweight music-adapted v…
Foreign Accent Conversion by Synthesizing Speech from Phonetic Posteriorgrams (Interspeech'19)
A sequence-to-sequence voice conversion toolkit.
PPG-based Voice Conversion Using Zero-Shot Learning
Zero-Shot Foreign Accent Conversion without a Native Reference
Pronunciation correction in vector quantized PPG representation space