Stars
Image-to-image translation with conditional adversarial nets
Lightweight and High-Fidelity End-to-End Text-to-Speech with Multi-Band Generation and Inverse Short-Time Fourier Transform
Elucidating the Design Space of Diffusion-Based Generative Models (EDM)
A neural network for end-to-end speech denoising
PyTorch Implementation of TCSinger (EMNLP 2024): Zero-Shot Singing Voice Synthesis with Style Transfer and Multi-Level Style Control
PyTorch Implementation of StyleSinger (AAAI 2024): Style Transfer for Out-of-Domain Singing Voice Synthesis
Singing Voice Synthesis based on VITS, different from VISinger
A book about Text-to-Speech (TTS) in Chinese.
Official PyTorch Implementation of "Diff-HierVC: Diffusion-based Hierarchical Voice Conversion with Robust Pitch Generation and Masked Prior for Zero-shot Speaker Adaptation"
End-to-End Zero-Shot Voice Conversion with Location-Variable Convolutions
ACM MM 2023 CoMoSpeech: One-Step Speech and Singing Voice Synthesis via Consistency Model
Implementation of Natural Speech 2, Zero-shot Speech and Singing Synthesizer, in PyTorch
Pre-Training with Whole Word Masking for Chinese BERT (Chinese BERT-wwm model series)
Helsinki Prosody Corpus and A System for Predicting Prosodic Prominence from Text
Prosodic: a metrical-phonological parser, written in Python. For English and Finnish, with flexible language support.
Chat with any character you like: ChatGLM2 + SadTalker + Voice Cloning | Immersive conversations with your favorite characters: ChatGLM2 + voice cloning + video dialogue
Bark Voice Cloning and Voice Cloning for Chinese Speech
Official implementation of the paper "SPEAKER VGG CCT: Cross-corpus Speech Emotion Recognition with Speaker Embedding and Vision Transformers, 2022"
Fast and memory-efficient exact attention
This is now the official location of the Merlin project.
Mel cepstral distortion (MCD) computation in Python. Uses the Merlin toolkit to convert .wav files to .gcm files. Works with all forms of .wav files.
VITS2 for Chinese speech | Latest VITS2 Chinese speech synthesis
Amphion (/æmˈfaɪən/) is a toolkit for Audio, Music, and Speech Generation. Its purpose is to support reproducible research and help junior researchers and engineers get started in the field of audi…