Official Pytorch Implementation of "Diff-HierVC: Diffusion-based Hierarchical Voice Conversion with Robust Pitch Generation and Masked Prior for Zero-shot Speaker Adaptation"

Python 211 19 Updated Jul 3, 2024

wonjune-kang / lvc-vc

End-to-End Zero-Shot Voice Conversion with Location-Variable Convolutions

Python 88 6 Updated Nov 6, 2023

abarankab / DDPM

PyTorch DDPM implementation

Python 717 109 Updated May 23, 2022

zhenye234 / CoMoSpeech

ACM MM 2023 CoMoSpeech: One-Step Speech and Singing Voice Synthesis via Consistency Model

Python 199 21 Updated Apr 26, 2024

lucidrains / naturalspeech2-pytorch

Implementation of Natural Speech 2, Zero-shot Speech and Singing Synthesizer, in Pytorch

Python 1,306 104 Updated Sep 24, 2023

espnet / espnet

End-to-End Speech Processing Toolkit

Python 8,751 2,211 Updated Feb 5, 2025

ymcui / Chinese-BERT-wwm

Pre-Training with Whole Word Masking for Chinese BERT (Chinese BERT-wwm model series)

Python 9,799 1,393 Updated Jul 31, 2023

Helsinki-NLP / prosody

Helsinki Prosody Corpus and A System for Predicting Prosodic Prominence from Text

Python 240 39 Updated Oct 30, 2019

quadrismegistus / prosodic

Prosodic: a metrical-phonological parser, written in Python. For English and Finnish, with flexible language support.

JavaScript 281 42 Updated Dec 10, 2024

Riroaki / Chinese-Rhythm-Predictor

A Chinese prosody prediction model based on random forests and conditional random fields

Python 28 5 Updated Jul 25, 2024

KevinWang676 / ChatGLM2-Voice-Cloning

Chat with any character you like: ChatGLM2+SadTalker+Voice Cloning | Have immersive conversations with your favorite characters: ChatGLM2 + voice cloning + video chat

Python 598 92 Updated Aug 11, 2023

KevinWang676 / Bark-Voice-Cloning

Bark Voice Cloning and Voice Cloning for Chinese Speech

Jupyter Notebook 2,831 409 Updated Aug 8, 2024

JabuMlDev / Speaker-VGG-CCT

Official implementation of the paper "SPEAKER VGG CCT: Cross-corpus Speech Emotion Recognition with Speaker Embedding and Vision Transformers, 2022"

Python 20 4 Updated Feb 17, 2023

CarlWangChina / SaMoye-SVC

dog-can-sing-song

Python 18 2 Updated Nov 1, 2024

Dao-AILab / flash-attention

Fast and memory-efficient exact attention

Python 15,328 1,443 Updated Feb 4, 2025

CSTR-Edinburgh / merlin

This is now the official location of the Merlin project.

Python 1,308 440 Updated Mar 3, 2020

Lukelluke / MCD-MEL-CEPSTRAL-DISTANCE-MCD-application

Mel cepstral distortion (MCD) computations in Python. Uses the Merlin toolkit to convert .wav files to .gcm files. Works with all forms of .wav files.

Shell 20 3 Updated Sep 4, 2020

uthree / vits2p

modified VITS2 for pitch manipulation and quality

Python 9 2 Updated Sep 4, 2024

KevinWang676 / VITS2-Chinese

VITS2 for Chinese speech | Latest VITS2 Chinese speech synthesis

Python 130 15 Updated Oct 26, 2023

open-mmlab / Amphion

Amphion (/æmˈfaɪən/) is a toolkit for Audio, Music, and Speech Generation. Its purpose is to support reproducible research and help junior researchers and engineers get started in the field of audi…

Python 8,430 645 Updated Feb 3, 2025

shinjiwlab / versa

Versatile Evaluation of Speech and Audio

Python 155 13 Updated Feb 5, 2025

liumingda

Stars

phillipi / pix2pix

MasayaKawamura / MB-iSTFT-VITS

MusicTextSynaesthesia / MusicTextSynaesthesia

NVlabs / edm

drethage / speech-denoising-wavenet

AaronZ345 / TCSinger

AaronZ345 / StyleSinger

PlayVoice / VI-SVS

cnlinxi / book-text-to-speech

hayeong0 / Diff-HierVC