- Gachon University
- Seoul, South Korea (UTC +09:00)
- https://killerwhale0917.tistory.com/
- https://orca0917.github.io/
Stars
An implementation of Tacotron 2 that supports multilingual experiments with parameter-sharing, code-switching, and voice cloning.
Unofficial Parallel WaveGAN (+ MelGAN & Multi-band MelGAN & HiFi-GAN & StyleMelGAN) with PyTorch
[ICASSP 2024] 🍵 Matcha-TTS: A fast TTS architecture with conditional flow matching
A Flow-based Generative Network for Speech Synthesis
A Generative Flow for Text-to-Speech via Monotonic Alignment Search
DE4E: Data Engineering for Everybody by Pseudo-Lab
Flowtron is an auto-regressive flow-based generative network for text to speech synthesis with control over speech variation and style transfer
Official repository of SepReformer for speech separation
A Fundamental End-to-End Speech Recognition Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Recognition, Voice Activity Detection, Text Post-processing etc.
Implementation of the paper: Channel-wise Gated Res2Net: Towards Robust Detection of Synthetic Speech Attacks (INTERSPEECH 2021)
(ImageNet pretrained models) The official PyTorch implementation of the TPAMI paper "Res2Net: A New Multi-scale Backbone Architecture"
Implementation of the paper: Replay and Synthetic Speech Detection with Res2Net architecture (ICASSP 2021) https://arxiv.org/abs/2010.15006
Mellotron: a multispeaker voice synthesis model based on Tacotron 2 GST that can make a voice emote and sing without emotive or singing training data
A deep learning lyrics-to-audio alignment system, generating synchronized lyrics from a song and its lyrics
Using temporal convolution to detect Audio Deepfakes
🤖💬 Transformer TTS: Implementation of a non-autoregressive Transformer based neural network for text to speech.
👦 👧 Technical-Interview guidelines written for those who started studying programming. I wish you all the best. 👾
A TensorFlow implementation of Google's Tacotron speech synthesis with pre-trained model (unofficial)
PyTorch implementation of Tacotron speech synthesis model.
A simple implementation of Principal Component Analysis (PCA) visualized using Fashion MNIST Dataset. Thanks to https://github.com/zalandoresearch/fashion-mnist for making the dataset.
Reconstruction and Compression of Color Images Using Principal Component Analysis (PCA) Algorithm
The Python script shows the image reconstructed using 200 principal components (out of 512).
Automatic Speech Recognition (ASR), Speaker Verification, Speech Synthesis, Text-to-Speech (TTS), Language Modelling, Singing Voice Synthesis (SVS), Voice Conversion (VC)
This repository provides a multi-mode and multi-speaker expressive speech synthesis framework, including multi-attentive Tacotron, DurIAN, Non-attentive Tacotron, GST, VAE, GMVAE, and X-vectors for…
🔊 Text-Prompted Generative Audio Model
PyTorch Implementation of Non-autoregressive Expressive (emotional, conversational) TTS based on FastSpeech2, supporting English, Korean, and your own languages.
Implementation of Korean FastSpeech2