yanggeng1995 / GAN-TTS Public

GAN-TTS

A PyTorch implementation of GAN-TTS: High Fidelity Speech Synthesis with Adversarial Networks (https://arxiv.org/pdf/1909.11646.pdf)

Download a dataset for training. Any collection of WAV files with a sample rate of 24000 Hz will work.
Process: python process.py --wav_dir="wavs" --output="data"
Edit configuration in utils/audio.py
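
Since the training data is expected to be sampled at 24000 Hz, it can help to verify the WAV files before running process.py. The following is a minimal sketch using only the Python standard library; the helper name `find_mismatched_wavs` is hypothetical and not part of this repository, and the `wave` module only reads uncompressed PCM WAV files.

```python
import wave
from pathlib import Path

# Sample rate stated in this README; adjust if utils/audio.py is configured differently.
EXPECTED_SAMPLE_RATE = 24000


def find_mismatched_wavs(wav_dir, expected_rate=EXPECTED_SAMPLE_RATE):
    """Return (file name, sample rate) pairs for WAV files whose rate differs.

    Hypothetical helper for illustration; it is not part of this repository.
    """
    mismatched = []
    for path in sorted(Path(wav_dir).glob("*.wav")):
        # wave.open reads the header of an uncompressed PCM WAV file.
        with wave.open(str(path), "rb") as wav_file:
            rate = wav_file.getframerate()
        if rate != expected_rate:
            mismatched.append((path.name, rate))
    return mismatched
```

For example, `find_mismatched_wavs("wavs")` returns an empty list when every file in the directory is already at 24000 Hz; any file it reports would need resampling first.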

I did not use the loss function described in the paper. Instead, I modified it, borrowing from Parallel WaveGAN (https://arxiv.org/pdf/1910.11480.pdf).
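
For readers unfamiliar with Parallel WaveGAN, its training objective combines a multi-resolution STFT loss with a least-squares adversarial loss. The sketch below illustrates that general idea in PyTorch. It is not the code used in this repository: the function names, the STFT resolutions, and the weight `lambda_adv` are assumptions taken from the Parallel WaveGAN paper's description, and the actual loss in train.py may differ.

```python
import torch
import torch.nn.functional as F

# (n_fft, hop_length, win_length) triples; assumed values, not read from this repo.
RESOLUTIONS = ((1024, 120, 600), (2048, 240, 1200), (512, 50, 240))


def stft_magnitude(x, n_fft, hop_length, win_length):
    """Magnitude spectrogram of a (batch, samples) waveform tensor."""
    window = torch.hann_window(win_length, device=x.device)
    spec = torch.stft(x, n_fft, hop_length=hop_length, win_length=win_length,
                      window=window, return_complex=True)
    # Clamp so the logarithm below stays finite.
    return spec.abs().clamp(min=1e-7)


def multi_resolution_stft_loss(fake, real, resolutions=RESOLUTIONS):
    """Average of spectral-convergence and log-magnitude losses over resolutions."""
    total = 0.0
    for n_fft, hop_length, win_length in resolutions:
        mag_fake = stft_magnitude(fake, n_fft, hop_length, win_length)
        mag_real = stft_magnitude(real, n_fft, hop_length, win_length)
        # Spectral convergence: Frobenius norm of the difference, normalized.
        diff_norm = (mag_real - mag_fake).pow(2).sum().sqrt()
        real_norm = mag_real.pow(2).sum().sqrt()
        spectral_convergence = diff_norm / real_norm
        # Mean absolute difference of log magnitudes.
        log_magnitude = F.l1_loss(torch.log(mag_fake), torch.log(mag_real))
        total = total + spectral_convergence + log_magnitude
    return total / len(resolutions)


def generator_loss(fake, real, d_fake, lambda_adv=4.0):
    """STFT loss plus a weighted least-squares adversarial term (assumed weight)."""
    adversarial = F.mse_loss(d_fake, torch.ones_like(d_fake))
    return multi_resolution_stft_loss(fake, real) + lambda_adv * adversarial


def discriminator_loss(d_real, d_fake):
    """Least-squares GAN loss: push real outputs to 1 and fake outputs to 0."""
    real_term = F.mse_loss(d_real, torch.ones_like(d_real))
    fake_term = F.mse_loss(d_fake, torch.zeros_like(d_fake))
    return real_term + fake_term
```

Here `fake` and `real` are waveform batches of shape (batch, samples), and `d_fake` and `d_real` are the discriminator outputs for them. The STFT term gives the generator a stable spectral target, while the adversarial term is meant to sharpen fine detail.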

This is not an official implementation, and some details may not be correct.
The current results still contain some noise; I suspect this is caused by the batch size.
Work in progress.