A PyTorch implementation of GAN-TTS: High Fidelity Speech Synthesis with Adversarial Networks

tameszaza/GAN-TTS

GAN-TTS

A PyTorch implementation of GAN-TTS: High Fidelity Speech Synthesis with Adversarial Networks (https://arxiv.org/pdf/1909.11646.pdf)

Prepare dataset

  • Download a dataset for training. Any set of WAV files with a sample rate of 24000 Hz will work.
  • Edit the configuration in utils/audio.py (hop_length must remain unchanged).
  • Process data: python process.py --wav_dir="wavs" --output="data"
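
Before running process.py, it can help to confirm that every file really is sampled at 24000 Hz. The following is a minimal sketch using only the Python standard library. It is not part of this repository: the helper name find_mismatched_wavs is hypothetical, and it assumes the files are uncompressed PCM WAV files sitting directly in the directory (the wave module raises wave.Error for other encodings).

```python
import wave
from pathlib import Path

# Sample rate required by the dataset step above.
EXPECTED_SAMPLE_RATE = 24000


def find_mismatched_wavs(wav_dir, expected_rate=EXPECTED_SAMPLE_RATE):
    """Return (file name, sample rate) pairs for WAV files not at expected_rate.

    Hypothetical helper for illustration; only looks at *.wav files
    directly inside wav_dir, not in subdirectories.
    """
    mismatched = []
    for path in sorted(Path(wav_dir).glob("*.wav")):
        # wave.open reads the header of an uncompressed PCM WAV file.
        with wave.open(str(path), "rb") as wav_file:
            rate = wav_file.getframerate()
        if rate != expected_rate:
            mismatched.append((path.name, rate))
    return mismatched
```

Calling find_mismatched_wavs("wavs") returns an empty list when all files match; any entries it does return would need resampling with a tool of your choice before processing.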

Train & Tensorboard

  • python train.py --input="data/train"
  • tensorboard --logdir logdir

Inference

  • python generate.py --input="data/test"

Result

  • You can find the results in the samples directory.

Attention

Notes

  • This is not an official implementation; some details are not necessarily correct.
  • The current results still contain some noise; I suspect it is caused by the batch size.
  • Work in progress.
