Web server wrapper for WaveRNN by fatchord. Start it with python server.py
and POST text to localhost:5700 to generate .wav files.
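A minimal client sketch for posting text (assumptions: the server accepts raw text in the POST body at the root path and replies with the WAV bytes; check server.py for the actual route and payload format, and install requests if it isn't already in requirements.txt):

```python
# Sketch only: the endpoint path, payload format, and response handling are
# assumptions -- adjust to whatever server.py actually implements.
import requests

resp = requests.post(
    "http://localhost:5700",
    data="Hello from WaveRNN".encode("utf-8"),
)
with open("output.wav", "wb") as f:
    f.write(resp.content)  # assumed: response body is the generated .wav
```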
PyTorch implementation of DeepMind's WaveRNN model from Efficient Neural Audio Synthesis.
Ensure you have:
- Python >= 3.6
- PyTorch 1 with CUDA
Then install the rest with pip:
pip install -r requirements.txt
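Before going further, a quick sanity check (a sketch, not part of the repo) confirms PyTorch and CUDA are visible:

```python
# Verify the PyTorch >= 1 with CUDA requirement.
import torch

print(torch.__version__)          # expect 1.x
print(torch.cuda.is_available())  # expect True if CUDA is set up
```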
If you want to use TTS functionality immediately, you can simply use:
python quick_start.py
This will generate everything in the default sentences.txt file and output to a new 'quick_start' folder, where you can play back the wav files and take a look at the attention plots.
You can also use that script to generate custom TTS sentences and/or use '-u' to generate unbatched (better audio quality):
python quick_start.py -u --input_text "What will happen if I run this command?"
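To batch-generate your own lines instead, you can edit sentences.txt; a sketch, assuming the file simply holds one sentence per line:

```
The quick brown fox jumps over the lazy dog.
Each line becomes one generated wav file.
```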
Download the LJSpeech Dataset.
Edit hparams.py, point wav_path to your dataset and run:
python preprocess.py
or pass --path to preprocess.py to point it directly at the dataset.
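The relevant edit in hparams.py might look like this (a sketch; the path is illustrative, so point it at wherever you unpacked LJSpeech):

```python
# hparams.py -- illustrative location; use your own dataset path
wav_path = '/data/LJSpeech-1.1/wavs/'
```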
Here's my recommendation on what order to run things:
1 - Train Tacotron with:
python train_tacotron.py
2 - You can let that finish training, or at any point you can use:
python train_tacotron.py --force_gta
this will force Tacotron to create a GTA (ground-truth aligned) dataset even if it hasn't finished training.
3 - Train WaveRNN with:
python train_wavernn.py --gta
NB: You can always just run train_wavernn.py without --gta if you're not interested in TTS.
4 - Generate Sentences with both models using:
python gen_tacotron.py
this will generate the default sentences. If you want to generate custom sentences, you can use:
python gen_tacotron.py --input_text "this is whatever you want it to be"
And finally, you can always use --help on any of those scripts to see what options are available :)
Currently there are two pretrained models available in the /pretrained/ folder. Both are trained on LJSpeech:
- WaveRNN trained to 800k steps (400k normal mels / 400k GTA fine-tuned)
- Tacotron (r=1) trained to 196k steps
References:
- Efficient Neural Audio Synthesis
- keithito tacotron

Acknowledgements:
- Special thanks to GitHub users G-Wang and geneing