Clone of the official HiFi-GAN implementation. Audio samples are available on the official demo page.
## Pre-requisites
- Python >= 3.6
- Clone this repository.
- Install Python requirements; please refer to `requirements.txt`.
- Download and extract the LJ Speech dataset, then move all wav files to `LJSpeech-1.1/wavs`.
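The last pre-requisite step (moving all wav files into `LJSpeech-1.1/wavs`) can be sketched in Python; `move_wavs` and its arguments are illustrative helpers, not part of this repository:

```python
import shutil
from pathlib import Path

def move_wavs(src_dir, dataset_root="LJSpeech-1.1"):
    """Move every .wav found under src_dir into <dataset_root>/wavs,
    the layout train.py expects. Returns the number of files moved."""
    wavs_dir = Path(dataset_root) / "wavs"
    wavs_dir.mkdir(parents=True, exist_ok=True)
    moved = 0
    for wav in Path(src_dir).rglob("*.wav"):
        shutil.move(str(wav), str(wavs_dir / wav.name))
        moved += 1
    return moved
```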
## Training
```
python train.py --config config_v1.json
```
To train the V2 or V3 generator, replace `config_v1.json` with `config_v2.json` or `config_v3.json`.

Checkpoints and a copy of the configuration file are saved in the `cp_hifigan` directory by default. You can change the path by adding the `--checkpoint_path` option.
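`train.py` reads its hyperparameters from the JSON file passed via `--config`; switching between the V1, V2, and V3 generators only changes which file is parsed. A minimal sketch of that pattern (the keys shown here are hypothetical, not the config's actual schema):

```python
import json
from types import SimpleNamespace

def load_config(path):
    """Parse a training config JSON into attribute-style access,
    e.g. cfg.batch_size instead of cfg["batch_size"]."""
    with open(path) as f:
        data = json.load(f)
    return SimpleNamespace(**data)
```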
Validation loss during training with the V1 generator.
## Pretrained Model
You can also use the pretrained models we provide.
Download pretrained models
Details of each folder are as follows:
Folder Name | Generator | Dataset | Fine-Tuned |
---|---|---|---|
LJ_V1 | V1 | LJSpeech | No |
LJ_V2 | V2 | LJSpeech | No |
LJ_V3 | V3 | LJSpeech | No |
LJ_FT_T2_V1 | V1 | LJSpeech | Yes (Tacotron2) |
LJ_FT_T2_V2 | V2 | LJSpeech | Yes (Tacotron2) |
LJ_FT_T2_V3 | V3 | LJSpeech | Yes (Tacotron2) |
VCTK_V1 | V1 | VCTK | No |
VCTK_V2 | V2 | VCTK | No |
VCTK_V3 | V3 | VCTK | No |
UNIVERSAL_V1 | V1 | Universal | No |
We provide the universal model with discriminator weights that can be used as a base for transfer learning to other datasets.
## Fine-Tuning
- Generate mel-spectrograms in numpy format using Tacotron2 with teacher-forcing.
  The file name of each generated mel-spectrogram should match its audio file, and the extension should be `.npy`.
  Example: Audio File: `LJ001-0001.wav`, Mel-Spectrogram File: `LJ001-0001.npy`
- Create a `ft_dataset` folder and copy the generated mel-spectrogram files into it.
- Run the following command. For other command line options, please refer to the training section.
```
python train.py --fine_tuning True --config config_v1.json
```
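The naming rule in the steps above (each wav paired with a same-named `.npy`) can be checked programmatically before launching fine-tuning; the helpers below are illustrative, not part of this repository:

```python
from pathlib import Path

def mel_name_for(wav_path):
    """Map an audio file name to its expected mel-spectrogram name,
    e.g. LJ001-0001.wav -> LJ001-0001.npy."""
    return Path(wav_path).with_suffix(".npy").name

def unmatched_wavs(wav_dir, mel_dir="ft_dataset"):
    """List wav files that have no matching .npy in the fine-tuning set."""
    mels = {p.name for p in Path(mel_dir).glob("*.npy")}
    return sorted(p.name for p in Path(wav_dir).glob("*.wav")
                  if mel_name_for(p) not in mels)
```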
## Inference from wav file
- Make a `test_files` directory and copy wav files into it.
- Run the following command.
```
python inference.py --checkpoint_file [generator checkpoint file path]
```
Generated wav files are saved in `generated_files` by default. You can change the path by adding the `--output_dir` option.
## Inference for end-to-end speech synthesis
- Make a `test_mel_files` directory and copy generated mel-spectrogram files into it. You can generate mel-spectrograms using Tacotron2, Glow-TTS, and so forth.
- Run the following command.
```
python inference_e2e.py --checkpoint_file [generator checkpoint file path]
```
Generated wav files are saved in `generated_files_from_mel` by default. You can change the path by adding the `--output_dir` option.
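The steps above expect each mel-spectrogram saved in numpy format. A minimal sketch of writing one into `test_mel_files` (the `save_mel` helper and the 80-band shape in the usage below are assumptions for illustration, not part of this repository):

```python
import numpy as np
from pathlib import Path

def save_mel(mel, name, out_dir="test_mel_files"):
    """Save a mel-spectrogram array as <name>.npy under out_dir,
    the format inference_e2e.py reads. Returns the written path."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    path = out / f"{name}.npy"
    np.save(path, mel)
    return path
```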
## Acknowledgements
We referred to WaveGlow, MelGAN, and Tacotron2 to implement this.