HiFi-GAN: Generative Adversarial Networks for Efficient and High Fidelity Speech Synthesis

raccoonML hifigan demo

This repo adds a GUI to the awesome neural vocoder hifi-gan, making it easier to test the quality of pretrained models. Only inference is supported. Please download a release for the best experience.

The demo makes use of my audiotools and voicebox projects. It is part of a larger project to integrate the hifigan vocoder with MLRTVC (see issue #20).

Announcements

Windows setup

  1. Install Python 3.7+ if you don't have it already. GUIDE: Installing Python on Windows.
  2. Download hifigan-demo.zip from the MLRTVC-v1 release.
  3. Extract the zip file.
  4. Create and activate a Python virtual environment. GUIDE: Python virtual environments in Windows
       cd C:\path\to\hifigan-demo
       python -m venv venv
       venv\Scripts\activate.bat
  5. Install dependencies
       pip install --upgrade pip
       pip install torch -f https://download.pytorch.org/whl/torch_stable.html
       pip install -r requirements.txt
  6. Run the toolbox
       python run_voicebox.py
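
If the toolbox fails to start, it can help to confirm that the virtual environment is active and that torch was actually installed into it. The following is a minimal sketch and not part of this repo: the function name `check_environment` and the choice of packages to check are assumptions made for illustration. It uses only the Python standard library.

```python
import importlib.util
import sys


def check_environment(min_python=(3, 7), packages=("torch",)):
    """Return a dict of simple pass/fail checks for the current interpreter.

    Hypothetical helper for illustration; the package list is an assumption.
    """
    report = {
        # The setup steps ask for Python 3.7 or newer.
        "python_ok": sys.version_info[:2] >= tuple(min_python),
        # Inside a venv, sys.prefix differs from sys.base_prefix.
        "in_venv": sys.prefix != sys.base_prefix,
    }
    for name in packages:
        # find_spec returns None when a top-level package cannot be located.
        report[name] = importlib.util.find_spec(name) is not None
    return report


if __name__ == "__main__":
    for key, ok in check_environment().items():
        print(f"{key}: {'yes' if ok else 'no'}")
```

Save it under any name inside the extracted folder and run it with the same `python` that was used to create the virtual environment. A "no" next to `torch` suggests that the dependency install step should be repeated with the environment activated.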

Usage

Load an audio file. A spectrogram is computed using the settings in hparams.py and displayed in the top row. At the same time, Griffin-Lim is used to vocode the spectrogram back to audio, and the spectrogram of the resulting audio is displayed in the middle row.
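
For readers unfamiliar with Griffin-Lim, the idea can be sketched in a few lines of NumPy: start from a random phase, then alternate between inverting the spectrogram and re-estimating the phase from the result. This is an illustrative sketch only, not the demo's code. It works on a plain magnitude spectrogram, and the FFT size, hop length, and iteration count are assumptions; the demo takes its real settings (and spectrogram type) from hparams.py, which may differ.

```python
import numpy as np


def stft(signal, n_fft=512, hop=128):
    """Short-time Fourier transform with a Hann window. Returns (frames, bins)."""
    window = np.hanning(n_fft)
    n_frames = 1 + (len(signal) - n_fft) // hop
    frames = np.stack(
        [signal[i * hop : i * hop + n_fft] * window for i in range(n_frames)]
    )
    return np.fft.rfft(frames, axis=1)


def istft(spec, n_fft=512, hop=128):
    """Inverse STFT by windowed overlap-add, normalised by the summed squared window."""
    window = np.hanning(n_fft)
    n_frames = spec.shape[0]
    length = n_fft + hop * (n_frames - 1)
    out = np.zeros(length)
    norm = np.zeros(length)
    frames = np.fft.irfft(spec, n=n_fft, axis=1)
    for i in range(n_frames):
        start = i * hop
        out[start : start + n_fft] += frames[i] * window
        norm[start : start + n_fft] += window ** 2
    # Guard against division by zero where the window sum vanishes (the edges).
    return out / np.maximum(norm, 1e-8)


def griffin_lim(magnitude, n_iter=32, n_fft=512, hop=128, seed=0):
    """Estimate audio from a magnitude spectrogram by iterative phase refinement."""
    rng = np.random.default_rng(seed)
    # Start from a random phase, since the spectrogram stores magnitude only.
    phase = np.exp(2j * np.pi * rng.random(magnitude.shape))
    for _ in range(n_iter):
        audio = istft(magnitude * phase, n_fft, hop)
        # Keep the known magnitude, adopt the phase of the re-analysed audio.
        phase = np.exp(1j * np.angle(stft(audio, n_fft, hop)))
    return istft(magnitude * phase, n_fft, hop)
```

Calling `griffin_lim(np.abs(stft(audio)))` on a mono float array returns a waveform whose spectrogram approximates the input magnitudes. Because the phase is only estimated, the result usually sounds rougher than the original, which is the baseline the middle row lets you compare against.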

When the "vocode" button is pressed, hifigan inference is performed on the top spectrogram. The resulting audio can be played back, and its spectrogram is displayed in the bottom row.

Screenshot

Credits

Thanks to my Patreon supporters for making this work possible. Learn how to support me here.
