This repo adds a GUI to the awesome hifi-gan neural vocoder, making it easier to test the quality of pretrained models. Only inference is supported. Please download a release for the best experience.
The demo makes use of my audiotools and voicebox projects. It is part of a larger effort to integrate the hifi-gan vocoder with MLRTVC (see issue #20).
- 1/5/2022: Generator-v1, Generator-v2, and Generator-v3 released. These are compatible with the pretrained models provided by the hifi-gan authors.
- 1/4/2022: MLRTVC-v1 pretrained model released. It is compatible with the audio settings used in the RTVC and Multi-Language RTVC repos. The hifi-gan model is trained to only 150,000 steps at this time.
- Install Python 3.7+ if you don't have it already. GUIDE: Installing Python on Windows.
- Download hifigan-demo.zip from the MLRTVC-v1 release.
- Extract the zip file.
- Create and activate a Python virtual environment. GUIDE: Python virtual environments in Windows
  ```
  cd C:\path\to\hifigan-demo
  python -m venv venv
  venv\Scripts\activate.bat
  ```
- Install dependencies:
  ```
  pip install --upgrade pip
  pip install torch -f https://download.pytorch.org/whl/torch_stable.html
  pip install -r requirements.txt
  ```
- Run the toolbox:
  ```
  python run_voicebox.py
  ```
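If the toolbox fails to start, it is worth confirming the interpreter meets the version requirement from the first install step. A small optional check (not part of the repo):

```python
import sys

# The demo targets Python 3.7 or newer, per the install instructions.
ok = sys.version_info >= (3, 7)
print("Python", ".".join(map(str, sys.version_info[:3])), "-", "OK" if ok else "too old")
```

Run this inside the activated virtual environment to make sure the venv's interpreter, not the system one, is being checked.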
Load an audio file. A spectrogram is created using the settings in hparams.py and displayed in the top row. At the same time, Griffin-Lim is used to vocode the spectrogram back to audio, and the spectrogram of that reconstruction is displayed in the middle row.
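The Griffin-Lim step can be illustrated with a minimal numpy/scipy sketch. This is an approximation for intuition, not the code the toolbox uses, and the STFT parameters here are arbitrary rather than the values from hparams.py:

```python
import numpy as np
from scipy.signal import stft, istft

def griffin_lim(mag, n_iter=32, nperseg=1024, noverlap=768):
    """Estimate a waveform from a magnitude-only spectrogram by
    alternating between time and frequency domains (Griffin-Lim)."""
    rng = np.random.default_rng(0)
    # Start from random phase and iteratively refine it.
    phase = np.exp(2j * np.pi * rng.random(mag.shape))
    for _ in range(n_iter):
        _, x = istft(mag * phase, nperseg=nperseg, noverlap=noverlap)
        _, _, spec = stft(x, nperseg=nperseg, noverlap=noverlap)
        phase = np.exp(1j * np.angle(spec[:, : mag.shape[1]]))
    _, x = istft(mag * phase, nperseg=nperseg, noverlap=noverlap)
    return x

# Demo: drop the phase of a 440 Hz test tone, then recover a waveform.
sr = 22050
t = np.arange(8192) / sr
y = np.sin(2 * np.pi * 440.0 * t)
_, _, S = stft(y, nperseg=1024, noverlap=768)
recovered = griffin_lim(np.abs(S))
```

A neural vocoder like hifi-gan replaces this iterative phase estimation with a learned model, which is why the middle (Griffin-Lim) and bottom (hifi-gan) rows of the GUI typically sound so different.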
When the "vocode" button is pressed, hifigan inference is performed on the top spectrogram. The resulting audio from hifigan can be played back. A spectrogram of hifigan audio is displayed in the bottom row.
Thanks to my Patreon supporters for making this work possible. Learn how to support me here.