In resynth.py we give a simple demonstration of audio resynthesis via HuBERT-based discrete pseudo-units. The code closely follows the unit2speech module of GSLM.
Below is an example of running the script:
python resynth.py --input test_input.wav --output=test_output.wav --vocab_size=100 --decoder_steps=500
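Because the script expects 16 kHz input, it can help to check a file before running the command above. The following is a minimal sketch using only Python's standard `wave` module. The helper name `has_expected_sample_rate` is hypothetical and not part of resynth.py, and the check only handles uncompressed PCM WAV files.

```python
import wave

# Sample rate the script expects for its input audio.
EXPECTED_SAMPLE_RATE = 16_000


def has_expected_sample_rate(path, expected=EXPECTED_SAMPLE_RATE):
    """Return True if the PCM WAV file at `path` is sampled at `expected` Hz.

    Hypothetical helper for illustration; resynth.py may validate its
    input differently (or not at all).
    """
    with wave.open(path, "rb") as wav_file:
        return wav_file.getframerate() == expected
```

A file that fails this check would need to be resampled to 16 kHz with an audio tool of your choice before being passed to `--input`.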
resynth.py supports the following command-line arguments:
--dense_model_name: name of the dense representation model to use (supported: hubert-base-ls960 and cpc-big-ll6k);
--input: the input audio file (must have a sample rate of 16 kHz);
--output: the output file name;
--vocab_size: the size of the quantization vocabulary to use (one of 50, 100, 200);
--decoder_steps: sets the maximal duration of the produced audio.
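
To make the interface above concrete, here is a hedged sketch of how these arguments could be declared with `argparse`. It is a reconstruction from the descriptions in this README, not the actual parser in resynth.py: the function name `build_parser`, the defaults, and which flags are required are all assumptions.

```python
import argparse


def build_parser():
    """Build a parser mirroring the documented flags (illustrative only)."""
    parser = argparse.ArgumentParser(
        description="Audio resynthesis via discrete pseudo-units (sketch)"
    )
    parser.add_argument(
        "--dense_model_name",
        choices=["hubert-base-ls960", "cpc-big-ll6k"],
        default="hubert-base-ls960",  # assumed default, not confirmed
        help="dense representation model to use",
    )
    parser.add_argument(
        "--input", required=True,  # assumed to be required
        help="input audio file (16 kHz sample rate)",
    )
    parser.add_argument(
        "--output", required=True,  # assumed to be required
        help="output file name",
    )
    parser.add_argument(
        "--vocab_size", type=int, choices=[50, 100, 200],
        default=100,  # assumed default, not confirmed
        help="size of the quantization vocabulary",
    )
    parser.add_argument(
        "--decoder_steps", type=int,
        default=500,  # assumed default, not confirmed
        help="limits the maximal duration of the produced audio",
    )
    return parser


# Parse the same arguments as the example command in this README.
args = build_parser().parse_args([
    "--input", "test_input.wav",
    "--output=test_output.wav",
    "--vocab_size=100",
    "--decoder_steps=500",
])
```

Declaring `choices` for `--dense_model_name` and `--vocab_size` lets `argparse` reject unsupported values early, before any model is loaded.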