## Overview

This directory contains a simple PyTorch-to-GGUF conversion script for the Parler TTS Mini Model or the Parler TTS Large Model.

Please note that the model encoding pattern used here is extremely naive and subject to further development (especially in order to align it with the GGUF patterns used in llama.cpp and whisper.cpp).

## Requirements

To run the installation and conversion scripts you will need python3 and pip3 installed locally.

## Installation

All requirements can be installed via pip:

```bash
pip3 install -r requirements.txt
```

## Usage

The GGUF conversion script can be run via the convert_parler_tts_to_gguf file locally like so:

```bash
python3 ./convert_parler_tts_to_gguf --save-path ./parler-tts-large.gguf --voice-prompt "female voice" --large-model
```

The command accepts the following flags:

- `--save-path`, which describes where to save the GGUF model file
- `--large-model`, which when passed encodes Parler-TTS Large (rather than Mini)
- `--voice-prompt`, a sentence or statement that describes how the model's voice should sound at generation time
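As a rough illustration of that interface, the flags above could be parsed with `argparse` like so (this is a hedged sketch of the command line only, not the actual implementation inside the script):

```python
import argparse

# Illustrative sketch of the converter's command-line interface;
# the real convert_parler_tts_to_gguf script may differ in detail.
parser = argparse.ArgumentParser(description="Convert Parler TTS weights to GGUF.")
parser.add_argument("--save-path", type=str, required=True,
                    help="Where to write the .gguf model file.")
parser.add_argument("--voice-prompt", type=str, default="female voice",
                    help="Text describing how the generated voice should sound.")
parser.add_argument("--large-model", action="store_true",
                    help="Convert Parler-TTS Large instead of Mini.")

# Parse the example invocation from the Usage section above.
args = parser.parse_args(["--save-path", "./parler-tts-large.gguf",
                          "--voice-prompt", "female voice", "--large-model"])
print(args.save_path, args.large_model)  # → ./parler-tts-large.gguf True
```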

## Voice Prompt

The Parler TTS model is trained to alter how it generates audio tokens by cross-attending against a text prompt encoded by google/flan-t5-large, a T5 encoder model. To avoid this encoding step on the ggml side, this converter generates the prompt's associated hidden states ahead of time and encodes them directly into the GGUF model file.
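The precomputation idea can be sketched with a toy stand-in encoder (the real script uses google/flan-t5-large; the `TinyEncoder` below is a hypothetical placeholder, used only to show that the prompt's hidden states are computed once at conversion time and then stored as a plain tensor):

```python
import torch
import torch.nn as nn

# Hypothetical stand-in for the T5 encoder: any module that maps
# token ids to a [batch, seq_len, hidden_dim] tensor of hidden states.
class TinyEncoder(nn.Module):
    def __init__(self, vocab_size=128, hidden_dim=16):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, hidden_dim)
        layer = nn.TransformerEncoderLayer(d_model=hidden_dim, nhead=4,
                                           batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=1)

    def forward(self, token_ids):
        return self.encoder(self.embed(token_ids))

# Precompute the voice prompt's hidden states once, at conversion time.
token_ids = torch.tensor([[5, 17, 42, 9]])  # stand-in for a tokenized prompt
with torch.no_grad():
    hidden_states = TinyEncoder()(token_ids)

# The converter would then serialize `hidden_states` as a tensor inside the
# GGUF file, so inference never has to run the T5 encoder itself.
print(hidden_states.shape)
```

The trade-off is that the prompt is frozen into the model file; changing it requires re-running the converter, which is what the Conditional Voice Prompt section below addresses.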

## Conditional Voice Prompt

If you would like to alter the voice prompt used to generate with Parler TTS on the fly, you will need to prepare the text encoder model (a T5 encoder) in the GGUF format. This can be accomplished by running convert_t5_encoder_to_gguf from this directory:

```bash
python3 ./convert_t5_encoder_to_gguf --save-path ./t5-encoder-large.gguf --large-model
```

To use this model alongside the Parler TTS model, see the CLI readme for information on conditional generation.