docs: update readme
thinhlpg committed Apr 7, 2024
1 parent deab81e commit 089f2b3
Showing 3 changed files with 54 additions and 1 deletion.
55 changes: 54 additions & 1 deletion README.md
@@ -1 +1,54 @@
# vixtts-demo
# viXTTS Demo

viXTTS is a text-to-speech tool that offers voice cloning in Vietnamese and other languages. The model is a fine-tuned version of [XTTS-v2.0.3](https://huggingface.co/coqui/XTTS-v2), trained on the [viVoice](https://huggingface.co/datasets/capleaf/viVoice) dataset. This repository is primarily intended for inference.

The model can be accessed at: [viXTTS on Hugging Face](https://huggingface.co/capleaf/viXTTS)

## Online Usage (Recommended)

For a quick demonstration, please refer to [this notebook](./viXTTS_Demo.ipynb) on Google Colab.
![viXTTS Colab Demo](assets/vixtts_colab.png)
## Local Usage

This code is designed to run on Ubuntu or WSL2. macOS and Windows are not supported at the moment (support might be added later).
![viXTTS Gradio Demo](assets/vixtts_gradio_ui.png)
### Hardware Recommendations

- At least 10GB of free disk space
- At least 16GB of RAM
- **Nvidia GPU** with a minimum of 4GB of VRAM
- By default, the model uses the GPU. Without a GPU, it falls back to the CPU and runs much more slowly.
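
The recommendations above can be checked before the first run with a short script. This is a minimal sketch, not part of the repository: `meets_minimum_gib` is a hypothetical helper, the script assumes a Linux system that exposes `/proc/meminfo` (as Ubuntu and WSL2 do), and it only reports values rather than blocking anything.

```bash
#!/usr/bin/env bash
# Hypothetical pre-flight check; not part of the vixtts-demo repository.

# Succeeds when a size given in KiB is at least the given number of GiB.
meets_minimum_gib() {
    local size_kib="$1" min_gib="$2"
    [ "$size_kib" -ge $(( min_gib * 1024 * 1024 )) ]
}

# Free space (KiB) on the filesystem that holds the current directory.
free_disk_kib="$(df -Pk . | awk 'NR == 2 { print $4 }')"
if meets_minimum_gib "${free_disk_kib:-0}" 10; then
    echo "disk: at least 10 GiB free"
else
    echo "disk: less than 10 GiB free"
fi

# Total RAM as reported by the kernel. The figure is usually a little below
# the installed amount, so it is printed instead of being compared to 16.
if [ -r /proc/meminfo ]; then
    total_ram_kib="$(awk '/^MemTotal:/ { print $2 }' /proc/meminfo)"
    echo "ram: $(( ${total_ram_kib:-0} / 1024 / 1024 )) GiB reported (16 GB recommended)"
fi

# nvidia-smi is installed with the Nvidia driver; list each GPU and its VRAM.
if command -v nvidia-smi >/dev/null 2>&1; then
    nvidia-smi --query-gpu=name,memory.total --format=csv,noheader \
        || echo "gpu: nvidia-smi is installed but could not query a GPU"
else
    echo "gpu: nvidia-smi not found; inference would fall back to the CPU"
fi
```

Run it from the directory where the repository will be cloned, since the disk check looks at the filesystem of the current directory.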

### Required Software

- Git
- Python >= 3.9 and <= 3.11. The default version is 3.11; to use a different one, change the Python version in the `run.sh` file.
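
To confirm that the interpreter on your `PATH` falls inside this range, something like the following could be used. It is a sketch under assumptions: `python_version_supported` is a hypothetical helper that is not part of the repository, and the script checks `python3`, which may differ from the interpreter that `run.sh` selects.

```bash
#!/usr/bin/env bash
# Hypothetical helper; not part of the vixtts-demo repository.
# Returns 0 when a "major.minor[.patch]" version lies between 3.9 and 3.11.
python_version_supported() {
    local version="$1"
    local major minor
    major="${version%%.*}"   # text before the first dot
    minor="${version#*.}"    # drop the "major." prefix
    minor="${minor%%.*}"     # keep only the minor component
    [ "$major" = "3" ] \
        && [ "$minor" -ge 9 ] 2>/dev/null \
        && [ "$minor" -le 11 ] 2>/dev/null
}

# Report on the python3 found on PATH, if there is one.
if command -v python3 >/dev/null 2>&1; then
    found="$(python3 -c 'import platform; print(platform.python_version())')"
    if python_version_supported "$found"; then
        echo "Python $found is in the supported range"
    else
        echo "Python $found is outside the supported range (3.9 to 3.11)"
    fi
else
    echo "python3 not found on PATH"
fi
```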

### Usage

```bash
git clone https://github.com/thinhlpg/vixtts-demo
cd vixtts-demo
./run.sh
```
1. Run `run.sh` (dependencies are installed automatically on the first run).
2. Open the Gradio demo link.
3. Load the model and wait for loading to finish.
4. Run inference and enjoy 🤗
5. The result is saved in `output/`.
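
After a few runs, the most recent result can be located from the shell. The snippet below is only a sketch: `latest_output` is a hypothetical helper that is not part of the repository, and it assumes that `output/` sits in the directory you run it from and that results are written there as regular files.

```bash
#!/usr/bin/env bash
# Hypothetical helper; not part of the vixtts-demo repository.
# Prints the path of the most recently modified entry in a directory
# (default: output) and fails when the directory is missing or empty.
latest_output() {
    local dir="${1:-output}"
    local newest
    newest="$(ls -t "$dir" 2>/dev/null | head -n 1)"   # ls -t lists newest first
    [ -n "$newest" ] && printf '%s\n' "$dir/$newest"
}

latest_output output || echo "no results found in output/ yet"
```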

## Acknowledgements

We would like to thank all the libraries and resources that played a role in the development of this demo, especially:

- [Coqui TTS](https://github.com/coqui-ai/TTS) for the XTTS foundation model and inference code
- [Vinorm](https://github.com/v-nhandt21/Vinorm) and [Underthesea](https://github.com/undertheseanlp/underthesea) for Vietnamese text normalization
- [DeepSpeed](https://github.com/microsoft/DeepSpeed) for fast inference
- [Hugging Face Hub](https://huggingface.co/) for hosting the model
- [Gradio](https://www.gradio.app/) for the web UI

## Contact

- You can message me directly on Facebook: <https://fb.com/thinhlpg/> (preferred 🤗)
- GitHub: <https://github.com/thinhlpg>
- Email: <[email protected]> or <[email protected]>
Binary file added assets/vixtts_colab.png
Binary file added assets/vixtts_gradio_ui.png