domain

Prerequisite

For inference:

GRAM >= 2 GiB

RAM >= 16 GiB

CUDA or CPU

For training:

GRAM >= 6 GiB (1 batch size or so)

RAM >= 24 GiB

CUDA supported (AMD ROCm : linux only)

For Quick Start

Clone the repository

 git clone https://www.modelscope.cn/studios/SpicyqSama007/Bert-VITS2-v2.3-clap.git

Configure necessary environments

# make sure that you've installed anaconda/miniconda, CUDA (tool kit),
# minimum GRAM >= 2GB
# (python virtual env is ok too, similiar)
# here is the example for miniconda

# open 'miniconda' terminal
# we need python>=3.10 for compatibility
conda create -n vits python=3.10
conda activate vits
cd /d "{path of the project e.g. 'D:/Py/Bert-VITS2-v2.3-clap' }"

# install torch first (CUDA ver >= 12.1)
pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu121

# then the requirement
pip install -r requirements.txt

# extra for asr : audio wave -> corresponding transcription
pip install funasr modelscope openi

Configure yaml file

# auto-generate a config file
python config.py
# then , a file name 'config.yml' generated at the root of the project
mkdir -p Data/mix/models
# your model
mv G_xxx.pth Data/mix/models
mv config.json Data/mix/

# or manually open 'config.yml'
vi config.yml

# after `vi config.yml` we get into it
...

# 不填或者填空则路径为相对于项目根目录的路径

# fill the path (relatively to the root)
dataset_path: "Data/mix"
# it means the dataset path is in '{...}/Bert-VITS2-v2.3-clap/Data/mix'

...

# webui webui配置
# 注意， “:” 后需要加空格
webui:
# 推理设备 device
device: "cuda"
# 模型路径 path to the model
model: "models/G_xxxx.pth"
# 配置文件路径
config_path: "config.json"
# 端口号
port: 7860
# 是否公开部署，对外网开放
share: false
# 是否开启debug模式
debug: false
# 语种识别库，可选langid, fastlid
language_identification_library: "langid"

After configuration, let's start webui.py

python webui.py

For Quick Finetuning

Same as the first 2-steps in 'For Quick Start'

if you understand Chinese, you can use my UI, as what the name literally means

python all_process.py

Or :

Configure yaml file

# after `vi config.yml` we get into it
# or manually open yml file

...

# 不填或者填空则路径为相对于项目根目录的路径
# fill the path (relatively to the root)
dataset_path: "Data/mix"
# it means the dataset path is in '{...}/Bert-VITS2-v2.3-clap/Data/mix'

...


resample:
  sampling_rate: 44100
  in_dir: "audios/raw" # relatively in '/datasetPath/in_dir'
  out_dir: "audios/wavs"



preprocess_text:
  # Format for single line in the list: 
  # {wav_path}|{speaker_name}|{language}|{text}。
  transcription_path: "filelists/{custom}.list"
  cleaned_path: ""
  train_path: "filelists/train.list"
  val_path: "filelists/val.list"
  config_path: "config.json"
  val_per_lang: 4
  max_val_total: 12
  clean: true

# train 训练配置
# 注意， “:” 后需要加空格
train_ms:

  model: "models"
  # 配置文件路径
  config_path: "config.json"
  # 训练使用的worker，不建议超过CPU核心数
  num_workers: 16
  # 关闭此项可以节约接近50%的磁盘空间，但是可能导致实际训练速度变慢和更高的CPU使用率。
  spec_cache: True
  # 保存的检查点数量，多于此数目的权重会被删除来节省空间。
  keep_ckpts: 8

After editing , save and quit.

Then, we should configure the training 'config.json'

# or manually open
vi Data/mix/config.json
# if there is no config.json, copy one piece from 'configs/config.json'
cp configs/config.json Data/mix/

# pay attention to the following part

 "data": {
    "training_files": "Data/mix/filelists/train.list",
    "validation_files": "Data/mix/filelists/val.list",
    "max_wav_value": 32768.0,
    ...
 }

Then, save and quit.

Come back to the Terminal ( it's fast and easy), type the following command.

# your source audio files (only .wav)
mv "{your_audios_folder}" Data/mix/audios/raw
# "{custom}.list" contains your transcriptions of the wav files
mkdir -p Data/mix/filelists
mv "{custom}.list"  Data/mix/filelists
python resample.py
python preprocess_text.py
python bert_gen.py
python clap_gen.py

torchrun \
    --nnodes=1\
    --nproc_per_node=2\
    train_ms.py

If all normal and successful, we can get the trained models in Data/mix/model/G_xxxx.pth

Tips:
1. You can terminate the training process at any time by yourself. (after 'Saving models...')
2. Most of the problems take place with 'incorrect path '

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

README.md

README.md

Prerequisite

For Quick Start

For Quick Finetuning

Files

README.md

Latest commit

History

README.md

File metadata and controls

Prerequisite

For Quick Start

For Quick Finetuning