
An unofficial PyTorch implementation of Mix-Phoneme-Bert (Mixed-Phoneme BERT: Improving BERT with Mixed Phoneme and Sup-Phoneme Representations for Text to Speech)

[Figure: model architecture]

Installation

This implementation is based on the fairseq library.

cd fairseq

Then follow the fairseq installation instructions to install it.
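
Per the fairseq documentation, a from-source setup is typically an editable pip install run from inside the fairseq directory (assuming PyTorch is already available in your environment):

pip install --editable ./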

Data Preparation

Prepare your data in the same format as /data/data.txt, then split it into train, test, and dev sets and run the preprocessing needed for BPE learning:

bash prepare_data.sh
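
For orientation only, a hypothetical data.txt might hold one space-separated phoneme sequence per line; the lines below are made-up ARPAbet examples, so check the /data/data.txt shipped with this repo for the exact format:

HH AH0 L OW1 W ER1 L D
DH IH1 S IH1 Z AH0 T EH1 S T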

Learn the BPE vocabulary (you can change the vocabulary size):

bash bpe.sh
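
The internals of bpe.sh are not reproduced here; as a sketch of the step it performs, learning and applying a BPE vocabulary with subword-nmt (one common tool, not necessarily the one the script uses) looks like this, with the vocabulary size controlled by the -s flag:

subword-nmt learn-bpe -s 5000 < train.txt > bpe.codes    # learn 5000 merge operations
subword-nmt apply-bpe -c bpe.codes < train.txt > train.bpe    # segment phonemes into sup-phoneme units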

Prepare the mmap .bin files for training:

bash preprocess.sh
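
This step typically wraps fairseq-preprocess; a minimal sketch follows (the file names are assumptions, so adjust them to your splits):

fairseq-preprocess \
    --only-source \
    --trainpref train.bpe \
    --validpref dev.bpe \
    --testpref test.bpe \
    --destdir data-bin \
    --workers 8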

Training

bash train.sh
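
train.sh presumably wraps fairseq's hydra entry point; a minimal sketch of launching RoBERTa-style pretraining with the base config (the data-bin path is an assumption):

DATA_DIR=data-bin
fairseq-hydra-train -m --config-dir fairseq/examples/roberta/config/pretraining \
    --config-name base task.data=$DATA_DIR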

Note: The learning rate and batch size are tightly connected and need to be adjusted together. We generally recommend increasing the learning rate as you increase the batch size according to the following table (although it's also dataset dependent, so don't rely on the following values too closely):

batch size | peak learning rate
-----------|-------------------
256        | 0.0001
2048       | 0.0005
8192       | 0.0007

You can set these parameters in fairseq/examples/roberta/config/pretraining/base.yaml.
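
In fairseq's stock base.yaml the relevant knobs look roughly like this (an illustrative excerpt, not the full file; the effective batch size is batch_size × update_freq × number of GPUs):

optimization:
  lr: [0.0005]          # peak learning rate
  update_freq: [16]     # gradient accumulation steps

dataset:
  batch_size: 16        # sequences per GPU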

Curves

Training on 1 billion lines of text data with 8 V100 32GB GPUs; still running.

[Training curves: loss, loss_phoneme, loss_sup_phoneme, acc_phoneme, acc_sup_phoneme]

Citations

@article{zhang2022Mix-PB,
  author = {Guangyan Zhang and Kaitao Song and Xu Tan and Daxin Tan and Yuzi Yan and Yanqing Liu and Gang Wang and Wei Zhou and Tao Qin and Tan Lee and Sheng Zhao},
  title = {Mixed-Phoneme BERT: Improving BERT with Mixed Phoneme and Sup-Phoneme Representations for Text to Speech},
  journal = {arXiv preprint arXiv:2203.17190},
  url = {https://arxiv.org/abs/2203.17190},
  year = {2022}
}
