LJSpeech Text-to-Speech recipe

Written by Tomoki Hayashi @ Nagoya University (2018/09/09)

tts1 recipe

The tts1 recipe is based on the Tacotron2 [1] spectrogram prediction network, without the WaveNet vocoder. Tacotron2 generates log mel-filterbank features from text; these are then converted to a linear spectrogram using the inverse mel-basis. Finally, the phase components are recovered with the Griffin-Lim algorithm.
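The post-processing described above (log mel-filterbank to linear spectrogram via the inverse mel-basis, then Griffin-Lim phase recovery) can be sketched in plain NumPy. This is only an illustration, not the recipe's actual implementation: the function names (`mel_basis`, `logmel_to_linear`, `griffin_lim`), the simple triangular filterbank, the base-10 log, the pseudo-inverse as the "inverse mel-basis", and the FFT settings are all assumptions made for the example, and random features stand in for real Tacotron2 output.

```python
import numpy as np

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_basis(sr, n_fft, n_mels):
    """Triangular mel filterbank, shape (n_mels, n_fft // 2 + 1). Hypothetical helper."""
    fft_freqs = np.linspace(0.0, sr / 2.0, n_fft // 2 + 1)
    edges = mel_to_hz(np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2.0), n_mels + 2))
    basis = np.zeros((n_mels, fft_freqs.size))
    for i in range(n_mels):
        lo, mid, hi = edges[i], edges[i + 1], edges[i + 2]
        rising = (fft_freqs - lo) / (mid - lo)
        falling = (hi - fft_freqs) / (hi - mid)
        basis[i] = np.maximum(0.0, np.minimum(rising, falling))
    return basis

def logmel_to_linear(logmel, basis):
    """Map log10-mel frames (T, n_mels) to linear magnitudes (T, n_bins)."""
    inv_basis = np.linalg.pinv(basis)              # assumed "inverse mel-basis"
    linear = np.power(10.0, logmel) @ inv_basis.T  # undo the log, then project back
    return np.maximum(linear, 1e-10)               # magnitudes must stay positive

def stft(x, n_fft, hop, window):
    frames = [x[s:s + n_fft] * window for s in range(0, len(x) - n_fft + 1, hop)]
    return np.fft.rfft(np.array(frames), axis=1)

def istft(spec, n_fft, hop, window):
    frames = np.fft.irfft(spec, n=n_fft, axis=1) * window
    length = n_fft + hop * (len(frames) - 1)
    x, norm = np.zeros(length), np.zeros(length)
    for t, frame in enumerate(frames):             # weighted overlap-add
        x[t * hop:t * hop + n_fft] += frame
        norm[t * hop:t * hop + n_fft] += window ** 2
    return x / np.maximum(norm, 1e-8)

def griffin_lim(magnitude, n_fft, hop, n_iters=32, seed=0):
    """Estimate a waveform whose STFT magnitude approximates `magnitude`."""
    rng = np.random.default_rng(seed)
    window = np.hanning(n_fft)
    phase = np.exp(2j * np.pi * rng.random(magnitude.shape))  # random initial phase
    for _ in range(n_iters):
        x = istft(magnitude * phase, n_fft, hop, window)
        spec = stft(x, n_fft, hop, window)
        phase = spec / np.maximum(np.abs(spec), 1e-8)         # keep phase, drop magnitude
    return istft(magnitude * phase, n_fft, hop, window)

# Illustrative settings; random features stand in for the network output.
sr, n_fft, hop, n_mels = 22050, 1024, 256, 80
basis = mel_basis(sr, n_fft, n_mels)
logmel = np.log10(np.maximum(1e-10, np.random.default_rng(0).random((20, n_mels))))
linear = logmel_to_linear(logmel, basis)
wav = griffin_lim(linear, n_fft, hop, n_iters=8)
print(basis.shape, linear.shape, wav.shape)
```

Because the mel filterbank is not square, the pseudo-inverse gives only an approximate linear spectrogram; that loss of detail is what tts2 tries to reduce by learning the conversion instead.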

(2019/06/16) We also support TTS-Transformer [3].

(2019/06/17) We also support the Feed-forward Transformer [4].

tts2 recipe

The tts2 recipe is based on Tacotron2's spectrogram prediction network [1] and Tacotron's CBHG module [2]. Instead of applying the inverse mel-basis, it uses the CBHG module to convert the log mel-filterbank features to a linear spectrogram. Phase recovery is the same as in tts1.

References

[1] Shen, Jonathan, et al. "Natural TTS synthesis by conditioning WaveNet on mel spectrogram predictions." arXiv preprint arXiv:1712.05884 (2017).
[2] Wang, Yuxuan, et al. "Tacotron: Towards end-to-end speech synthesis." arXiv preprint arXiv:1703.10135 (2017).
[3] Li, Naihan, et al. "Close to human quality TTS with transformer." arXiv preprint arXiv:1809.08895 (2018).
[4] Ren, Yi, et al. "FastSpeech: Fast, Robust and Controllable Text to Speech." arXiv preprint arXiv:1905.09263 (2019).