 The Mobvoi dataset is a ~67-hour Chinese wake word corpus
 covering 523 speakers. It is currently not publicly available.
 The wake word is "Hi Xiaowen" (in Pinyin).
 Each speaker’s collection includes positive and negative utterances
 recorded at different speaker-to-microphone distances and different
 signal-to-noise ratios (SNRs), with noise from typical home environments.
 The dataset is provided by Mobvoi Inc.

 The recipe is in v1/

 The E2E LF-MMI recipe does not require any prior alignments for LF-MMI
 training, so the alignments are not fixed in advance and remain flexible
 during training. It can optionally be followed by a regular LF-MMI training
 stage to further improve performance.