aishell |
AISHELL-ASR0009-OS1 Open Source Mandarin Speech Corpus |
ASR |
ZH |
http://www.aishelltech.com/kysjcp |
|
ami |
The AMI Meeting Corpus |
ASR |
EN |
http://groups.inf.ed.ac.uk/ami/corpus/ |
|
an4 |
CMU AN4 database |
ASR/TTS |
EN |
http://www.speech.cs.cmu.edu/databases/an4/ |
|
arctic |
CMU ARCTIC databases |
TTS |
EN |
http://www.festvox.org/cmu_arctic/ |
|
aurora4 |
Aurora-4 database |
ASR |
EN |
http://aurora.hsnr.de/aurora-4.html |
|
babel |
IARPA Babel corpus |
ASR |
~20 Languages |
https://www.iarpa.gov/index.php/research-programs/babel |
|
blizzard_2017 |
Blizzard Challenge 2017 |
TTS |
EN |
https://www.synsig.org/index.php/Blizzard_Challenge_2017 |
|
chime4 |
The 4th CHiME Speech Separation and Recognition Challenge |
ASR/Multichannel ASR |
EN |
http://spandh.dcs.shef.ac.uk/chime_challenge/chime2016/ |
|
chime5 |
The 5th CHiME Speech Separation and Recognition Challenge |
ASR |
EN |
http://spandh.dcs.shef.ac.uk/chime_challenge/ |
|
cmu_wilderness |
CMU Wilderness Multilingual Speech Dataset |
Multilingual ASR |
~100 Languages |
https://github.com/festvox/datasets-CMU_Wilderness |
|
commonvoice |
The Mozilla Common Voice |
ASR |
13 Languages |
https://voice.mozilla.org/datasets |
|
csj |
Corpus of Spontaneous Japanese |
ASR |
JP |
https://pj.ninjal.ac.jp/corpus_center/csj/en/ |
|
csmsc |
Chinese Standard Mandarin Speech Corpus |
TTS |
ZH |
https://www.data-baker.com/open_source.html |
|
dirha_wsj |
Distant-speech Interaction for Robust Home Applications |
Multi-Array ASR |
EN |
https://dirha.fbk.eu/, https://github.com/SHINE-FBK/DIRHA_English_wsj |
|
fisher_callhome_spanish |
Fisher and CALLHOME Spanish-English Speech Translation |
ASR/Machine Translation/Speech Translation |
ES->EN |
https://catalog.ldc.upenn.edu/LDC2014T23 |
|
fisher_swbd |
Fisher English Training Speech, Switchboard-1 Release 2 |
ASR |
EN |
https://catalog.ldc.upenn.edu/LDC2004S13, https://catalog.ldc.upenn.edu/LDC2005S13, https://catalog.ldc.upenn.edu/LDC97S62 |
|
hkust |
HKUST Mandarin Telephone Speech |
ASR |
ZH |
https://catalog.ldc.upenn.edu/LDC2005S15, https://catalog.ldc.upenn.edu/LDC2005T32 |
|
how2 |
How2: A Large-scale Dataset for Multimodal Language Understanding |
ASR/Machine Translation/Speech Translation |
EN->PT |
https://github.com/srvk/how2-dataset |
|
hub4_spanish |
1997 Spanish Broadcast News Speech (HUB4-NE) |
ASR |
ES |
https://catalog.ldc.upenn.edu/LDC98S74, https://catalog.ldc.upenn.edu/LDC98T29 |
|
iwslt18 |
International Workshop on Spoken Language Translation 2018 |
ASR/Machine Translation/Speech Translation |
EN->DE |
https://sites.google.com/site/iwsltevaluation2018/Lectures-task |
|
jnas |
ASJ Japanese Newspaper Article Sentences Read Speech Corpus (JNAS) |
ASR/TTS |
JP |
http://research.nii.ac.jp/src/JNAS.html |
|
jsalt18e2e |
Multilingual End-to-end ASR for Incomplete Data Benchmark |
Multilingual ASR |
~20 Languages |
https://www.clsp.jhu.edu/workshops/18-workshop/multilingual-end-end-asr-incomplete-data/ |
babel+ |
jsut |
Japanese speech corpus of Saruwatari-lab., University of Tokyo |
ASR/TTS |
JP |
https://sites.google.com/site/shinnosuketakamichi/publication/jsut |
|
jvs |
JVS (Japanese versatile speech) corpus |
TTS |
JP |
https://sites.google.com/site/shinnosuketakamichi/research-topics/jvs_corpus |
|
li10 |
Language-Independent ASR task (10 languages) |
Multilingual ASR |
~10 Languages |
https://www.merl.com/publications/docs/TR2017-182.pdf |
csj+hkust+voxforge(7lang)+wsj |
libri_trans |
Translation Augmented LibriSpeech Corpus |
ASR/Machine Translation/Speech Translation |
EN->FR |
https://persyval-platform.univ-grenoble-alpes.fr/DS91/detaildataset |
|
librispeech |
LibriSpeech ASR corpus |
ASR |
EN |
http://www.openslr.org/12 |
|
libritts |
LibriTTS: A Corpus Derived from LibriSpeech for Text-to-Speech |
TTS |
EN |
http://www.openslr.org/60/ |
|
ljspeech |
The LJ Speech Dataset |
TTS |
EN |
https://keithito.com/LJ-Speech-Dataset/ |
|
m_ailabs |
The M-AILABS Speech Dataset |
TTS |
~5 Languages |
https://www.caito.de/2019/01/the-m-ailabs-speech-dataset/ |
|
must_c |
MuST-C Multilingual Speech Translation Corpus |
ASR/Machine Translation/Speech Translation |
EN->{DE, ES, FR, IT, NL, PT, RO, RU} |
https://ict.fbk.eu/must-c/ |
|
reverb |
REVERB (REverberant Voice Enhancement and Recognition Benchmark) challenge |
ASR |
EN |
https://reverb2014.dereverberation.com/ |
|
ru_open_stt |
Russian Open Speech To Text (STT/ASR) Dataset |
ASR |
RU |
https://github.com/snakers4/open_stt |
|
swbd |
The Switchboard corpus |
ASR |
EN |
https://catalog.ldc.upenn.edu/LDC97S62 |
|
tedlium2 |
TED-LIUM corpus release 2 |
ASR |
EN |
https://www.openslr.org/19/, http://www.lrec-conf.org/proceedings/lrec2014/pdf/1104_Paper.pdf |
|
tedlium3 |
TED-LIUM corpus release 3 |
ASR |
EN |
http://www.openslr.org/51/, https://arxiv.org/pdf/1805.04699 |
|
timit |
TIMIT Acoustic-Phonetic Continuous Speech Corpus |
ASR |
EN |
https://catalog.ldc.upenn.edu/LDC93S1 |
|
tweb |
The World English Bible |
TTS |
EN |
https://www.kaggle.com/bryanpark/the-world-english-bible-speech-dataset |
|
vais1000 |
VAIS-1000 |
TTS |
VI |
https://ieee-dataport.org/documents/vais-1000-vietnamese-speech-synthesis-corpus |
|
vivos |
VIVOS (Vietnamese corpus for ASR) |
ASR |
VI |
https://ailab.hcmus.edu.vn/vivos/ |
|
voxforge |
VoxForge |
ASR |
7 Languages |
http://www.voxforge.org/ |
|
wsj |
CSR-I (WSJ0) Complete, CSR-II (WSJ1) Complete |
ASR |
EN |
https://catalog.ldc.upenn.edu/LDC93S6A, https://catalog.ldc.upenn.edu/LDC94S13A |
|
wsj_mix |
MERL WSJ0-mix multi-speaker dataset |
Multispeaker ASR |
EN |
http://www.merl.com/demos/deep-clustering |
|
yesno |
The "yesno" corpus |
ASR |
HE |
http://www.openslr.org/1 |
|