Directory name | Corpus name | Task | Language | URL | Note |
---|---|---|---|---|---|
aishell | AISHELL-ASR0009-OS1 Open Source Mandarin Speech Corpus | ASR | ZH | http://www.aishelltech.com/kysjcp | |
ami | The AMI Meeting Corpus | ASR | EN | http://groups.inf.ed.ac.uk/ami/corpus/ | |
an4 | CMU AN4 database | ASR/TTS | EN | http://www.speech.cs.cmu.edu/databases/an4/ | |
arctic | CMU ARCTIC databases | TTS | EN | http://www.festvox.org/cmu_arctic/ | |
aurora4 | Aurora-4 database | ASR | EN | http://aurora.hsnr.de/aurora-4.html | |
babel | IARPA Babel corpus | ASR | ~20 Languages | https://www.iarpa.gov/index.php/research-programs/babel | |
blizzard_2017 | Blizzard Challenge 2017 | TTS | EN | https://www.synsig.org/index.php/Blizzard_Challenge_2017 | |
chime4 | The 4th CHiME Speech Separation and Recognition Challenge | ASR/Multichannel ASR | EN | http://spandh.dcs.shef.ac.uk/chime_challenge/chime2016/ | |
chime5 | The 5th CHiME Speech Separation and Recognition Challenge | ASR | EN | http://spandh.dcs.shef.ac.uk/chime_challenge/ | |
cmu_wilderness | CMU Wilderness Multilingual Speech Dataset | Multilingual ASR | ~100 Languages | https://github.com/festvox/datasets-CMU_Wilderness | |
commonvoice | The Mozilla Common Voice | ASR | 13 Languages | https://voice.mozilla.org/datasets | |
csj | Corpus of Spontaneous Japanese | ASR | JP | https://pj.ninjal.ac.jp/corpus_center/csj/en/ | |
csmsc | Chinese Standard Mandarin Speech Corpus | TTS | ZH | https://www.data-baker.com/open_source.html | |
dirha_wsj | Distant-speech Interaction for Robust Home Applications | Multi-Array ASR | EN | https://dirha.fbk.eu/, https://github.com/SHINE-FBK/DIRHA_English_wsj | |
fisher_callhome_spanish | Fisher and CALLHOME Spanish--English Speech Translation | ASR/Machine Translation/Speech Translation | ES->EN | https://catalog.ldc.upenn.edu/LDC2014T23 | |
fisher_swbd | Fisher English Training Speech, Switchboard-1 Release 2 | ASR | EN | https://catalog.ldc.upenn.edu/LDC2004S13, https://catalog.ldc.upenn.edu/LDC2005S13, https://catalog.ldc.upenn.edu/LDC97S62 | |
hkust | HKUST Mandarin Telephone Speech | ASR | ZH | https://catalog.ldc.upenn.edu/LDC2005S15, https://catalog.ldc.upenn.edu/LDC2005T32 | |
how2 | How2: A Large-scale Dataset for Multimodal Language Understanding | ASR/Machine Translation/Speech Translation | EN->PT | https://github.com/srvk/how2-dataset | |
hub4_spanish | 1997 Spanish Broadcast News Speech (HUB4-NE) | ASR | ES | https://catalog.ldc.upenn.edu/LDC98S74, https://catalog.ldc.upenn.edu/LDC98T29 | |
iwslt18 | International Workshop on Spoken Language Translation 2018 | ASR/Machine Translation/Speech Translation | EN->DE | https://sites.google.com/site/iwsltevaluation2018/Lectures-task | |
jnas | ASJ Japanese Newspaper Article Sentences Read Speech Corpus (JNAS) | ASR/TTS | JP | http://research.nii.ac.jp/src/JNAS.html | |
jsalt18e2e | Multilingual End-to-end ASR for Incomplete Data Benchmark | Multilingual ASR | ~20 Languages | https://www.clsp.jhu.edu/workshops/18-workshop/multilingual-end-end-asr-incomplete-data/ | babel+ |
jsut | Japanese speech corpus of Saruwatari-lab., University of Tokyo | ASR/TTS | JP | https://sites.google.com/site/shinnosuketakamichi/publication/jsut | |
jvs | JVS (Japanese versatile speech) corpus | TTS | JP | https://sites.google.com/site/shinnosuketakamichi/research-topics/jvs_corpus | |
li10 | Language-Independent ASR task (10 languages) | Multilingual ASR | ~10 Languages | https://www.merl.com/publications/docs/TR2017-182.pdf | csj+hkust+voxforge(7lang)+wsj |
libri_trans | Translation Augmented LibriSpeech Corpus | ASR/Machine Translation/Speech Translation | EN->FR | https://persyval-platform.univ-grenoble-alpes.fr/DS91/detaildataset | |
librispeech | LibriSpeech ASR corpus | ASR | EN | http://www.openslr.org/12 | |
libritts | LibriTTS: A Corpus Derived from LibriSpeech for Text-to-Speech | TTS | EN | http://www.openslr.org/60/ | |
ljspeech | The LJ Speech Dataset | TTS | EN | https://keithito.com/LJ-Speech-Dataset/ | |
m_ailabs | The M-AILABS Speech Dataset | TTS | ~5 Languages | https://www.caito.de/2019/01/the-m-ailabs-speech-dataset/ | |
must_c | MuST-C Multilingual Speech Translation Corpus | ASR/Machine Translation/Speech Translation | EN->{DE, ES, FR, IT, NL, PT, RO, RU} | https://ict.fbk.eu/must-c/ | |
reverb | REVERB (REverberant Voice Enhancement and Recognition Benchmark) challenge | ASR | EN | https://reverb2014.dereverberation.com/ | |
ru_open_stt | Russian Open Speech To Text (STT/ASR) Dataset | ASR | RU | https://github.com/snakers4/open_stt | |
swbd | The Switchboard corpus | ASR | EN | https://catalog.ldc.upenn.edu/LDC97S62 | |
tedlium2 | TED-LIUM corpus release 2 | ASR | EN | https://www.openslr.org/19/, http://www.lrec-conf.org/proceedings/lrec2014/pdf/1104_Paper.pdf | |
tedlium3 | TED-LIUM corpus release 3 | ASR | EN | http://www.openslr.org/51/, https://arxiv.org/pdf/1805.04699 | |
timit | TIMIT Acoustic-Phonetic Continuous Speech Corpus | ASR | EN | https://catalog.ldc.upenn.edu/LDC93S1 | |
tweb | The World English Bible | TTS | EN | https://www.kaggle.com/bryanpark/the-world-english-bible-speech-dataset | |
vais1000 | VAIS-1000 | TTS | VI | https://ieee-dataport.org/documents/vais-1000-vietnamese-speech-synthesis-corpus | |
vivos | VIVOS (Vietnamese corpus for ASR) | ASR | VI | https://ailab.hcmus.edu.vn/vivos/ | |
voxforge | VoxForge | ASR | 7 Languages | http://www.voxforge.org/ | |
wsj | CSR-I (WSJ0) Complete, CSR-II (WSJ1) Complete | ASR | EN | https://catalog.ldc.upenn.edu/LDC93S6A, https://catalog.ldc.upenn.edu/LDC94S13A | |
wsj_mix | MERL WSJ0-mix multi-speaker dataset | Multispeaker ASR | EN | http://www.merl.com/demos/deep-clustering | |
yesno | The "yesno" corpus | ASR | HE | http://www.openslr.org/1 | |