aishell/asr1
ami/asr1
an4
arctic
aurora4/asr1
babel/asr1
blizzard17/tts1
chime4
chime5/asr1
cmu_wilderness
commonvoice/asr1
csj/asr1
csmsc/tts1
dirha_wsj/asr1
fisher_callhome_spanish
fisher_swbd/asr1
hkust/asr1
how2
hub4_spanish/asr1
iwslt18
iwslt19/asr1
jnas
jsalt18e2e/asr1
jsut
jvs
li10/asr1
libri_trans
librispeech
libritts/tts1
ljspeech
m_ailabs/tts1
mini_an4
must_c
reverb
ru_open_stt/asr1
swbd/asr1
tedlium2/asr1
tedlium3/asr1
timit/asr1
tweb
vais1000/tts1
vivos
voxforge/asr1
wsj/asr1
wsj_mix/asr1
yesno
README.md

README.md

Overview of example information

Directory name	Corpus name	Task	Language	URL	Note
aishell	AISHELL-ASR0009-OS1 Open Source Mandarin Speech Corpus	ASR	ZH	http://www.aishelltech.com/kysjcp
ami	The AMI Meeting Corpus	ASR	EN	http://groups.inf.ed.ac.uk/ami/corpus/
an4	CMU AN4 database	ASR/TTS	EN	http://www.speech.cs.cmu.edu/databases/an4/
arctic	CMU ARCTIC databases	TTS	EN	http://www.festvox.org/cmu_arctic/
aurora4	Aurora-4 database	ASR	EN	http://aurora.hsnr.de/aurora-4.html
babel	IARPA Babel corpus	ASR	~20 Languages	https://www.iarpa.gov/index.php/research-programs/babel
blizzard17	Blizzard Challenge 2017	TTS	EN	https://www.synsig.org/index.php/Blizzard_Challenge_2017
chime4	The 4th CHiME Speech Separation and Recognition Challenge	ASR/Multichannel ASR	EN	http://spandh.dcs.shef.ac.uk/chime_challenge/chime2016/
chime5	The 5th CHiME Speech Separation and Recognition Challenge	ASR	EN	http://spandh.dcs.shef.ac.uk/chime_challenge/
cmu_wilderness	CMU Wilderness Multilingual Speech Dataset	Multilingual ASR	~100 Languages	https://github.com/festvox/datasets-CMU_Wilderness
commonvoice	The Mozilla Common Voice	ASR	13 Languages	https://voice.mozilla.org/datasets
csj	Corpus of Spontaneous Japanese	ASR	JP	https://pj.ninjal.ac.jp/corpus_center/csj/en/
csmsc	Chinese Standard Mandarin Speech Corpus	TTS	ZH	https://www.data-baker.com/open_source.html
dirha_wsj	Distant-speech Interaction for Robust Home Applications	Multi-Array ASR	EN	https://dirha.fbk.eu/, https://github.com/SHINE-FBK/DIRHA_English_wsj
fisher_callhome_spanish	Fisher and CALLHOME Spanish--English Speech Translation	ASR/Machine Translation/Speech Translation	ES->EN	https://catalog.ldc.upenn.edu/LDC2014T23
fisher_swbd	Fisher English Training Speech, Switchboard-1 Release 2	ASR	EN	https://catalog.ldc.upenn.edu/LDC2004S13, https://catalog.ldc.upenn.edu/LDC2005S13, https://catalog.ldc.upenn.edu/LDC97S62
hkust	HKUST Mandarin Telephone Speech	ASR	ZH	https://catalog.ldc.upenn.edu/LDC2005S15, https://catalog.ldc.upenn.edu/LDC2005T32
how2	How2: A Large-scale Dataset for Multimodal Language Understanding	ASR/Machine Translation/Speech Translation	EN->PT	https://github.com/srvk/how2-dataset
hub4_spanish	1997 Spanish Broadcast News Speech (HUB4-NE)	ASR	ES	https://catalog.ldc.upenn.edu/LDC98S74, https://catalog.ldc.upenn.edu/LDC98T29
iwslt18	International Workshop on Spoken Language Translation 2018	ASR/Machine Translation/Speech Translation	EN->DE	https://sites.google.com/site/iwsltevaluation2018/Lectures-task
jnas	ASJ Japanese Newspaper Article Sentences Read Speech Corpus (JNAS)	ASR/TTS	JP	http://research.nii.ac.jp/src/JNAS.html
jsalt18e2e	Multilingual End-to-end ASR for Incomplete Data Benchmark	Multilingual ASR	~20 Languages	https://www.clsp.jhu.edu/workshops/18-workshop/multilingual-end-end-asr-incomplete-data/	babel+
jsut	Japanese speech corpus of Saruwatari-lab., University of Tokyo	ASR/TTS	JP	https://sites.google.com/site/shinnosuketakamichi/publication/jsut
jvs	JVS (Japanese versatile speech) corpus	TTS	JP	https://sites.google.com/site/shinnosuketakamichi/research-topics/jvs_corpus
li10	Language-Independent ASR task (10 languages)	Multilingual ASR	~10 Languages	https://www.merl.com/publications/docs/TR2017-182.pdf	csj+hkust+voxforge(7lang)+wsj
libri_trans	Translation Augmented LibriSpeech Corpus	ASR/Machine Translation/Speech Translation	EN->FR	https://persyval-platform.univ-grenoble-alpes.fr/DS91/detaildataset
librispeech	LibriSpeech ASR corpus	ASR	EN	http://www.openslr.org/12
libritts	LibriTTS: A Corpus Derived from LibriSpeech for Text-to-Speech	TTS	EN	http://www.openslr.org/60/
ljspeech	The LJ Speech Dataset	TTS	EN	https://keithito.com/LJ-Speech-Dataset/
m_ailabs	The M-AILABS Speech Dataset	TTS	~5 languages	https://www.caito.de/2019/01/the-m-ailabs-speech-dataset/
must_c	MuST-C Multilingual Speech Translation Corpus	ASR/Machine Translation/Speech Translation	EN->{DE, ES, FR, IT, NL, PT, RO, RU}	https://ict.fbk.eu/must-c/
reverb	REVERB (REverberant Voice Enhancement and Recognition Benchmark) challenge	ASR	EN	https://reverb2014.dereverberation.com/
ru_open_stt	Russian Open Speech To Text (STT/ASR) Dataset	ASR	RU	https://github.com/snakers4/open_stt
swbd	The Switchboard corpus	ASR	EN	https://catalog.ldc.upenn.edu/LDC97S62
tedlium2	TED-LIUM corpus release 2	ASR	EN	https://www.openslr.org/19/, http://www.lrec-conf.org/proceedings/lrec2014/pdf/1104_Paper.pdf
tedlium3	TED-LIUM corpus release 3	ASR	EN	http://www.openslr.org/51/, https://arxiv.org/pdf/1805.04699
timit	TIMIT Acoustic-Phonetic Continuous Speech Corpus	ASR	EN	https://catalog.ldc.upenn.edu/LDC93S1
tweb	The World English Bible	TTS	EN	https://www.kaggle.com/bryanpark/the-world-english-bible-speech-dataset
vais1000	VAIS-1000	TTS	VI	https://ieee-dataport.org/documents/vais-1000-vietnamese-speech-synthesis-corpus
vivos	VIVOS (Vietnamese corpus for ASR)	ASR	VI	https://ailab.hcmus.edu.vn/vivos/
voxforge	VoxForge	ASR	7 languages	http://www.voxforge.org/
wsj	CSR-I (WSJ0) Complete, CSR-II (WSJ1) Complete	ASR	EN	https://catalog.ldc.upenn.edu/LDC93S6A, https://catalog.ldc.upenn.edu/LDC94S13A
wsj_mix	MERL WSJ0-mix multi-speaker dataset	Multispeaker ASR	EN	http://www.merl.com/demos/deep-clustering
yesno	The "yesno" corpus	ASR	HE	http://www.openslr.org/1
