"If you can’t measure it, you can’t improve it." -- Peter Drucker
The SpeechIO leaderboard serves as an ASR benchmarking platform by providing three components:
- TestSet Zoo: a collection of test sets covering a wide range of speech recognition tasks and scenarios
- Model Zoo: a collection of models, including commercial APIs and open-source models
- Benchmarking Pipeline: a simple, well-specified pipeline that takes care of data preparation, recognition, post-processing, and error rate evaluation
With these components, people should be able to easily benchmark, reproduce, and examine each other's ASR systems.
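The last pipeline step, error rate evaluation, is reported in the leaderboards as CER (character error rate). The sketch below shows one common way to compute a pooled CER: total character-level edit distance divided by total reference length. It is an illustration only, not the repository's implementation; the function names `edit_distance` and `cer` are hypothetical, and the text normalization that a real pipeline applies during post-processing is deliberately left out.

```python
def edit_distance(ref: str, hyp: str) -> int:
    """Levenshtein distance between two character sequences."""
    # prev is the previous row of the DP table: prev[j] is the distance
    # between the first i-1 characters of ref and the first j of hyp.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def cer(pairs: list) -> float:
    """Pooled CER in percent over a list of (reference, hypothesis) pairs."""
    errors = sum(edit_distance(ref, hyp) for ref, hyp in pairs)
    ref_chars = sum(len(ref) for ref, _ in pairs)
    if ref_chars == 0:
        raise ValueError("reference text is empty")
    return 100.0 * errors / ref_chars


# Toy example: one substituted character out of ten reference characters.
pairs = [("今天天气很好", "今天天汽很好"), ("语音识别", "语音识别")]
print(f"CER = {cer(pairs):.2f}%")
```

Pooling errors over all utterances weights long utterances more heavily than averaging per-utterance rates would; which convention a given leaderboard uses should be checked against its own pipeline.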
Academic Test Sets (EN & ZH)
UNLOCKED | DATASET_ID | DESCRIPTION | LANGUAGE |
---|---|---|---|
✓ | AISHELL1_TEST | test set of AISHELL-1 | zh |
✓ | AISHELL2_IOS_TEST | test set of AISHELL-2 (iOS channel) | zh |
✓ | AISHELL2_ANDROID_TEST | test set of AISHELL-2 (Android channel) | zh |
✓ | AISHELL2_MIC_TEST | test set of AISHELL-2 (Microphone channel) | zh |
✓ | ALIMEETING_EVAL_NEAR_FIELD | near-field eval set of AliMeeting | zh |
✓ | ALIMEETING_TEST_NEAR_FIELD | near-field test set of AliMeeting | zh |
✓ | ALIMEETING_EVAL_FAR_FIELD | far-field eval set of AliMeeting | zh |
✓ | ALIMEETING_TEST_FAR_FIELD | far-field test set of AliMeeting | zh |
✓ | LIBRISPEECH_TEST_CLEAN | "test_clean" set of LibriSpeech | en |
✓ | LIBRISPEECH_TEST_OTHER | "test_other" set of LibriSpeech | en |
✓ | TEDLIUM_RELEASE3_LEGACY_DEV | dev set of TED-LIUM release 3 (legacy directory) | en |
✓ | TEDLIUM_RELEASE3_LEGACY_TEST | test set of TED-LIUM release 3 (legacy directory) | en |
✓ | GIGASPEECH_V1.0.0_DEV | dev set of GigaSpeech | en |
✓ | GIGASPEECH_V1.0.0_TEST | test set of GigaSpeech | en |
✓ | VOXPOPULI_V1.0_EN_DEV | dev set of VoxPopuli | en |
✓ | VOXPOPULI_V1.0_EN_TEST | test set of VoxPopuli | en |
✓ | VOXPOPULI_V1.0_EN_ACCENTED_TEST | accented test set of VoxPopuli | en |
✓ | COMMON_VOICE_V11.0_DEV | dev set of Common Voice | en |
✓ | COMMON_VOICE_V11.0_TEST | test set of Common Voice | en |
SpeechIO Test Sets (ZH)
SpeechIO test sets are carefully curated by the SpeechIO authors: crawled from publicly available sources (YouTube, TV programs, podcasts, etc.), covering a variety of well-known scenarios and topics, and transcribed by paid professional annotators.
UNLOCKED | DATASET_ID | NAME | SCENARIO | TOPIC | HOURS | DIFFICULTY (1-5) |
---|---|---|---|---|---|---|
✓ | SPEECHIO_ASR_ZH00000 | for debugging | conference & speech | economy, currency, finance | 1.0 | ★★☆ |
✓ | SPEECHIO_ASR_ZH00001 | 新闻联播 | TV News | news & politics | 9 | ★ |
✓ | SPEECHIO_ASR_ZH00002 | 鲁豫有约 | TV interview | celebrity & film & music & daily | 3 | ★★☆ |
✓ | SPEECHIO_ASR_ZH00003 | 天下足球 | TV program | Sports & Football & World Cup | 2.7 | ★★☆ |
✓ | SPEECHIO_ASR_ZH00004 | 罗振宇跨年演讲 | Stadium Public Speech | Society & Culture & Business Trend | 2.7 | ★★ |
✓ | SPEECHIO_ASR_ZH00005 | 李永乐讲堂 | Online Education | Popular Science | 4.4 | ★★★ |
✓ | SPEECHIO_ASR_ZH00006 | 王者荣耀 张大仙 & 骚白 | Live Broadcasting | Game | 1.6 | ★★★☆ |
✓ | SPEECHIO_ASR_ZH00007 | 直播带货 李佳琪 & 薇娅 | Live Broadcasting | Makeup & Online shopping/advertising | 0.9 | ★★★★☆ |
✓ | SPEECHIO_ASR_ZH00008 | 老罗语录 | Offline lecture | Life & Purpose & Ethics | 1.3 | ★★★★☆ |
✓ | SPEECHIO_ASR_ZH00009 | 故事FM | Podcast | Ordinary Life Story Telling | 4.5 | ★★☆ |
✓ | SPEECHIO_ASR_ZH00010 | 创业内幕 | Podcast | Startup & Entrepreneur & Product & Investment | 4.2 | ★★☆ |
✓ | SPEECHIO_ASR_ZH00011 | 罗翔刑法法考 | Online Education | Law & Lawyer Qualification Exams | 3.4 | ★★☆ |
✓ | SPEECHIO_ASR_ZH00012 | 张雪峰考研 | Online Education | University & Graduate School Entrance Exams | 3.4 | ★★★☆ |
✓ | SPEECHIO_ASR_ZH00013 | 谷阿莫 牛叔说电影 | VLog | Movie Cuts | 1.8 | ★★★ |
✓ | SPEECHIO_ASR_ZH00014 | 贫穷料理 琼斯爱生活 | VLog | Food & Cooking & Gourmet | 1 | ★★★☆ |
✓ | SPEECHIO_ASR_ZH00015 | 单田芳 白眉大侠 | Traditional Podcast | Kung Fu Fiction | 2.2 | ★★☆ |
✗ | SPEECHIO_ASR_ZH00016 | 德云社演出 | Theater Crosstalk Show | Funny Stories | 1 | ★★★ |
✗ | SPEECHIO_ASR_ZH00017 | 吐槽大会 | Standup Comedy | Celebrity Jokes | 1.8 | ★★☆ |
✗ | SPEECHIO_ASR_ZH00018 | 小猪佩奇 熊出没 | Children Cartoon | Fairy Tale | 0.9 | ★☆ |
✗ | SPEECHIO_ASR_ZH00019 | CCTV5 NBA 转播 | Sports Game Live | NBA Game | 0.7 | ★★★ |
✗ | SPEECHIO_ASR_ZH00020 | 篮球人物 | Documentary | NBA Super Stars' Life & History | 2.2 | ★★ |
✗ | SPEECHIO_ASR_ZH00021 | 汽车之家评测 | VLog | Car benchmarks, Road driving test | 1.7 | ★★★☆ |
✗ | SPEECHIO_ASR_ZH00022 | 小艾大叔 豪宅带看 | VLog | Real estate, Mansion tour | 1.7 | ★★★ |
✗ | SPEECHIO_ASR_ZH00023 | 无聊开箱 Zealer评测 | VLog | Unboxing | 2 | ★★★ |
✗ | SPEECHIO_ASR_ZH00024 | 付老师种植技术 | VLog | Agriculture, Planting | 2.7 | ★★★☆ |
✗ | SPEECHIO_ASR_ZH00025 | 石国鹏讲历史 | Offline lecture | History, Greek philosophy | 1.3 | ★★☆ |
✗ | SPEECHIO_ASR_ZH00026 | 张震鬼故事 | Broadcasting Program | Horror Stories | 2.4 | ★★★ |
✗ | SPEECHIO_ASR_ZH00027 | 华语辩论世界杯 | Debates Contest | Hobby, Skill, Growth | 1.4 | ★★★ |
✗ | SPEECHIO_ASR_ZH00028 | 时政现场同传 | Simultaneous Translation | News & Events on Public Governance | 2.1 | ★★★☆ |
✗ | SPEECHIO_ASR_ZH00029 | 港台明星访谈 周杰伦,曾志伟 张家辉,陈小春 周星驰 | Hong Kong/Taiwan Accents | Entertainment, Acting, Music | 1.5 | ★★★☆ |
✗ | SPEECHIO_ASR_ZH00030 | 世界青年说 | Foreigner Accents | Cultural Difference | 2 | ★★★☆ |
✗ | SPEECHIO_ASR_ZH00031 | 东方甄选 | Live Broadcasting | Online advertising & English Education | 2.4 | ★★★☆ |
✗ | SPEECHIO_ASR_ZH00032 | 郎朗钢琴课 | Long-form video | Music & piano | 1.7 | ★★☆ |
✗ | SPEECHIO_ASR_ZH00033 | 老石谈芯 | VLog | Chips | 2.8 | ★★★ |
✗ | SPEECHIO_ASR_ZH00034 | 电丸科技AK | VLog | Internet tech, IT | 1.4 | ★★★☆ |
✗ | SPEECHIO_ASR_ZH00035 | 新氧医美 | VLog | Medical Cosmetology | 1.4 | ★★ |
✗ | SPEECHIO_ASR_ZH00036 | 交通广播 | Traffic radio | Traffic & entertainment | 1.2 | ★★★☆ |
✗ | SPEECHIO_ASR_ZH00037 | 老俞闲聊 | Online meeting | Chat | 2.4 | ★★★ |
Test sets can be downloaded via:
ops/pull -d <DATASET_ID>
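When several test sets are needed, it can help to generate the `ops/pull -d` calls from the tables above instead of typing each ID. The following is a minimal sketch under stated assumptions: the helper names `unlocked_dataset_ids` and `pull_commands` are hypothetical, the rows are assumed to start with the `✓`/`✗` cell followed by the DATASET_ID cell as in this README, and nothing is executed; the commands are only built and printed.

```python
def unlocked_dataset_ids(table_rows):
    """Return the DATASET_ID of every row whose first cell is the ✓ marker."""
    ids = []
    for row in table_rows:
        cells = [cell.strip() for cell in row.split("|")]
        # Header, separator, and locked (✗) rows are skipped.
        if len(cells) >= 2 and cells[0] == "✓":
            ids.append(cells[1])
    return ids


def pull_commands(dataset_ids):
    """Build one `ops/pull -d <DATASET_ID>` argv list per dataset."""
    return [["ops/pull", "-d", dataset_id] for dataset_id in dataset_ids]


# A few rows copied from the test set tables.
rows = [
    "✓ | AISHELL1_TEST | test set of AISHELL-1 | zh |",
    "✓ | LIBRISPEECH_TEST_CLEAN | \"test_clean\" set of LibriSpeech | en |",
    "✗ | SPEECHIO_ASR_ZH00016 | 德云社演出 | Theater Crosstalk Show |",
]

for argv in pull_commands(unlocked_dataset_ids(rows)):
    print(" ".join(argv))
```

To actually run the commands, each argv list could be passed to `subprocess.run(argv, check=True)` from the repository root; that working directory is an assumption about the repository layout.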
EN Models
MODEL_ID | TYPE | PROVIDER/AUTHOR | DESCRIPTION | URL |
---|---|---|---|---|
aliyun_api_en | Cloud | Alibaba | | link |
amazon_api_en | Cloud | Amazon AWS | | link |
baidu_api_en | Cloud | Baidu | | link |
google_api_en | Cloud | Google | | link |
microsoft_sdk_en | Cloud | Microsoft Azure | | link |
tencent_api_en | Cloud | Tencent | | link |
coqui_model_en | Local (supervised) | coqui | | link |
deepspeech_model_en | Local (supervised) | deepspeech | | link |
k2_gigaspeech | Local (supervised) | k2-fsa | | link |
nemo_conformer_ctc_large_en | Local (supervised) | NVIDIA NeMo | | link |
nemo_conformer_transducer_xlarge_en | Local (supervised) | NVIDIA NeMo | | link |
vosk_model_en | Local (supervised) | alphacephei | | link |
vosk_model_en_large | Local (supervised) | alphacephei | | link |
whisper_large | Local (supervised) | OpenAI | | link |
data2vec_audio_large_ft_libri_960h | Local | Facebook AI | | link |
hubert_xlarge_ft_libri_960h | Local | Facebook AI | | link |
wav2vec2_large_robust_ft_libri_960h | Local | Facebook AI | | link |
wavlm_base_plus_ft_libri_clean_100h | Local | Microsoft, patrickvonplaten | | link |
ZH Models
Cloud Models
MODEL_ID | TYPE | PROVIDER | DESCRIPTION | URL |
---|---|---|---|---|
aispeech_api_zh | Cloud | AISpeech | AISpeech Open Platform | link |
aliyun_api_zh | Cloud | Alibaba | Alibaba Cloud: short-utterance recognition | link |
aliyun_ftasr_api_zh | Cloud | Alibaba | Alibaba Cloud: file transcription (non-streaming) | link |
baidu_pro_api_zh | Cloud | Baidu | Baidu AI Cloud (fast edition) | link |
bilibili_api_zh | Cloud | bilibili | bilibili AI Open Platform | not available yet |
hiasr_api_zh | Cloud | ximalaya | Ximalaya AI Open Platform (transcription, non-streaming) | not available yet |
iflytek_lfasr_api_zh | Cloud | iFLYTEK | iFLYTEK Open Platform (transcription, non-streaming) | link |
microsoft_sdk_zh | Cloud | Microsoft | Azure | link |
tencent_api_zh | Cloud | Tencent | Tencent Cloud | link |
yitu_api_zh | Cloud | YituTech | YITU Speech Open Platform | link |
Local Models
MODEL_ID | TYPE | AUTHOR | DESCRIPTION |
---|---|---|---|
speechio_kaldi_multicn | Local | Xingyu NA(那兴宇) | Kaldi multi_cn recipe |
vosk_model_cn | Local | alphacephei | Chinese engine of Vosk |
- Cloud Models are Cloud API clients (e.g. Google Cloud, Azure), already stored in this GitHub repo.
- Local Models are local ASR engines (e.g. pretrained models based on open-source toolkits) that can be downloaded via:
ops/pull -m <MODEL_ID>
Follow this specification. Existing models are good references as well.
With models and test sets downloaded to your machine, the benchmarking pipeline can be triggered via:
ops/benchmark -m <MODEL_ID> -d <DATASET_ID>
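A leaderboard run typically covers many model and test set combinations. As a rough sketch, the snippet below enumerates one `ops/benchmark -m <MODEL_ID> -d <DATASET_ID>` call per pair. The helper name `benchmark_commands` is hypothetical, the IDs are examples taken from the tables in this README, and the commands are only printed, not executed.

```python
import shlex
from itertools import product


def benchmark_commands(model_ids, dataset_ids):
    """Return one `ops/benchmark` argv list per (model, dataset) pair."""
    return [
        ["ops/benchmark", "-m", model_id, "-d", dataset_id]
        for model_id, dataset_id in product(model_ids, dataset_ids)
    ]


# Example IDs taken from the model zoo and test set tables.
models = ["speechio_kaldi_multicn", "vosk_model_cn"]
datasets = ["AISHELL1_TEST", "SPEECHIO_ASR_ZH00001"]

commands = benchmark_commands(models, datasets)
for argv in commands:
    # Print only; nothing is executed here.
    print(shlex.join(argv))
```

Keeping each command as an argv list rather than a single string avoids shell quoting issues if the list is later handed to `subprocess.run`.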
Rank | Model | CER | Date |
---|---|---|---|
1 | aliyun_ftasr_api_zh | 1.91% | 2022.11 |
2 | microsoft_sdk_zh | 2.42% | 2022.11 |
3 | yitu_api_zh | 2.62% | 2022.11 |
4 | tencent_api_zh | 2.94% | 2022.11 |
5 | iflytek_lfasr_api_zh | 3.36% | 2022.11 |
6 | aispeech_api_zh | 3.46% | 2022.11 |
7 | baidu_pro_api_zh | 6.64% | 2022.11 |
Rank | Model | CER | Date |
---|---|---|---|
1 | aliyun_ftasr_api_zh | 2.85% | 2022.11 |
2 | yitu_api_zh | 3.16% | 2022.11 |
3 | microsoft_sdk_zh | 3.28% | 2022.11 |
4 | tencent_api_zh | 3.85% | 2022.11 |
5 | iflytek_lfasr_api_zh | 4.05% | 2022.11 |
6 | aispeech_api_zh | 5.19% | 2022.11 |
7 | baidu_pro_api_zh | 8.14% | 2022.11 |
Model | CER | Date |
---|---|---|
bilibili_api_zh(*) | 2.43% | 2022.11 |
Detailed results (CER %)
Test Set ID | Scenario & Topic | bilibili_api_zh | Date |
---|---|---|---|
SPEECHIO_ASR_ZH00001 | TV news: 新闻联播 | 0.61 | 2022.11 |
SPEECHIO_ASR_ZH00002 | TV interview: 鲁豫有约 | 2.90 | 2022.11 |
SPEECHIO_ASR_ZH00003 | TV program: 天下足球 | 0.98 | 2022.11 |
SPEECHIO_ASR_ZH00004 | Stadium speech: 罗振宇跨年 | 1.59 | 2022.11 |
SPEECHIO_ASR_ZH00005 | Online education: 李永乐, popular science | 1.49 | 2022.11 |
SPEECHIO_ASR_ZH00006 | Live streaming: 王者荣耀, 张大仙 & 骚白 | 5.88 | 2022.11 |
SPEECHIO_ASR_ZH00007 | Live-stream shopping: 李佳琪 & 薇娅 | 6.26 | 2022.11 |
SPEECHIO_ASR_ZH00008 | Offline lecture: 老罗语录 | 3.78 | 2022.11 |
SPEECHIO_ASR_ZH00009 | Podcast: 故事FM | 3.26 | 2022.11 |
SPEECHIO_ASR_ZH00010 | Podcast: 创业内幕 | 3.59 | 2022.11 |
SPEECHIO_ASR_ZH00011 | Online education: 罗翔, criminal law & bar exam | 1.92 | 2022.11 |
SPEECHIO_ASR_ZH00012 | Online education: 张雪峰, graduate school entrance exams | 2.12 | 2022.11 |
SPEECHIO_ASR_ZH00013 | Short video, movie cuts: 谷阿莫 & 牛叔说电影 | 3.07 | 2022.11 |
SPEECHIO_ASR_ZH00014 | Short video: food & cooking | 3.74 | 2022.11 |
SPEECHIO_ASR_ZH00015 | Storytelling (pingshu): 单田芳, 白眉大侠 | 4.79 | 2022.11 |
Model | CER | Date |
---|---|---|
bilibili_api_zh(*) | 2.82% | 2022.11 |
Detailed results on all test sets (CER %)
Test Set ID | Scenario & Topic | bilibili_api_zh | Date |
---|---|---|---|
SPEECHIO_ASR_ZH00001 | TV news: 新闻联播 | 0.61 | 2022.11 |
SPEECHIO_ASR_ZH00002 | TV interview: 鲁豫有约 | 2.90 | 2022.11 |
SPEECHIO_ASR_ZH00003 | TV program: 天下足球 | 0.98 | 2022.11 |
SPEECHIO_ASR_ZH00004 | Stadium speech: 罗振宇跨年 | 1.59 | 2022.11 |
SPEECHIO_ASR_ZH00005 | Online education: 李永乐, popular science | 1.49 | 2022.11 |
SPEECHIO_ASR_ZH00006 | Live streaming: 王者荣耀, 张大仙 & 骚白 | 5.88 | 2022.11 |
SPEECHIO_ASR_ZH00007 | Live-stream shopping: 李佳琪 & 薇娅 | 6.26 | 2022.11 |
SPEECHIO_ASR_ZH00008 | Offline lecture: 老罗语录 | 3.78 | 2022.11 |
SPEECHIO_ASR_ZH00009 | Podcast: 故事FM | 3.26 | 2022.11 |
SPEECHIO_ASR_ZH00010 | Podcast: 创业内幕 | 3.59 | 2022.11 |
SPEECHIO_ASR_ZH00011 | Online education: 罗翔, criminal law & bar exam | 1.92 | 2022.11 |
SPEECHIO_ASR_ZH00012 | Online education: 张雪峰, graduate school entrance exams | 2.12 | 2022.11 |
SPEECHIO_ASR_ZH00013 | Short video, movie cuts: 谷阿莫 & 牛叔说电影 | 3.07 | 2022.11 |
SPEECHIO_ASR_ZH00014 | Short video: food & cooking | 3.74 | 2022.11 |
SPEECHIO_ASR_ZH00015 | Storytelling (pingshu): 单田芳, 白眉大侠 | 4.79 | 2022.11 |
SPEECHIO_ASR_ZH00016 | Crosstalk: 德云社 show | 3.04 | 2022.11 |
SPEECHIO_ASR_ZH00017 | Stand-up comedy: 吐槽大会 | 2.96 | 2022.11 |
SPEECHIO_ASR_ZH00018 | Children's cartoon: 小猪佩奇 & 熊出没 | 2.03 | 2022.11 |
SPEECHIO_ASR_ZH00019 | Sports commentary: NBA games | 2.25 | 2022.11 |
SPEECHIO_ASR_ZH00020 | Documentary: 篮球人物 | 1.54 | 2022.11 |
SPEECHIO_ASR_ZH00021 | Short video: 汽车之家, car reviews | 1.76 | 2022.11 |
SPEECHIO_ASR_ZH00022 | Short video: 小艾大叔, mansion tours | 3.39 | 2022.11 |
SPEECHIO_ASR_ZH00023 | Short video, unboxing: Zealer & 无聊开箱 | 2.24 | 2022.11 |
SPEECHIO_ASR_ZH00024 | Short video: 付老师, agriculture & planting | 5.05 | 2022.11 |
SPEECHIO_ASR_ZH00025 | Offline lecture: 石国鹏, Greek philosophy | 3.31 | 2022.11 |
SPEECHIO_ASR_ZH00026 | Radio program: 张震鬼故事 | 3.74 | 2022.11 |
SPEECHIO_ASR_ZH00027 | Chinese-language university debate contest: hobby, skill, growth | 2.14 | 2022.11 |
SPEECHIO_ASR_ZH00028 | Simultaneous interpretation: politics & public governance | 2.07 | 2022.11 |
SPEECHIO_ASR_ZH00029 | Hong Kong/Taiwan accents: celebrity interviews | 4.10 | 2022.11 |
SPEECHIO_ASR_ZH00030 | Foreigner accents: 世界青年说 | 4.00 | 2022.11 |
SPEECHIO_ASR_ZH00031 | Live-stream shopping: 东方甄选 | 3.97 | 2022.11 |
SPEECHIO_ASR_ZH00032 | Music: 郎朗钢琴课 | 4.14 | 2022.11 |
SPEECHIO_ASR_ZH00033 | Chips: 老石谈芯 | 2.83 | 2022.11 |
SPEECHIO_ASR_ZH00034 | Internet & IT: 电丸科技AK | 5.80 | 2022.11 |
SPEECHIO_ASR_ZH00035 | Medical cosmetology: 新氧医美 | 1.24 | 2022.11 |
SPEECHIO_ASR_ZH00036 | Traffic radio: 信不信由你 | 6.17 | 2022.11 |
SPEECHIO_ASR_ZH00037 | Online meeting chat: 老俞闲话 | 3.08 | 2022.11 |
Note: models marked with (*) can be found in the model zoo, but are not yet universally available to the public.
Email: [email protected]