SpeechIO Leaderboard: a large, robust, comprehensive, benchmarking platform for Automatic Speech Recognition.

lijp22/Leaderboard

SpeechColab ASR leaderboard

1. Overview

"If you can’t measure it, you can’t improve it." -- Peter Drucker

The SpeechIO leaderboard serves as an ASR benchmarking platform by providing three components:

  1. TestSet Zoo: a collection of test sets covering a wide range of speech recognition tasks and scenarios

  2. Model Zoo: a collection of models, including commercial APIs and open-source models

  3. Benchmarking Pipeline: a simple, well-specified pipeline that handles data preparation, recognition, post-processing, and error rate evaluation

With the SpeechIO leaderboard, anyone should be able to benchmark, reproduce, and compare all these ASR systems locally.
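To make the "error rate evaluation" step concrete, here is a minimal sketch of a character error rate (CER) computation based on Levenshtein distance. It is not the pipeline's actual scoring code: the function names `edit_distance` and `cer` are made up for illustration, and any text normalization done in post-processing is ignored.

```python
from __future__ import annotations


def edit_distance(ref: str, hyp: str) -> int:
    """Levenshtein distance between two character sequences."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion of a reference character
                            curr[j - 1] + 1,      # insertion of a hypothesis character
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def cer(refs: list[str], hyps: list[str]) -> float:
    """Corpus-level CER in percent: total edits / total reference characters."""
    errors = sum(edit_distance(r, h) for r, h in zip(refs, hyps))
    total = sum(len(r) for r in refs)
    return 100.0 * errors / total
```

For example, `cer(["今天天气很好"], ["今天天汽很好"])` counts one substitution over six reference characters.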

2. TestSet Zoo

Academic Test Sets

| UNLOCKED | TEST_SET_ID | DESCRIPTION | LANGUAGE |
|---|---|---|---|
| | AISHELL1_TEST | test set of AISHELL-1 | zh |
| | AISHELL2_IOS_TEST | test set of AISHELL-2 (iOS channel) | zh |
| | AISHELL2_ANDROID_TEST | test set of AISHELL-2 (Android channel) | zh |
| | AISHELL2_MIC_TEST | test set of AISHELL-2 (Microphone channel) | zh |
| | LIBRISPEECH_TEST_CLEAN | "test_clean" set of LibriSpeech | en |
| | LIBRISPEECH_TEST_OTHER | "test_other" set of LibriSpeech | en |
| | GIGASPEECH_V1.0.0_DEV | dev set of GigaSpeech | en |
| | GIGASPEECH_V1.0.0_TEST | test set of GigaSpeech | en |

SpeechIO Test Sets

SpeechIO test sets are carefully curated by the SpeechIO authors. They are crawled from publicly available sources (YouTube, TV programs, podcasts, etc.), cover various well-known acoustic scenarios (AM) and topic domains (LM & vocabulary), and are labeled by paid professional annotators.

| UNLOCKED | TEST_SET_ID | NAME | SCENARIO | TOPIC | HOURS | DIFFICULTY (1-5) |
|---|---|---|---|---|---|---|
| | SPEECHIO_ASR_ZH00000 | for debugging | Conference & speech | Economy, currency, finance | 1.0 | ★★☆ |
| | SPEECHIO_ASR_ZH00001 | 新闻联播 | TV News | News & politics | 9 | |
| | SPEECHIO_ASR_ZH00002 | 鲁豫有约 | TV interview | Celebrity & film & music & daily | 3 | ★★☆ |
| | SPEECHIO_ASR_ZH00003 | 天下足球 | TV program | Sports & Football & World Cup | 2.7 | ★★☆ |
| | SPEECHIO_ASR_ZH00004 | 罗振宇跨年演讲 | Stadium Public Speech | Society & Culture & Business Trend | 2.7 | ★★ |
| | SPEECHIO_ASR_ZH00005 | 李永乐讲堂 | Online Education | Popular Science | 4.4 | ★★★ |
| | SPEECHIO_ASR_ZH00006 | 王者荣耀 (张大仙 & 骚白) | Live Broadcasting | Game | 1.6 | ★★★☆ |
| | SPEECHIO_ASR_ZH00007 | 直播带货 (李佳琪 & 薇娅) | Live Broadcasting | Makeup & Online shopping/advertising | 0.9 | ★★★★☆ |
| | SPEECHIO_ASR_ZH00008 | 老罗语录 | Offline lecture | Life & Purpose & Ethics | 1.3 | ★★★★☆ |
| | SPEECHIO_ASR_ZH00009 | 故事FM | Podcast | Ordinary Life Story Telling | 4.5 | ★★☆ |
| | SPEECHIO_ASR_ZH00010 | 创业内幕 | Podcast | Startup & Entrepreneur & Product & Investment | 4.2 | ★★☆ |
| | SPEECHIO_ASR_ZH00011 | 罗翔刑法法考 | Online Education | Law & Lawyer Qualification Exams | 3.4 | ★★☆ |
| | SPEECHIO_ASR_ZH00012 | 张雪峰考研 | Online Education | University & Graduate School Entrance Exams | 3.4 | ★★★☆ |
| | SPEECHIO_ASR_ZH00013 | 谷阿莫, 牛叔说电影 | VLog | Movie Cuts | 1.8 | ★★★ |
| | SPEECHIO_ASR_ZH00014 | 贫穷料理, 琼斯爱生活 | VLog | Food & Cooking & Gourmet | 1 | ★★★☆ |
| | SPEECHIO_ASR_ZH00015 | 单田芳 白眉大侠 | Traditional Podcast | Kung Fu Fiction | 2.2 | ★★☆ |
| | SPEECHIO_ASR_ZH00016 | 德云社演出 | Theater Crosstalk Show | Funny Stories | 1 | ★★★ |
| | SPEECHIO_ASR_ZH00017 | 吐槽大会 | Standup Comedy | Celebrity Jokes | 1.8 | ★★☆ |
| | SPEECHIO_ASR_ZH00018 | 小猪佩奇, 熊出没 | Children Cartoon | Fairy Tale | 0.9 | ★☆ |
| | SPEECHIO_ASR_ZH00019 | CCTV5 NBA 转播 | Sports Game Live | NBA Game | 0.7 | ★★★ |
| | SPEECHIO_ASR_ZH00020 | 篮球人物 | Documentary | NBA Super Stars' Life & History | 2.2 | ★★ |
| | SPEECHIO_ASR_ZH00021 | 汽车之家评测 | VLog | Car benchmarks, Road driving test | 1.7 | ★★★☆ |
| | SPEECHIO_ASR_ZH00022 | 小艾大叔 豪宅带看 | VLog | Real estate, Mansion tour | 1.7 | ★★★ |
| | SPEECHIO_ASR_ZH00023 | 无聊开箱, Zealer评测 | VLog | Unboxing | 2 | ★★★ |
| | SPEECHIO_ASR_ZH00024 | 付老师种植技术 | VLog | Agriculture, Planting | 2.7 | ★★★☆ |
| | SPEECHIO_ASR_ZH00025 | 石国鹏讲历史 | Offline lecture | History, Greek philosophy | 1.3 | ★★☆ |
| | SPEECHIO_ASR_ZH00026 | 张震鬼故事 | Broadcasting Program | Horror Stories | 2.4 | ★★★ |
| | SPEECHIO_ASR_ZH00027 | 华语辩论世界杯 | Debates Contest | Hobby, Skill, Growth | 1.4 | ★★★ |
| | SPEECHIO_ASR_ZH00028 | 时政现场同传 | Simultaneous Translation | News & Events on Public Governance | 2.1 | ★★★☆ |
| | SPEECHIO_ASR_ZH00029 | 港台明星访谈 (周杰伦, 曾志伟, 张家辉, 陈小春, 周星驰) | HongKong/Taiwan Accents | Entertainment, Acting, Music | 1.5 | ★★★☆ |
| | SPEECHIO_ASR_ZH00030 | 世界青年说 | Foreigner Accents | Cultural Difference | 2 | ★★★☆ |

How to get a test set

To download an unlocked test set from the cloud to your local directory leaderboard/datasets/<TEST_SET_ID>:

ops/pull dataset <TEST_SET_ID>
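When several test sets are needed, the command above can be scripted. The following is a minimal sketch that only wraps the `ops/pull dataset <TEST_SET_ID>` invocation shown above. The helper names and the `dry_run` flag are hypothetical, and it assumes the script is run from the repository root.

```python
from __future__ import annotations

import subprocess


def pull_dataset_cmd(test_set_id: str) -> list[str]:
    """Build the argv for `ops/pull dataset <TEST_SET_ID>`."""
    return ["ops/pull", "dataset", test_set_id]


def pull_datasets(test_set_ids: list[str], dry_run: bool = True) -> list[list[str]]:
    """Pull each test set in turn; with dry_run=True only return the commands."""
    cmds = [pull_dataset_cmd(t) for t in test_set_ids]
    if not dry_run:
        for cmd in cmds:
            # Assumption: the current working directory is the repository root.
            subprocess.run(cmd, check=True)
    return cmds
```

Calling `pull_datasets(["AISHELL1_TEST", "SPEECHIO_ASR_ZH00000"])` returns the two commands without running them; pass `dry_run=False` to execute them.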

3. Model Zoo

There are two types of models in the model zoo: cloud API models and pretrained models.

ZH models

Cloud API Models (ZH)

| UNLOCKED | MODEL_ID | TYPE | PROVIDER | DESCRIPTION | URL |
|---|---|---|---|---|---|
| | aispeech_api_zh | Cloud API | AISpeech | AISpeech Open Platform | official link |
| | aliyun_api_zh | Cloud API | Alibaba | Alibaba Cloud - short sentence recognition | official link |
| | aliyun_ftasr_api_zh | Cloud API | Alibaba | Alibaba Cloud - file transcription (non-streaming) | official link |
| | baidu_pro_api_zh | Cloud API | Baidu | Baidu AI Cloud (fast edition) | official link |
| | iflytek_lfasr_api_zh | Cloud API | IFlyTek | IFlyTek Open Platform (transcription, non-streaming) | official link |
| | microsoft_sdk_zh | Cloud API | Microsoft | Azure | official link |
| | tencent_api_zh | Cloud API | Tencent | Tencent Cloud | official link |
| | yitu_api_zh | Cloud API | YituTech | Yitu Speech Open Platform | official link |

Pretrained Models (ZH)

| UNLOCKED | MODEL_ID | TYPE | AUTHOR | DESCRIPTION |
|---|---|---|---|---|
| | speechio_kaldi_multicn | Pretrained | Xingyu NA (那兴宇) | Kaldi multi_cn recipe |
| | wenet_multi_cn | Pretrained | Binbin Zhang (张彬彬) @wenet-e2e | WeNet multi_cn recipe |
| | vosk_model_cn | Pretrained | alphacephei | Chinese engine of Vosk |
| | wenet_wenetspeech | Pretrained | Binbin Zhang (张彬彬) @wenet-e2e | WeNet wenetspeech recipe |

EN models

Cloud API Models (EN)

| UNLOCKED | MODEL_ID | TYPE | PROVIDER | DESCRIPTION | URL |
|---|---|---|---|---|---|
| | aliyun_api_en | Cloud API | Alibaba | Alibaba Cloud - short sentence recognition | official link |
| | amazon_api_en | Cloud API | Amazon | Amazon Web Services | official link |
| | baidu_api_en | Cloud API | Baidu | Baidu AI Cloud | official link |
| | google_api_en | Cloud API | Google | Google Cloud | official link |
| | microsoft_sdk_en | Cloud API | Microsoft | Azure | official link |
| | tencent_api_en | Cloud API | Tencent | Tencent Cloud | official link |

Pretrained Models (EN)

| UNLOCKED | MODEL_ID | TYPE | AUTHOR | DESCRIPTION |
|---|---|---|---|---|
| | vosk_model_en | Pretrained | alphacephei | English engine of Vosk |
| | vosk_model_en_large | Pretrained | alphacephei | Large English engine of Vosk |
| | deepspeech_model_en | Pretrained | deepspeech | Latest English ASR Model of deepspeech |
| | coqui_model_en | Pretrained | coqui | English engine of coqui |
| | NeMo_conformer_en | Pretrained | NeMo | English engine of NeMo_conformer |

How to get a model

  • Cloud API models are stored in this GitHub repo under Leaderboard/models/*
  • Pretrained models are stored in the cloud. To download an unlocked model to your local directory (i.e. Leaderboard/models/<MODEL_ID>):
ops/pull model <MODEL_ID>
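Because cloud API models already live in the repo while pretrained models must be pulled first, a small check can tell the two cases apart. This is only a sketch: `model_dir` and `needs_pull` are hypothetical helpers, and the only layout assumed is the `models/<MODEL_ID>` directory described above.

```python
from pathlib import Path


def model_dir(repo_root: str, model_id: str) -> Path:
    """Local directory where a model is expected: <repo_root>/models/<MODEL_ID>."""
    return Path(repo_root) / "models" / model_id


def needs_pull(repo_root: str, model_id: str) -> bool:
    """True if the model directory is missing, i.e. `ops/pull model` has not been run yet."""
    return not model_dir(repo_root, model_id).is_dir()
```

A script could call `needs_pull("Leaderboard", "wenet_wenetspeech")` and run `ops/pull model wenet_wenetspeech` only when it returns `True`.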

4. Benchmarking Pipeline

Follow this specification to submit your model and get it benchmarked over all test sets.

With downloaded models & test sets, you can trigger the benchmarking pipeline locally:

ops/leaderboard_runner requests/request.yaml

Here request.yaml specifies a <MODEL_ID> and a list of <TEST_SET_ID>s to be tested (see the examples in the specification above).
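As an illustration of what such a request might contain, the sketch below renders a minimal request file from a model ID and a list of test set IDs. The key names `model` and `test_set` are assumptions made for this example, not confirmed by this README; the specification defines the actual schema and any further required fields.

```python
from __future__ import annotations


def render_request(model_id: str, test_set_ids: list[str]) -> str:
    """Render a minimal request as YAML text.

    NOTE: the keys `model` and `test_set` are hypothetical; check the
    submission specification for the real field names.
    """
    lines = [f"model: {model_id}", "test_set:"]
    lines += [f"  - {t}" for t in test_set_ids]
    return "\n".join(lines) + "\n"
```

The returned text could be written to requests/request.yaml and then passed to `ops/leaderboard_runner` as shown above.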


5. Ranking

Ranking on unlocked test sets only

| Rank | Model | CER | Submission date |
|---|---|---|---|
| 1 | yitu_api_zh | 2.85% | 2022.05 |
| 2 | aliyun_api | 3.03% | 2022.05 |
| 3 | microsoft_sdk_zh | 3.04% | 2022.05 |
| 4 | bilibili_api_zh | 3.09% | 2022.06 |
| 5 | aispeech_api_zh | 3.39% | 2022.05 |
| 6 | tencent_api_zh | 3.56% | 2022.05 |
| 7 | iflytek_lfasr_api_zh | 3.69% | 2022.05 |
| 8 | baidu_pro_api_zh | 6.64% | 2022.05 |

Ranking on all SpeechIO test sets

| Rank | Model | CER | Submission date |
|---|---|---|---|
| 1 | yitu_api_zh | 3.10% | 2022.05 |
| 2 | bilibili_api_zh | 3.46% | 2022.06 |
| 3 | microsoft_sdk_zh | 3.47% | 2022.05 |
| 4 | aispeech_api_zh | 3.63% | 2022.05 |
| 5 | aliyun_api | 3.81% | 2022.05 |
| 6 | iflytek_lfasr_api_zh | 4.05% | 2022.05 |
| 7 | tencent_api_zh | 4.06% | 2022.05 |
| 8 | baidu_pro_api_zh | 7.38% | 2022.05 |
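The two rankings differ because the locked test sets change each model's overall CER. As a small worked example, the sketch below recomputes the ranks from the CER figures in the two tables above and reports how far each model moves. The helper names are illustrative, and nothing here reflects how the leaderboard itself aggregates CER across test sets.

```python
from __future__ import annotations

# CER (%) copied from the two ranking tables above.
UNLOCKED = {
    "yitu_api_zh": 2.85, "aliyun_api": 3.03, "microsoft_sdk_zh": 3.04,
    "bilibili_api_zh": 3.09, "aispeech_api_zh": 3.39, "tencent_api_zh": 3.56,
    "iflytek_lfasr_api_zh": 3.69, "baidu_pro_api_zh": 6.64,
}
ALL_SETS = {
    "yitu_api_zh": 3.10, "bilibili_api_zh": 3.46, "microsoft_sdk_zh": 3.47,
    "aispeech_api_zh": 3.63, "aliyun_api": 3.81, "iflytek_lfasr_api_zh": 4.05,
    "tencent_api_zh": 4.06, "baidu_pro_api_zh": 7.38,
}


def ranks(cer_by_model: dict[str, float]) -> dict[str, int]:
    """Rank models by ascending CER (1 = lowest error rate)."""
    ordered = sorted(cer_by_model, key=cer_by_model.get)
    return {model: i for i, model in enumerate(ordered, start=1)}


def rank_shift(before: dict[str, float], after: dict[str, float]) -> dict[str, int]:
    """Change in rank per model; positive means the model drops in the ranking."""
    rb, ra = ranks(before), ranks(after)
    return {model: ra[model] - rb[model] for model in rb}
```

With these figures, `rank_shift(UNLOCKED, ALL_SETS)` shows aliyun_api dropping three places (2nd to 5th) and bilibili_api_zh gaining two (4th to 2nd), while yitu_api_zh stays first.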

6. Latest Leaderboard Report

[Figure: latest leaderboard report]


Contacts

Email: [email protected]
