SpeechIO Leaderboard: a large, robust, comprehensive, benchmarking platform for Automatic Speech Recognition.

lijp22/Leaderboard

SpeechIO ASR leaderboard

1. Overview

"If you can’t measure it, you can’t improve it." -- by Peter Drucker

In the domain of Automatic Speech Recognition (ASR), people claim state-of-the-art (SOTA) results in research papers and in industrial PR articles. These claims are problematic because:

  • For industry, there is no objective and quantitative benchmark of how commercial APIs perform in real-life scenarios, at least not in the public domain.
  • For academia, it is becoming harder to compare ASR models because of the fragmentation of deep learning frameworks and speech toolkits.
  • It is unclear how academic SOTA and industrial SOTA relate to each other.

[Figure: Overview]

As the figure above shows, the SpeechIO leaderboard serves as an ASR benchmarking platform by providing three components:

  1. TestSet Zoo:
     • SpeechIO test sets: carefully curated by the SpeechIO authors, crawled from publicly available sources (YouTube, TV programs, podcasts, etc.), covering various common acoustic scenarios (AM) and content domains (LM & vocabulary) that are familiar to the public, and labeled with great care by professional annotators.
     • Open-sourced test sets: collected from all sorts of open-sourced datasets.
     • Open to people who are willing to contribute their own test sets.
  2. Model Zoo:
     • Aggregates commercial ASR APIs (e.g. Google, AWS, Baidu, Alibaba, Tencent, iFlyTek, etc.).
     • Incorporates well-known open-sourced models.
     • Open to people who are willing to share or publish their ASR models.
  3. An open benchmarking pipeline:
     • Defines a simplest-possible contract on the format of test sets, recognition results, etc.
     • Defines a simplest-possible benchmarking interface for model submitters.
     • Provides a fully automated pipeline that performs prepare -> recognize -> process -> evaluate, so leaderboard users don't need to write code.

With the SpeechIO leaderboard, anyone can benchmark, reproduce, and compare performance for arbitrary combinations of test sets from the TestSet Zoo and models from the Model Zoo, simply by filling out a request form (see the example form).
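To make the prepare -> recognize -> process -> evaluate flow concrete, here is a minimal sketch in Python. It is not the leaderboard's actual implementation. The data contract (a test set as a list of `(utterance id, audio path, reference transcript)` tuples, a model as a callable from audio path to text), every function name, and the choice of character error rate (CER) as the metric are assumptions made for illustration only.

```python
import unicodedata
from itertools import product
from typing import Callable, Dict, List, Tuple

# Hypothetical data contract (an assumption, not the leaderboard's format):
# a test set is a list of (utterance id, audio path, reference transcript)
# tuples, and a model is any callable mapping an audio path to text.
Utterance = Tuple[str, str, str]
Model = Callable[[str], str]


def prepare(test_set: List[Utterance]) -> Tuple[Dict[str, str], Dict[str, str]]:
    """prepare: split a test set into {id: audio path} and {id: reference}."""
    audio = {utt_id: path for utt_id, path, _ in test_set}
    refs = {utt_id: text for utt_id, _, text in test_set}
    return audio, refs


def recognize(model: Model, audio: Dict[str, str]) -> Dict[str, str]:
    """recognize: run the model on every audio file."""
    return {utt_id: model(path) for utt_id, path in audio.items()}


def process(text: str) -> str:
    """process: drop whitespace and punctuation, uppercase Latin letters."""
    kept = (ch for ch in text
            if not ch.isspace() and not unicodedata.category(ch).startswith("P"))
    return "".join(kept).upper()


def edit_distance(ref: str, hyp: str) -> int:
    """Levenshtein distance between two strings, one character per token."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            curr.append(min(prev[j] + 1,              # deletion
                            curr[j - 1] + 1,          # insertion
                            prev[j - 1] + (r != h)))  # substitution or match
        prev = curr
    return prev[-1]


def evaluate(refs: Dict[str, str], hyps: Dict[str, str]) -> float:
    """evaluate: corpus-level CER = edit errors / reference characters."""
    errors = total = 0
    for utt_id, ref in refs.items():
        ref_norm = process(ref)
        errors += edit_distance(ref_norm, process(hyps.get(utt_id, "")))
        total += len(ref_norm)
    return errors / total if total else 0.0


def benchmark(model: Model, test_set: List[Utterance]) -> float:
    """Run all four stages for one (model, test set) pair."""
    audio, refs = prepare(test_set)
    return evaluate(refs, recognize(model, audio))


def benchmark_all(models: Dict[str, Model],
                  test_sets: Dict[str, List[Utterance]]) -> Dict[Tuple[str, str], float]:
    """Benchmark every combination of model and test set."""
    return {(m, t): benchmark(models[m], test_sets[t])
            for m, t in product(models, test_sets)}
```

Calling `benchmark_all` with a dictionary of models and a dictionary of test sets returns one score per (model ID, test set ID) pair, which mirrors the "arbitrary combinations" idea described above. Counting errors per character rather than per word is a common choice for Mandarin, where text is not whitespace-delimited.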


2. TestSet Zoo

Test Sets (ZH)

| ID | Name | Scenario | Topic Domain | Hours | Difficulty (1-5) |
|---|---|---|---|---|---|
| SPEECHIO_ASR_ZH00000 | For leaderboard submitter debugging | Video conference & forum speech | Economy, currency, finance | 1.0 | ★★☆ |
| SPEECHIO_ASR_ZH00001 | 新闻联播 | TV News | News & politics | 9 | |
| SPEECHIO_ASR_ZH00002 | 鲁豫有约 | TV interview | Celebrity & film & music & daily | 3 | ★★☆ |
| SPEECHIO_ASR_ZH00003 | 天下足球 | TV program | Sports & Football & World Cup | 2.7 | ★★☆ |
| SPEECHIO_ASR_ZH00004 | 罗振宇跨年演讲 | Stadium Public Speech | Society & Culture & Business Trend | 2.7 | ★★ |
| SPEECHIO_ASR_ZH00005 | 李永乐老师在线讲堂 | Online Education | Popular Science | 4.4 | ★★★ |
| SPEECHIO_ASR_ZH00006 | 张大仙 & 骚白 王者荣耀直播 | Live Broadcasting | Game | 1.6 | ★★★☆ |
| SPEECHIO_ASR_ZH00007 | 李佳琪 & 薇娅 直播带货 | Live Broadcasting | Makeup & Online shopping/advertising | 0.9 | ★★★★☆ |
| SPEECHIO_ASR_ZH00008 | 老罗语录 | Offline lecture | Life & Purpose & Ethics | 1.3 | ★★★★☆ |
| SPEECHIO_ASR_ZH00009 | 故事FM | Podcast | Ordinary Life Story Telling | 4.5 | ★★☆ |
| SPEECHIO_ASR_ZH00010 | 创业内幕 | Podcast | Startup & Entrepreneur & Product & Investment | 4.2 | ★★☆ |
| SPEECHIO_ASR_ZH00011 | 罗翔 刑法法考培训讲座 | Online Education | Law & Lawyer Qualification Exams | 3.4 | ★★☆ |
| SPEECHIO_ASR_ZH00012 | 张雪峰 考研线上小讲堂 | Online Education | University & Graduate School Entrance Exams | 3.4 | ★★★☆ |
| SPEECHIO_ASR_ZH00013 | 谷阿莫&牛叔说电影 | VLog | Movie Cuts | 1.8 | ★★★ |
| SPEECHIO_ASR_ZH00014 | 贫穷料理 & 琼斯爱生活 | VLog | Food & Cooking & Gourmet | 1 | ★★★☆ |
| SPEECHIO_ASR_ZH00015 | 单田芳 白眉大侠 | Traditional Podcast | Kung Fu Fiction | 2.2 | ★★☆ |
| SPEECHIO_ASR_ZH00016 | 德云社相声演出 | Theater Crosstalk Show | Funny Stories | 1 | ★★★ |
| SPEECHIO_ASR_ZH00017 | 吐槽大会 | Standup Comedy | Celebrity Jokes | 1.8 | ★★☆ |
| SPEECHIO_ASR_ZH00018 | 小猪佩奇 & 熊出没 | Children Cartoon | Fairy Tale | 0.9 | ★☆ |
| SPEECHIO_ASR_ZH00019 | CCTV5 NBA 比赛转播 | Sports Game Live | NBA Game | 0.7 | ★★★ |
| SPEECHIO_ASR_ZH00020 | 篮球人物 | Documentary | NBA Super Stars' Life & History | 2.2 | ★★ |
| SPEECHIO_ASR_ZH00021 | 汽车之家 车辆评测 | VLog | Car benchmarks, Road driving test | 1.7 | ★★★☆ |
| SPEECHIO_ASR_ZH00022 | 小艾大叔 豪宅带看 | VLog | Real estate, Mansion tour | 1.7 | ★★★ |
| SPEECHIO_ASR_ZH00023 | 无聊开箱 & Zealer评测 | VLog | Unboxing | 2 | ★★★ |
| SPEECHIO_ASR_ZH00024 | 付老师种植技术 | VLog | Agriculture, Planting | 2.7 | ★★★☆ |
| SPEECHIO_ASR_ZH00025 | 石国鹏讲古希腊哲学 | Offline lecture | History, Greek philosophy | 1.3 | ★★☆ |
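The difficulty ratings in the table above can be turned into numbers when choosing test sets for a benchmark request. The sketch below assumes that a hollow star (☆) counts as half a point, which the table does not state explicitly. The function names are hypothetical, and the rows are copied from a few entries of the table.

```python
from typing import List, Tuple


def stars_to_score(stars: str) -> float:
    """Convert a rating such as '★★★☆' to a number (assumes ☆ = 0.5)."""
    if set(stars) - {"★", "☆"}:
        raise ValueError(f"unexpected character in rating: {stars!r}")
    return stars.count("★") + 0.5 * stars.count("☆")


# (test set ID, hours, difficulty) copied from a few rows of the table
ROWS: List[Tuple[str, float, str]] = [
    ("SPEECHIO_ASR_ZH00004", 2.7, "★★"),
    ("SPEECHIO_ASR_ZH00007", 0.9, "★★★★☆"),
    ("SPEECHIO_ASR_ZH00008", 1.3, "★★★★☆"),
    ("SPEECHIO_ASR_ZH00018", 0.9, "★☆"),
]


def select_by_difficulty(rows: List[Tuple[str, float, str]],
                         minimum: float) -> List[str]:
    """Return the IDs of test sets rated at or above `minimum`."""
    return [test_id for test_id, _hours, stars in rows
            if stars_to_score(stars) >= minimum]
```

For example, `select_by_difficulty(ROWS, 4.0)` keeps only the two live-commerce and offline-lecture sets from the rows listed here.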


3. Model Zoo

Commercial Models (ZH)

| MODEL_ID | Type | Model author/owner | Description | URL |
|---|---|---|---|---|
| aispeech_api | Cloud API | AISpeech (思必驰) | AISpeech Open Platform | https://cloud.aispeech.com |
| aliyun_api | Cloud API | Alibaba (阿里巴巴) | Alibaba Cloud | https://ai.aliyun.com/nls/asr |
| baidu_pro_api | Cloud API | Baidu (百度) | Baidu AI Cloud (express edition) | https://cloud.baidu.com/product/speech/asr |
| | Cloud API | IFlyTek (讯飞) | iFlyTek Open Platform (voice dictation service) | https://www.xfyun.cn/services/voicedictation |
| microsoft_api | Cloud API | Microsoft (微软) | Azure | https://azure.microsoft.com/zh-cn/services/cognitive-services/speech-services/ |
| sogou_api | Cloud API | Sogou (搜狗) | AI Open Platform | https://ai.sogou.com/product/one_recognition/ |
| tencent_api | Cloud API | Tencent (腾讯) | Tencent Cloud | https://cloud.tencent.com/product/asr |
| yitu_api | Cloud API | YituTech (依图) | Yitu Speech Open Platform | https://speech.yitutech.com |

Open-Sourced Models (ZH)

| MODEL_ID | Type | Model author/owner | Description | URL |
|---|---|---|---|---|
| speechio_kaldi_multicn | Pretrained ASR model | Xingyu NA (那兴宇) | Kaldi pretrained ASR, based on Kaldi recipe | https://github.com/kaldi-asr/kaldi/tree/master/egs/multi_cn/s5 |


4. Benchmarking Pipeline

How do you submit your own model and get it benchmarked?

Follow the submission guidelines in HOW_TO_SUBMIT.md.


5. Latest Leaderboard Report

[Figure: result]

