forked from 78/xiaozhi
Commit
Showing 5 changed files with 394 additions and 128 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,130 +1,6 @@ | ||
# Logs | ||
logs | ||
*.log | ||
npm-debug.log* | ||
yarn-debug.log* | ||
yarn-error.log* | ||
lerna-debug.log* | ||
.pnpm-debug.log* | ||
|
||
# Diagnostic reports (https://nodejs.org/api/report.html) | ||
report.[0-9]*.[0-9]*.[0-9]*.[0-9]*.json | ||
|
||
# Runtime data | ||
pids | ||
*.pid | ||
*.seed | ||
*.pid.lock | ||
|
||
# Directory for instrumented libs generated by jscoverage/JSCover | ||
lib-cov | ||
|
||
# Coverage directory used by tools like istanbul | ||
coverage | ||
*.lcov | ||
|
||
# nyc test coverage | ||
.nyc_output | ||
|
||
# Grunt intermediate storage (https://gruntjs.com/creating-plugins#storing-task-files) | ||
.grunt | ||
|
||
# Bower dependency directory (https://bower.io/) | ||
bower_components | ||
|
||
# node-waf configuration | ||
.lock-wscript | ||
|
||
# Compiled binary addons (https://nodejs.org/api/addons.html) | ||
build/Release | ||
|
||
# Dependency directories | ||
node_modules/ | ||
jspm_packages/ | ||
|
||
# Snowpack dependency directory (https://snowpack.dev/) | ||
web_modules/ | ||
|
||
# TypeScript cache | ||
*.tsbuildinfo | ||
|
||
# Optional npm cache directory | ||
.npm | ||
|
||
# Optional eslint cache | ||
.eslintcache | ||
|
||
# Optional stylelint cache | ||
.stylelintcache | ||
|
||
# Microbundle cache | ||
.rpt2_cache/ | ||
.rts2_cache_cjs/ | ||
.rts2_cache_es/ | ||
.rts2_cache_umd/ | ||
|
||
# Optional REPL history | ||
.node_repl_history | ||
|
||
# Output of 'npm pack' | ||
*.tgz | ||
|
||
# Yarn Integrity file | ||
.yarn-integrity | ||
|
||
# dotenv environment variable files | ||
tmp/ | ||
.env | ||
.env.development.local | ||
.env.test.local | ||
.env.production.local | ||
.env.local | ||
|
||
# parcel-bundler cache (https://parceljs.org/) | ||
.cache | ||
.parcel-cache | ||
|
||
# Next.js build output | ||
.next | ||
out | ||
|
||
# Nuxt.js build / generate output | ||
.nuxt | ||
dist | ||
|
||
# Gatsby files | ||
.cache/ | ||
# Comment in the public line in if your project uses Gatsby and not Next.js | ||
# https://nextjs.org/blog/next-9-1#public-directory-support | ||
# public | ||
|
||
# vuepress build output | ||
.vuepress/dist | ||
|
||
# vuepress v2.x temp and cache directory | ||
.temp | ||
.cache | ||
|
||
# Docusaurus cache and generated files | ||
.docusaurus | ||
|
||
# Serverless directories | ||
.serverless/ | ||
|
||
# FuseBox cache | ||
.fusebox/ | ||
|
||
# DynamoDB Local files | ||
.dynamodb/ | ||
|
||
# TernJS port file | ||
.tern-port | ||
|
||
# Stores VSCode versions used for testing VSCode extensions | ||
.vscode-test | ||
|
||
# yarn v2 | ||
.yarn/cache | ||
.yarn/unplugged | ||
.yarn/build-state.yml | ||
.yarn/install-state.gz | ||
.pnp.* | ||
package-lock.json | ||
TODO.md | ||
*.pyc |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,79 @@ | ||
# ASR Task Client | ||
|
||
This service receives speech recognition tasks and sends the recognition results back to the caller. | ||
|
||
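It can run in environments without a public IP address: it connects to the task dispatch server over WebSocket, which makes it easy to scale out quickly. | ||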
|
||
|
||
### Open-source models used: | ||
|
||
VAD: <a href="https://modelscope.cn/models/iic/speech_fsmn_vad_zh-cn-16k-common-pytorch">speech_fsmn_vad_zh-cn-16k-common-pytorch</a> | ||
|
||
ASR: <a href="https://modelscope.cn/models/iic/SenseVoiceSmall">SenseVoiceSmall</a> | ||
|
||
Embedding: <a href="https://modelscope.cn/models/iic/speech_eres2netv2w24s4ep4_sv_zh-cn_16k-common">ERes2NetV2_w24s4ep4</a> | ||
|
||
### Performance: | ||
|
||
VAD < 1 ms, ASR < 50 ms, Embedding < 50 ms | ||
|
||
|
||
## Hardware requirements | ||
|
||
GPU memory: 4 GB | ||
|
||
Running on CPU instead makes each request take 5 to 10 times longer. | ||
|
||
## Runtime environment | ||
|
||
The following environment variable must be set: | ||
|
||
ASR_TASK_SERVER_URL=wss:// | ||
|
||
|
||
|
||
```bash | ||
conda create -n xiaozhi python=3.12 | ||
conda activate xiaozhi | ||
|
||
pip install -r requirements.txt | ||
python asr_task_client.py | ||
``` | ||
|
||
## WebSocket protocol | ||
|
||
### Sending requests | ||
|
||
JSON: | ||
|
||
1. detect: start a detection task | ||
|
||
```json | ||
{ "session_id": "xxx", "type": "detect", "words": "小智" } | ||
``` | ||
|
||
2. finish: finish the detection task | ||
|
||
```json | ||
{ "session_id": "xxx", "type": "finish" } | ||
``` | ||
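
The two JSON requests above could be built as shown here. This is only a sketch based on the message shapes listed in this README; the helper names `detect_message` and `finish_message` are hypothetical and do not come from the repository.

```python
import json


def detect_message(session_id: str, words: str) -> str:
    """Build the 'detect' request that starts a detection task."""
    payload = {"session_id": session_id, "type": "detect", "words": words}
    # ensure_ascii=False keeps non-ASCII wake words such as "小智" readable on the wire.
    return json.dumps(payload, ensure_ascii=False)


def finish_message(session_id: str) -> str:
    """Build the 'finish' request that ends a detection task."""
    return json.dumps({"session_id": session_id, "type": "finish"})


if __name__ == "__main__":
    print(detect_message("xxx", "小智"))
    print(finish_message("xxx"))
```

Both helpers return text, so they would presumably be sent as WebSocket text messages, while the audio travels in binary messages.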
|
||
Binary: | ||
|
||
Each binary message contains two parts, the session_id and the PCM audio data, in the following format: | ||
|
||
``` | ||
session_id length, uint32 big-endian | ||
session_id string | ||
pcm length, uint32 big-endian | ||
pcm data | ||
``` | ||
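
The binary layout above can be packed and unpacked with Python's `struct` module. The sketch below makes two assumptions that the README does not state: the session_id is encoded as UTF-8, and both length fields count bytes. The function names are hypothetical.

```python
import struct


def pack_audio_frame(session_id: str, pcm: bytes) -> bytes:
    """Pack one binary message: two length-prefixed fields, big-endian uint32 lengths."""
    sid = session_id.encode("utf-8")  # assumption: session_id is UTF-8 encoded
    return struct.pack(">I", len(sid)) + sid + struct.pack(">I", len(pcm)) + pcm


def unpack_audio_frame(frame: bytes) -> tuple[str, bytes]:
    """Inverse of pack_audio_frame; raises ValueError on a truncated frame."""
    try:
        (sid_len,) = struct.unpack_from(">I", frame, 0)
        offset = 4
        sid_bytes = frame[offset:offset + sid_len]
        offset += sid_len
        (pcm_len,) = struct.unpack_from(">I", frame, offset)
        offset += 4
    except struct.error as exc:
        raise ValueError("truncated frame header") from exc
    pcm = frame[offset:offset + pcm_len]
    if len(sid_bytes) != sid_len or len(pcm) != pcm_len:
        raise ValueError("truncated frame body")
    return sid_bytes.decode("utf-8"), pcm


if __name__ == "__main__":
    frame = pack_audio_frame("xxx", b"\x00\x01" * 160)
    print(unpack_audio_frame(frame)[0], len(frame))
```

Prefixing each field with its length lets a receiver split the session_id from the audio without any delimiter, which matters because raw PCM can contain arbitrary byte values.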
|
||
### Receiving results | ||
|
||
JSON: | ||
|
||
```json | ||
{ "session_id": "xxx", "type": "reply", "content": "text content", "embedding": "audio embedding vector", "url": "audio download URL" } | ||
``` | ||
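
A caller might decode the reply along these lines. This is a sketch only: the `Reply` class and `parse_reply` function are hypothetical, and the `embedding` value is passed through untouched because the README does not specify how the vector is serialized.

```python
import json
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class Reply:
    session_id: str
    content: str       # recognized text
    embedding: Any     # serialization not specified in the README; kept as received
    url: Optional[str] # audio download URL


def parse_reply(text: str) -> Optional[Reply]:
    """Return a Reply for messages of type 'reply', or None for any other type."""
    msg = json.loads(text)
    if msg.get("type") != "reply":
        return None
    return Reply(
        session_id=msg["session_id"],
        content=msg.get("content", ""),
        embedding=msg.get("embedding"),
        url=msg.get("url"),
    )


if __name__ == "__main__":
    raw = '{"session_id": "xxx", "type": "reply", "content": "hello", "embedding": "...", "url": "..."}'
    print(parse_reply(raw))
```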
|