emp7eror/deep-talk

 
 


Deep-talk

A DeepSpeech React app for testing trained models, visualizing the speech-to-text process, recording microphone audio to WAV with the Web Audio API, or creating and using a custom open speech-to-text API.

Live Demo.


Clone it

git clone https://github.com/buddyeorl/deep-talk.git

Download the Model and Scorer

This app needs two files to work: the acoustic model, deepspeech-0.8.2-models.pbmm,

and the scorer, deepspeech-0.8.2-models.scorer.

Download both files to the /server/index/ directory.
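Before starting the server, it can help to confirm that both files are where the app expects them. The following is a minimal Node.js sketch, not part of the repo: the helper name `findMissingModelFiles` is hypothetical, and only the two file names and the target directory come from the instructions above.

```javascript
const fs = require('node:fs');
const path = require('node:path');

// The two file names listed in this README.
const REQUIRED_FILES = [
  'deepspeech-0.8.2-models.pbmm',
  'deepspeech-0.8.2-models.scorer',
];

// Hypothetical helper: returns the names of required files that are
// not present in modelDir (an empty array means everything is in place).
function findMissingModelFiles(modelDir) {
  return REQUIRED_FILES.filter(
    (name) => !fs.existsSync(path.join(modelDir, name))
  );
}
```

Run from the repo root, `findMissingModelFiles(path.join('server', 'index'))` should return an empty array once both downloads have finished.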

Build

In a terminal, from the repo root directory, run npm run build

Server

Then start the server from the terminal: cd server && node server.js

The server defaults to http://localhost:3001

API calls

Send API calls to:

https://deep-talk.azurewebsites.net/api/v1/getVoice

POST requests accept 16 kHz, mono, 16-bit WAV audio files as multipart form data; the field name must be 'audio'.
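As an illustration of the request shape described above, here is a minimal sketch that wraps WAV bytes in multipart form data under the 'audio' field and posts them. It assumes a runtime with global `fetch`, `FormData`, and `Blob` (a browser, or Node.js 18 or later). The helper names `buildVoiceForm` and `requestTranscript` and the file name `recording.wav` are hypothetical; only the URL and the field name come from this README.

```javascript
// Endpoint taken from this README.
const API_URL = 'https://deep-talk.azurewebsites.net/api/v1/getVoice';

// Hypothetical helper: wrap raw WAV bytes (16 kHz, mono, 16-bit)
// in multipart form data under the 'audio' field.
function buildVoiceForm(wavBytes, filename = 'recording.wav') {
  const form = new FormData();
  form.append('audio', new Blob([wavBytes], { type: 'audio/wav' }), filename);
  return form;
}

// Hypothetical helper: POST the form and return the parsed JSON body.
// fetch sets the multipart Content-Type header (with boundary) itself,
// so no headers are passed here.
async function requestTranscript(wavBytes, url = API_URL) {
  const response = await fetch(url, {
    method: 'POST',
    body: buildVoiceForm(wavBytes),
  });
  return response.json();
}
```

To target a locally running server instead, pass its address as the second argument to `requestTranscript`, assuming the local server exposes the same route.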

Sample responses:

No audio file:

{
    "message": "No audio file has been received"
}

No recognition:

{
    "error": "No speech was recognized"
}

Success:

{
    "message": "success",
    "data": "two three"
}
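The three response shapes above can be told apart on the client with a small helper. This is a sketch under the assumption that the samples are representative of every response; the function name `interpretVoiceResponse` and the returned object shape are hypothetical.

```javascript
// Hypothetical helper: map a parsed JSON response body to either
// { ok: true, text } or { ok: false, reason }.
function interpretVoiceResponse(body) {
  // Success carries message "success" plus the transcript in "data".
  if (body && body.message === 'success' && typeof body.data === 'string') {
    return { ok: true, text: body.data };
  }
  // "No recognition" uses an "error" key; "No audio file" uses "message".
  const reason =
    (body && (body.error || body.message)) || 'Unexpected response shape';
  return { ok: false, reason };
}
```

Note that the failure cases use different keys ("message" for a missing file, "error" for no recognized speech), which is why the helper checks both.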

Important audio info

Please note that the app resamples the recorded audio to 16 kHz, mono, 16-bit (the format used when training the model). I might add recording options for other sample rates if requested.
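For readers preparing audio outside the app, the conversion to 16 kHz, mono, 16-bit can be sketched as follows. This is not the app's actual resampler: it uses plain linear interpolation with no low-pass filter, so it may alias on real recordings, and all function names here are hypothetical.

```javascript
const TARGET_RATE = 16000; // 16 kHz, as stated in this README

// Average two channels into one mono channel.
function mixToMono(left, right) {
  const out = new Float32Array(left.length);
  for (let i = 0; i < left.length; i++) out[i] = (left[i] + right[i]) / 2;
  return out;
}

// Resample by linear interpolation between neighbouring input samples.
function resampleLinear(samples, sourceRate, targetRate = TARGET_RATE) {
  if (sourceRate === targetRate) return Float32Array.from(samples);
  const ratio = sourceRate / targetRate;
  const outLength = Math.floor(samples.length / ratio);
  const out = new Float32Array(outLength);
  for (let i = 0; i < outLength; i++) {
    const pos = i * ratio;
    const i0 = Math.floor(pos);
    const i1 = Math.min(i0 + 1, samples.length - 1);
    const frac = pos - i0;
    out[i] = samples[i0] * (1 - frac) + samples[i1] * frac;
  }
  return out;
}

// Convert floats in [-1, 1] to signed 16-bit PCM, clamping out-of-range values.
function floatTo16BitPCM(samples) {
  const out = new Int16Array(samples.length);
  for (let i = 0; i < samples.length; i++) {
    const s = Math.max(-1, Math.min(1, samples[i]));
    out[i] = s < 0 ? s * 0x8000 : s * 0x7fff;
  }
  return out;
}
```

A typical chain for a 48 kHz stereo capture would be `floatTo16BitPCM(resampleLinear(mixToMono(left, right), 48000))`; the resulting samples still need a WAV header before upload.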

Also note that this app recognizes pauses in speech and trims the audio files and speech recognition responses accordingly.
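The README does not say how pauses are detected. One common approach, shown here purely as an assumption, is to measure RMS energy per frame and drop leading and trailing frames below a threshold. The function name `trimSilence`, the 0.01 threshold, and the 320-sample frame (20 ms at 16 kHz) are all hypothetical choices, not values taken from the app.

```javascript
// Hypothetical sketch: trim leading and trailing low-energy frames
// from a Float32Array of samples in [-1, 1].
function trimSilence(samples, { threshold = 0.01, frameSize = 320 } = {}) {
  const frameCount = Math.ceil(samples.length / frameSize);
  let first = -1; // index of first frame at or above the threshold
  let last = -1;  // index of last frame at or above the threshold
  for (let f = 0; f < frameCount; f++) {
    const start = f * frameSize;
    const end = Math.min(start + frameSize, samples.length);
    let sum = 0;
    for (let i = start; i < end; i++) sum += samples[i] * samples[i];
    const rms = Math.sqrt(sum / (end - start));
    if (rms >= threshold) {
      if (first < 0) first = f;
      last = f;
    }
  }
  // Nothing above the threshold: return an empty view.
  if (first < 0) return samples.subarray(0, 0);
  return samples.subarray(
    first * frameSize,
    Math.min((last + 1) * frameSize, samples.length)
  );
}
```

The function returns a view on the original buffer rather than a copy, and it keeps any quiet frames that fall between two loud ones, so pauses inside an utterance are preserved.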

Author

Github Alex Lizarraga

Portfolio www.alexcode.io

Email [email protected]

