DeepSpeechNotes is a note-taking app that uses Mozilla's DeepSpeech, the Web Audio API, and Node Voice Activity Detection to transcribe speech into text on the go.
It is my graduation project, coded from scratch in 4 weeks. My main goal was to showcase current open source Speech-To-Text technology.
I wanted to learn something new in terms of technology, so I picked Machine Learning and Speech-To-Text recognition as topics and applied them in practice. The result is DeepSpeechNotes, a note-taking app that transcribes voice in near real time.
- React
- Web Audio API
- @picovoice/web-voice-processor
- Socket.io-client
- @emotion/core and styled
- storybook
- Express
- MongoDB
- DeepSpeech
- Node Voice Activity Detection
- Socket.io
To use DeepSpeechNotes, you must meet the following requirements:
- node.js
- npm
- MongoDB
After moving the repository content to your webspace, run these preconfigured scripts from the repository root directory:
npm run prod-prebuild
npm run prod-build
- Rename .env.example to .env, then edit it to set the desired port and the connection to your MongoDB
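As a rough illustration of how a server could consume those two settings, here is a minimal sketch. The variable names PORT and DB_URL and the fallback port 3000 are assumptions made for this example; the keys the project actually expects are the ones listed in .env.example.

```typescript
// Hypothetical setting names (PORT, DB_URL); check .env.example for the real keys.
interface ServerConfig {
  port: number;
  mongoUrl: string;
}

// Validate the environment once at startup so a bad value fails early.
function loadConfig(env: Record<string, string | undefined>): ServerConfig {
  // Assumed fallback port for this sketch only.
  const rawPort = env.PORT === undefined ? "3000" : env.PORT;
  const port = Number(rawPort);
  const isWholeNumber = isFinite(port) && Math.floor(port) === port;
  if (!isWholeNumber || port <= 0 || port > 65535) {
    throw new Error("Invalid PORT: " + rawPort);
  }

  const mongoUrl = env.DB_URL;
  if (!mongoUrl) {
    throw new Error("DB_URL is not set");
  }

  return { port: port, mongoUrl: mongoUrl };
}
```

In a Node process this would typically be called as `loadConfig(process.env)` after the .env file has been loaded.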
The Express server will handle the following requests:
- https://your-url.com/storybook will route to the storybook build
- All other requests (including https://your-url.com) will route to the React application build ("client/build")
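The routing rule above can be sketched as a small pure function. This is only an illustration of the decision, not the project's actual Express setup; the function name resolveBuildTarget and the BuildTarget type are hypothetical.

```typescript
// "client" stands for the React build in client/build; "storybook" for the storybook build.
type BuildTarget = "storybook" | "client";

// Decide which static build should answer a given request path.
function resolveBuildTarget(requestPath: string): BuildTarget {
  // Match "/storybook" itself and anything beneath it,
  // but not unrelated paths such as "/storybook-old".
  if (requestPath === "/storybook" || requestPath.indexOf("/storybook/") === 0) {
    return "storybook";
  }
  // Everything else, including "/", falls through to the React application.
  return "client";
}
```

Sending every non-storybook path to the React build is what lets client-side routes survive a page reload, since the server answers them with the same application bundle.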
You need a pretrained model for DeepSpeech to work. Please look at this readme to find out how to download the model.
Please have a look at open issues and maybe add your own 💡.
Contributions are greatly appreciated:
- Fork the Project
- Create your Feature Branch (git checkout -b feature/AmazingFeature)
- Commit your Changes (git commit -m 'Add some AmazingFeature')
- Push to the Branch (git push origin feature/AmazingFeature)
- Open a Pull Request
Distributed under the MIT License. See LICENSE for more information.
Marc Haupt - Twitter: @Marc_Haupt - GitHub: hauptdigital
Project Link: https://github.com/hauptdigital/deepspeech-notes