I often find that my writing speed is constrained by my typing abilities. One solution is to use automatic speech recognition, but it has three main disadvantages. Firstly, its accuracy can be limited. Secondly, it tends to be quite literal, requiring significant time for editing and revising. Lastly, it is difficult to go back and make changes if we change our thoughts midway. To address these issues, I developed this project using OpenAI's Whisper API for transcription, followed by the GPT-4 API to paraphrase the text in a more organized and logical manner by considering the entire content.
To use the application, simply start the development server, open the web app in your browser, and start speaking. The application will capture your voice, transcribe it using OpenAI's Whisper API, and send the transcribed text to ChatGPT for further processing. The corrected and processed text will then be displayed on the screen. Configuration & Environment Setup
Follow these steps to set up the project environment:
- Clone the repository
cd VoiceNoteTaker
- [Optional] Create a virtual environment:
python -m venv venv
- [Optional] Activate the virtual environment:
venv\Scripts\activate
for Windows orsource venv/bin/activate
for Linux/Mac. - Install the required dependencies:
pip install -r requirements.txt
- Set up your OpenAI API key as an environment variable:
set OPENAI_API_KEY=your_api_key_here
for Windows andexport OPENAI_API_KEY=your_api_key_here
for Linux/Mac. - [Optional] If you want to run it as a Telegram bot, follow this tutorial to get a bot API token, and add it to your
.bashrc
or.zshrc
likeexport TELEGRAM_BOT_TOKEN=your_token_here
. - For the standalone website, run the development server:
python main.py
. Open your browser and navigate to http://localhost:5000 to access the web app. For the telegram bot, runpython telegram_bot.py
. And then talk to your registered bot to access the features.
This project is set up to use a development server, which is not suitable for production use. Please ensure that you do not deploy the application with the development server for production purposes. Instead, use a production-ready web server, such as Gunicorn or uWSGI, in conjunction with a reverse proxy like Nginx or Apache.
One click deploy a forked version on Railway: