Just A Rather Very Intelligent System, J.A.R.V.I.S. Tony Stark's artificial intelligence assistant: Speech To Text to LLM to Text To Speech, displayed in a web interface.
- 🎤 The user speaks into the microphone
- ⌨️ Voice is converted to text using Deepgram Nova
- 🤖 Text is sent to OpenAI's GPT-3 API to generate a response
- 📢 Response is converted to speech using Deepgram Aura
- 🔊 Speech is played using Pygame
- 💻 Conversation is displayed in a webpage using Taipy
Python 3.11
Pip (tested on v24.0)
Deepgram SDK 3.2.6
OpenAI 1.10
Make sure you have the following API keys:
- Clone the repository
git clone https://github.com/hungqng/JARVIS-voice-virtual-assistant.git
- Install the requirements
pip install -r requirements.txt
- Create a
.env
file in the root directory and add the following variables:
DEEPGRAM_API_KEY=XXX...XXX
OPENAI_API_KEY=sk-XXX...XXX
- Run
display.py
to start the web interface
python display.py
- In another terminal, run
main.py
to start the voice assistant
python main.py
- Once ready, both the web interface and the terminal will show
Listening...
- You can now speak into the microphone
- Once you stop speaking, it will show
Stopped listening
- It will then start processing your request
- Once the response is ready, it will show
Speaking...
- The response will be played and displayed in the web interface.
Here is an example:
Listening...
Done listening
Finished transcribing in 0.28 seconds.
Finished generating a response in 0.85 seconds.
Finished generating audio in 0.23 seconds.
Speaking...
--- USER: Hello, Jarvis.
--- JARVIS: Hello, Mr. Stark. How can I assist you today?
Listening...
...