- Install Java (version 11 is recommended)
- Install Python (3.11)
- Create a venv and install the required packages:
pip install apache-flink
pip install silero-vad
pip install soundfile
(soundfile is needed for audio I/O, see other options)
- depending on the system, also install pywhispercpp:
- for macOS:
WHISPER_COREML=1 pip install git+https://github.com/absadiki/pywhispercpp
- for Nvidia GPU:
GGML_CUDA=1 pip install git+https://github.com/absadiki/pywhispercpp
- for mp3 testing, also install pydub:
pip install pydub
ffmpeg must be installed as well (the installation method depends on the OS)
- Then run
python3 main.py
or submit the job to the cluster (see the instruction)
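Before running main.py, the prerequisites from the steps above can be verified with a short script. This is a minimal sketch, not part of the project: the function names are hypothetical, and the module names are the import names that the listed pip packages are assumed to provide (apache-flink installs `pyflink`, silero-vad installs `silero_vad`).

```python
import importlib.util
import shutil
import sys

# Import names assumed for the pip packages listed in the setup steps.
REQUIRED_MODULES = ["pyflink", "silero_vad", "soundfile", "pywhispercpp"]
OPTIONAL_MODULES = ["pydub"]   # only needed for mp3 testing
REQUIRED_TOOLS = ["java"]      # Java must be on PATH
OPTIONAL_TOOLS = ["ffmpeg"]    # needed alongside pydub for mp3 testing


def missing_modules(names):
    """Return the module names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


def missing_tools(names):
    """Return the executables that are not found on PATH."""
    return [name for name in names if shutil.which(name) is None]


def check_environment():
    """Return a list of problem descriptions; an empty list means all checks passed."""
    problems = []
    if sys.version_info[:2] != (3, 11):
        found = f"{sys.version_info.major}.{sys.version_info.minor}"
        problems.append(f"Python 3.11 expected, found {found}")
    for name in missing_modules(REQUIRED_MODULES):
        problems.append(f"required module not installed: {name}")
    for name in missing_modules(OPTIONAL_MODULES):
        problems.append(f"optional module not installed: {name}")
    for name in missing_tools(REQUIRED_TOOLS):
        problems.append(f"required tool not on PATH: {name}")
    for name in missing_tools(OPTIONAL_TOOLS):
        problems.append(f"optional tool not on PATH: {name}")
    return problems


if __name__ == "__main__":
    for problem in check_environment():
        print(problem)
```

The script only checks that Java is on PATH, not which Java version is installed.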
To submit a job on a cluster, read this.
- Of the Python bindings, only pywhispercpp supports GPU
- This binding can be built like this
- For CoreML, this model should be downloaded and moved to
/${USER_HOME}/Library/Application Support/pywhispercpp/models
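Moving the downloaded CoreML model into place can be scripted. The sketch below assumes that `${USER_HOME}` refers to the current user's home directory. The helper names `coreml_models_dir` and `install_model` are hypothetical and not part of pywhispercpp.

```python
import shutil
from pathlib import Path


def coreml_models_dir(home=None):
    """Return <home>/Library/Application Support/pywhispercpp/models.

    `home` defaults to the current user's home directory (assumption:
    this is what ${USER_HOME} stands for in the path above).
    """
    base = Path(home) if home is not None else Path.home()
    return base / "Library" / "Application Support" / "pywhispercpp" / "models"


def install_model(downloaded, home=None):
    """Move a downloaded model file or directory into the models directory."""
    src = Path(downloaded)
    if not src.exists():
        raise FileNotFoundError(src)
    target_dir = coreml_models_dir(home)
    # Create the directory on first use; keep it if it already exists.
    target_dir.mkdir(parents=True, exist_ok=True)
    destination = target_dir / src.name
    shutil.move(str(src), str(destination))
    return destination
```

Calling `install_model("path/to/downloaded-model")` with the actual download location moves the model and returns its new path. The model keeps its original file name.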
- pyannote.audio
- Silero VAD
- webrtcvad
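Of the libraries listed above, Silero VAD is the one installed in the setup steps. The following is a minimal sketch of detecting speech segments with the `silero_vad` package. It is not taken from main.py, and `detect_speech` and `total_speech_seconds` are hypothetical helper names. The silero-vad import is placed inside the function so that the pure helper can be used without the package installed.

```python
def detect_speech(path):
    """Return speech segments as a list of {'start': s, 'end': s} dicts in seconds."""
    # Lazy import: requires `pip install silero-vad`.
    from silero_vad import get_speech_timestamps, load_silero_vad, read_audio

    model = load_silero_vad()
    wav = read_audio(path)  # loads the audio at 16 kHz by default
    # return_seconds=True reports boundaries in seconds instead of samples.
    return get_speech_timestamps(wav, model, return_seconds=True)


def total_speech_seconds(segments):
    """Sum the durations of the detected speech segments."""
    return sum(segment["end"] - segment["start"] for segment in segments)


if __name__ == "__main__":
    import sys

    segments = detect_speech(sys.argv[1])
    for segment in segments:
        print(f"{segment['start']:.1f}s - {segment['end']:.1f}s")
    print(f"total speech: {total_speech_seconds(segments):.1f}s")
```

Run it with the path to an audio file as the only argument.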