Audio Segmentation using Voice Activity Detection

Overview

Welcome to the Audio Segmentation project! This repository contains a Python class, VADChunker, that segments audio data using Voice Activity Detection (VAD). The class accepts arbitrarily fragmented audio bytes and reassembles them into segments that contain complete phrases. This segmentation is particularly useful for preprocessing live audio streams before passing them to a transcriber.

The VADChunker class uses the silero-vad deep-learning model to identify and isolate regions of audio that contain speech. By incorporating this class into your audio processing pipeline, you can improve the efficiency and accuracy of downstream tasks such as speech recognition.
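Under the hood, silero-vad assigns a speech probability to short windows of audio. As a point of reference (and not a description of VADChunker's internals), a minimal sketch of the raw silero-vad model loaded via torch.hub looks roughly like this; the 512-sample window size and 16 kHz rate are assumptions that depend on the silero-vad release in use:

    import torch

    # Load the silero-vad model from torch.hub (downloaded on first use).
    model, utils = torch.hub.load(repo_or_dir='snakers4/silero-vad', model='silero_vad')

    # Score one window of 16 kHz mono float32 audio. Recent silero-vad releases
    # expect fixed-size windows (512 samples at 16 kHz); older releases also
    # accept longer chunks.
    window = torch.zeros(512)                  # placeholder window of silence
    speech_prob = model(window, 16000).item()  # probability that this window contains speech
    print(f"speech probability: {speech_prob:.3f}")

Thresholding these per-window probabilities is what separates speech regions from silence, which is the basis for cutting a continuous stream into phrase-level segments.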

Usage

To integrate the VADChunker class into your project, follow these steps:

  1. Clone the Repository:
     git clone https://github.com/abinthomasonline/vad-chunking.git
  2. Install the Dependencies:
     pip install -r requirements.txt
  3. Import the VADChunker Class:
     from vad_chunker import VADChunker
  4. Instantiate the VADChunker Class:
     vad_chunker = VADChunker()
  5. Process Audio Bytes:
     vad_chunker.input_chunk(audio_bytes)
     segment = vad_chunker.output_chunk(min_audio_len=5)

See example.py for a complete example.
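For a live-stream setting, the loop below is a minimal sketch of how the two calls fit together. The chunk size, the raw 16 kHz 16-bit mono PCM input format, the transcribe() consumer, and the assumption that output_chunk returns segment bytes once enough speech has accumulated (and something falsy otherwise) are all illustrative; check example.py for the actual contract:

    from vad_chunker import VADChunker

    def stream_chunks(path, chunk_size=4096):
        """Yield raw audio bytes in arbitrary-sized pieces, simulating a live stream."""
        with open(path, "rb") as f:
            while chunk := f.read(chunk_size):
                yield chunk

    vad_chunker = VADChunker()
    for audio_bytes in stream_chunks("audio.raw"):           # hypothetical raw PCM file
        vad_chunker.input_chunk(audio_bytes)                  # feed the fragmented bytes
        segment = vad_chunker.output_chunk(min_audio_len=5)   # request a phrase of at least 5 s
        if segment:                                           # assumed falsy until a segment is ready
            transcribe(segment)                               # hypothetical downstream transcriber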

Contributing

If you find any issues or have ideas for improvements, please feel free to contribute!

License

This project is licensed under the MIT License. Feel free to use, modify, and distribute the code as per the terms of the license.

Acknowledgments

silero-vad for the VAD model.

Happy audio segmentation!
