Stars
A list of pain recognition databases that are publicly available for research
Python interface to the WebRTC Voice Activity Detector
Sequence modeling benchmarks and temporal convolutional networks
Unsupervised Speech Decomposition Via Triple Information Bottleneck
COVID19 P2P Risk Prediction Model & Dataset
A curated list of awesome Speaker Diarization papers, libraries, datasets, and other resources.
Clone a voice in 5 seconds to generate arbitrary speech in real-time
PyTorch implementation of "Generalized End-to-End Loss for Speaker Verification" by Wan, Li et al.
Mellotron: a multispeaker voice synthesis model based on Tacotron 2 GST that can make a voice emote and sing without emotive or singing training data
PyTorch toolbox for matrix product state models
This library provides common speech features for ASR including MFCCs and filterbank energies.
Python script to download hundreds of images from Google Images. The code is ready to run.