Stars
Repo associated with the DESED dataset: download and creation of data
Attempt at tracking the state of the art and recent results (bibliography) on speech recognition.
A fast parallel implementation of RNN Transducer.
Notes on Machine Learning on edge for embedded/sensor/IoT uses
Learning associations between human faces and voices
Using LSTMs to exploit any temporal structure in noise.
DRNN with LSTM for monaural source separation
Collection of custom layers and utility functions for Keras which are missing in the main framework.
A PyTorch implementation of Speech Transformer, an End-to-End ASR with Transformer network on Mandarin Chinese.