- Kaldi: a speech recognition toolkit with many SOTA models.
- Isolating instruments from stereo music using convolutional neural networks, part 2.
- Sound classification using a CNN: load and normalize sounds with librosa, convert them to a 2D spectrogram image, then apply a CNN on top.
- Speech recognition with deep learning: how to convert sounds to vectors and feed them into an RNN.
- (Great) Jonathan Hui's series on speech recognition.
Gecko (github.com/gong-io/gecko, YouTube) is an open-source tool for annotating the linguistic content of conversations. It can be used for segmentation, diarization, and transcription. With Gecko, you can create and refine audio-based datasets, compare the results of multiple models simultaneously, and highlight differences between transcriptions.
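
The sound-classification item above describes a pipeline: normalize the waveform, turn it into a 2D spectrogram image, and hand that image to a CNN. The following is a minimal sketch of the first two steps, not the linked article's code. It assumes plain NumPy in place of librosa, a synthetic 440 Hz tone in place of a loaded audio file, and illustrative names and parameters (`peak_normalize`, `log_spectrogram`, `n_fft`, `hop`) that do not come from the source.

```python
import numpy as np


def peak_normalize(signal):
    """Scale a waveform so its largest absolute sample is 1.0."""
    peak = np.max(np.abs(signal))
    return signal / peak if peak > 0 else signal


def log_spectrogram(signal, n_fft=512, hop=256):
    """Return a (n_fft // 2 + 1, n_frames) log-magnitude spectrogram."""
    window = np.hanning(n_fft)
    n_frames = 1 + (len(signal) - n_fft) // hop
    # Slice the waveform into overlapping, windowed frames.
    frames = np.stack(
        [signal[i * hop : i * hop + n_fft] * window for i in range(n_frames)]
    )
    # One real FFT per frame, then log-compress the magnitudes.
    magnitude = np.abs(np.fft.rfft(frames, axis=1))
    return np.log1p(magnitude).T


# Assumption: a synthetic one-second tone stands in for a real recording.
sr = 16000
t = np.arange(sr) / sr
wave = 0.3 * np.sin(2 * np.pi * 440.0 * t)

spec = log_spectrogram(peak_normalize(wave))

# Add batch and channel axes so the array looks like a one-channel image,
# the layout many CNN frameworks expect: (batch, freq, time, channels).
cnn_input = spec[np.newaxis, :, :, np.newaxis]
```

With real data, the waveform would come from an audio loader such as librosa, and a mel-scaled spectrogram is a common alternative to the linear-frequency one sketched here.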