Deep Neural Audio

  1. Kaldi - a speech recognition toolkit with many state-of-the-art (SOTA) models.
  2. Isolating instruments from stereo music using convolutional neural networks (CNNs), part 2.
  3. Sound classification using a CNN - loading and normalizing sounds with librosa, converting them to 2D spectrogram images, and training a CNN on top.
  4. Speech recognition with deep learning - how to convert sounds to vectors and feed them into an RNN.
  5. (Great) Jonathan Hui's series on speech recognition.
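
Items 3 and 4 both rest on the same preprocessing idea: turn a 1D waveform into a 2D time-frequency array that a CNN can treat as an image, or an RNN as a sequence of vectors. The sketch below illustrates that step using only the Python standard library so it runs anywhere. It is not the code from the linked posts. The helper names (`peak_normalize`, `magnitude_spectrogram`, `sine_wave`), the frame and hop sizes, and the synthetic test tone are all assumptions made for illustration.

```python
import cmath
import math


def peak_normalize(samples):
    """Scale samples so the largest absolute value is 1.0 (silence is returned unchanged)."""
    peak = max((abs(s) for s in samples), default=0.0)
    if peak == 0.0:
        return list(samples)
    return [s / peak for s in samples]


def magnitude_spectrogram(samples, frame_size=64, hop=32):
    """Return a 2D list indexed as [frequency_bin][frame] of DFT magnitudes."""
    # A Hann window reduces spectral leakage at the frame edges.
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / frame_size)
              for n in range(frame_size)]
    n_bins = frame_size // 2 + 1  # keep only the non-negative frequencies
    frames = []
    for start in range(0, len(samples) - frame_size + 1, hop):
        frame = [samples[start + n] * window[n] for n in range(frame_size)]
        # Naive DFT: fine for a small demo, far too slow for real audio.
        spectrum = [
            abs(sum(frame[n] * cmath.exp(-2j * math.pi * k * n / frame_size)
                    for n in range(frame_size)))
            for k in range(n_bins)
        ]
        frames.append(spectrum)
    # Transpose from [frame][bin] to the image-like [bin][frame] layout.
    return [list(column) for column in zip(*frames)]


def sine_wave(freq_hz, sample_rate, n_samples, amplitude=0.3):
    """Hypothetical stand-in for a loaded audio clip."""
    return [amplitude * math.sin(2 * math.pi * freq_hz * t / sample_rate)
            for t in range(n_samples)]


sample_rate = 1024
signal = peak_normalize(sine_wave(128, sample_rate, 512))
spec = magnitude_spectrogram(signal)

# Bin spacing is sample_rate / frame_size = 16 Hz, so a 128 Hz tone lands in bin 8.
peak_bin = max(range(len(spec)), key=lambda k: spec[k][0])
print(len(spec), "bins x", len(spec[0]), "frames; peak bin:", peak_bin)
```

With real recordings one would typically replace the synthetic tone with `librosa.load` and the naive DFT with an FFT-based routine such as `librosa.feature.melspectrogram`. The resulting 2D array can then be fed to a CNN as a single-channel image, or read frame by frame as the input vectors of an RNN.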

Tools

Gecko (github.com/gong-io/gecko, YouTube) is an open-source tool for annotating the linguistic content of conversations. It can be used for segmentation, diarization, and transcription. With Gecko, you can create and refine audio-based datasets, compare the results of multiple models side by side, and highlight differences between transcriptions.
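
To make "highlighting differences between transcriptions" concrete, here is a minimal word-level comparison using Python's standard `difflib` module. It is only an illustration of the idea, not how Gecko itself is implemented. The function name `diff_transcripts` and the two example transcripts are hypothetical.

```python
import difflib


def diff_transcripts(reference, hypothesis):
    """Return (operation, reference_words, hypothesis_words) for each differing span."""
    ref_words = reference.split()
    hyp_words = hypothesis.split()
    matcher = difflib.SequenceMatcher(a=ref_words, b=hyp_words, autojunk=False)
    differences = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":  # keep only replaced, deleted, or inserted words
            differences.append((tag, ref_words[i1:i2], hyp_words[j1:j2]))
    return differences


# Hypothetical outputs of two speech-to-text models for the same audio.
model_a = "please schedule the meeting for tuesday morning"
model_b = "please schedule a meeting for tuesday"

differences = diff_transcripts(model_a, model_b)
for operation, a_words, b_words in differences:
    print(operation, a_words, b_words)
```

Each tuple marks a span where the two transcripts disagree, which is the kind of information an annotation tool can surface for a human reviewer.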