Skip to content

Latest commit

 

History

History
9 lines (9 loc) · 1017 Bytes

tensorflow_rnn_tutorial.md

File metadata and controls

9 lines (9 loc) · 1017 Bytes
  • we’ll provide a short tutorial for training a RNN for speech recognition; we’re including code snippets throughout, and you can find the accompanying https://github.com/silicon-valley-data-science/RNN-Tutorial
  • if you are brand new to RNNs, we highly recommend you read Christopher Olah’s excellent overview of RNN Long Short-Term Memory (LSTM) networks http://colah.github.io/posts/2015-08-Understanding-LSTMs/
  • Speech recognition in the past and today both rely on decomposing sound waves into frequency and amplitude using fourier transforms, yielding a spectrogram
  • https://github.com/silicon-valley-data-science/RNN-Tutorial
  • Several key improvements that have been made by the Microsoft team and other researchers in the past 4 years include:
    • using language models on top of character based RNNs
    • using convolutional neural nets (CNNs) for extracting features from the audio
    • ensemble models that utilize multiple RNNs