
Commit

tune description of the repo wrt references
karpathy committed Nov 20, 2022
1 parent f61811b commit 988aa59
Showing 1 changed file with 6 additions and 5 deletions.
README.md: 11 changes (6 additions & 5 deletions)
@@ -5,12 +5,13 @@ makemore takes one text file as input, where each line is assumed to be one trai

This is not meant to be too heavyweight a library with a billion switches and knobs. It is one hackable file, and is mostly intended for educational purposes. [PyTorch](https://pytorch.org) is the only requirement.

Current language model neural nets implemented:
Current implementation follows a few key papers:

- Bigram (one character simply predicts a next one with a lookup table of counts)
- Bag of Words
- MLP, along the lines of [Bengio et al. 2003](https://www.jmlr.org/papers/volume3/bengio03a/bengio03a.pdf)
- RNN, along the lines of [Sutskever et al. 2011](https://icml.cc/2011/papers/524_icmlpaper.pdf)
- Bigram (one character predicts the next one with a lookup table of counts)
- MLP, following [Bengio et al. 2003](https://www.jmlr.org/papers/volume3/bengio03a/bengio03a.pdf)
- CNN, following [DeepMind WaveNet 2016](https://arxiv.org/abs/1609.03499) (in progress...)
- RNN, following [Mikolov et al. 2010](https://www.fit.vutbr.cz/research/groups/speech/publi/2010/mikolov_interspeech2010_IS100722.pdf)
- LSTM, following [Graves et al. 2014](https://arxiv.org/abs/1308.0850)
- GRU, following [Kyunghyun Cho et al. 2014](https://arxiv.org/abs/1409.1259)
- Transformer, following [Vaswani et al. 2017](https://arxiv.org/abs/1706.03762)

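To make the Bigram bullet in the list above concrete, here is a minimal, illustrative sketch (not the repository's makemore.py) of a count-table bigram character model in PyTorch. The hard-coded `words` list stands in for the one-example-per-line text file the README describes; the variable names, the `.` start/end token, and the +1 smoothing are assumptions of this sketch rather than details taken from the repo.

```python
# Illustrative bigram character model: one character predicts the next one
# via a lookup table of counts (not the repo's actual implementation).
import torch

words = ["emma", "olivia", "ava"]           # stand-in for a one-example-per-line input file
chars = sorted(set("".join(words)))
stoi = {c: i + 1 for i, c in enumerate(chars)}
stoi["."] = 0                                # "." marks the start/end of an example
itos = {i: c for c, i in stoi.items()}
V = len(stoi)

# Count how often each character follows each other character.
counts = torch.zeros((V, V), dtype=torch.int32)
for w in words:
    seq = ["."] + list(w) + ["."]
    for c1, c2 in zip(seq, seq[1:]):
        counts[stoi[c1], stoi[c2]] += 1

# Turn each row of counts into next-character probabilities (+1 smoothing).
probs = (counts + 1).float()
probs /= probs.sum(dim=1, keepdim=True)

# Sample one new "name" by repeatedly drawing the next character.
g = torch.Generator().manual_seed(42)
ix, out = 0, []
while True:
    ix = torch.multinomial(probs[ix], num_samples=1, generator=g).item()
    if ix == 0:
        break
    out.append(itos[ix])
print("".join(out))
```

The other entries in the list replace this count table with learned parameters (an MLP, RNN/LSTM/GRU, or Transformer) trained on the same next-character prediction task.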