From d46353737ba4303075afe7dc12140d90e1edbb2e Mon Sep 17 00:00:00 2001
From: Guillaume Lample
Date: Thu, 18 Jul 2019 08:04:10 -0700
Subject: [PATCH] updated README

---
 README.md | 28 ++++++++++++++++++++++++----
 1 file changed, 24 insertions(+), 4 deletions(-)

diff --git a/README.md b/README.md
index ab39fd8b..a9637a14 100644
--- a/README.md
+++ b/README.md
@@ -1,9 +1,11 @@
 # XLM
 
 PyTorch original implementation of [Cross-lingual Language Model Pretraining](https://arxiv.org/abs/1901.07291). Includes:
-- Monolingual language model pretraining (BERT)
-- Cross-lingual language model pretraining (XLM)
-- Applications: GLUE, XNLI, Unsupervised and Supervised MT
+- [Monolingual language model pretraining (BERT)](#i-monolingual-language-model-pretraining-bert)
+- [Cross-lingual language model pretraining (XLM)](#ii-cross-lingual-language-model-pretraining-xlm)
+- [Applications: Supervised / Unsupervised MT experiments](#iii-applications-supervised--unsupervised-mt-experiments)
+- [Applications: Cross-lingual text classification (XNLI)](#iv-applications-cross-lingual-text-classification-xnli)
+- [Product-Key Memory Layers](#v-product-key-memory-layers)
@@ -302,6 +304,7 @@ Cross-lingual language model (XLM) provides a strong pretraining method for cros
 
 ## III. Applications: Supervised / Unsupervised MT experiments
 
+
 XLMs can be used as a pretraining method for unsupervised or supervised neural machine translation.
 
 ### Pretrained XLM(MLM) models
@@ -583,6 +586,11 @@ python glue-xnli.py
 --max_vocab 95000 # max number of words in vocab
 ```
 
+## V. Product-Key Memory Layers
+
+XLM also implements the Product-Key Memory layer (PKM) described in [[4]](https://arxiv.org/abs/1907.05242). To add a memory in, for instance, layers 4 and 7 of an encoder, simply provide `--use_memory true --mem_enc_positions 4,7` as arguments to `train.py` (and similarly for `--mem_dec_positions` and the decoder). All memory layer parameters can be found [here](https://github.com/facebookresearch/XLM/blob/master/src/model/memory/memory.py#L225).
+
+
 ## Frequently Asked Questions
 
 ### How can I run experiments on multiple GPUs?
@@ -640,7 +648,19 @@ Please cite [[1]](https://arxiv.org/abs/1901.07291) if you found the resources i
 }
 ```
 
+### Large Memory Layers with Product Keys
+
+[4] G. Lample, A. Sablayrolles, MA. Ranzato, L. Denoyer, H. Jégou [*Large Memory Layers with Product Keys*](https://arxiv.org/abs/1907.05242)
+
+```
+@article{lample2019large,
+  title={Large Memory Layers with Product Keys},
+  author={Lample, Guillaume and Sablayrolles, Alexandre and Ranzato, Marc'Aurelio and Denoyer, Ludovic and J{\'e}gou, Herv{\'e}},
+  journal={arXiv preprint arXiv:1907.05242},
+  year={2019}
+}
+```
+
 ## License
 
 See the [LICENSE](LICENSE) file for more details.
-
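The product-key lookup that the new PKM section refers to can be sketched in pure Python. This is a minimal illustrative sketch of the mechanism described in [4] — split the query in two halves, score each half against a small set of sub-keys, re-rank the k×k candidate full keys (whose score is the sum of the two half scores), and return a softmax-weighted sum of the selected values. All names and dimensions below are made up for illustration and do not correspond to the repository's `memory.py` implementation:

```python
import math
import random

random.seed(0)

D_HALF, N_SUB, TOP_K, D_VAL = 4, 8, 2, 3  # half query dim, sub-keys per set, top-k, value dim

def randvec(n):
    return [random.gauss(0.0, 1.0) for _ in range(n)]

K1 = [randvec(D_HALF) for _ in range(N_SUB)]        # first sub-key set
K2 = [randvec(D_HALF) for _ in range(N_SUB)]        # second sub-key set
V = [randvec(D_VAL) for _ in range(N_SUB * N_SUB)]  # one value per (i, j) key pair

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def pkm_lookup(q):
    # Split the query in two halves and score each against its sub-key set.
    q1, q2 = q[:D_HALF], q[D_HALF:]
    s1 = [dot(k, q1) for k in K1]
    s2 = [dot(k, q2) for k in K2]
    # Top-k per half: 2 * sqrt(N) dot products instead of N full-key products.
    i1 = sorted(range(N_SUB), key=lambda i: s1[i])[-TOP_K:]
    i2 = sorted(range(N_SUB), key=lambda j: s2[j])[-TOP_K:]
    # A full key (i, j) scores s1[i] + s2[j]; re-rank the k*k candidates.
    cand = sorted(((s1[i] + s2[j], i * N_SUB + j) for i in i1 for j in i2),
                  reverse=True)[:TOP_K]
    # Softmax over the selected scores, then weighted sum of the chosen values.
    m = max(s for s, _ in cand)
    w = [math.exp(s - m) for s, _ in cand]
    z = sum(w)
    out = [0.0] * D_VAL
    for (s, idx), wi in zip(cand, w):
        for d in range(D_VAL):
            out[d] += (wi / z) * V[idx][d]
    return out

print(pkm_lookup(randvec(2 * D_HALF)))
```

The point of the product-key factorization is the candidate re-ranking step: only 2·√N sub-key scores are ever computed, yet any of the N = √N × √N full keys can be selected, which is what lets the memory scale to very large N.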