From d46353737ba4303075afe7dc12140d90e1edbb2e Mon Sep 17 00:00:00 2001
From: Guillaume Lample
Date: Thu, 18 Jul 2019 08:04:10 -0700
Subject: [PATCH] updated README

---
 README.md | 28 ++++++++++++++++++++++++----
 1 file changed, 24 insertions(+), 4 deletions(-)

diff --git a/README.md b/README.md
index ab39fd8b..a9637a14 100644
--- a/README.md
+++ b/README.md
@@ -1,9 +1,11 @@
 # XLM
 
 PyTorch original implementation of [Cross-lingual Language Model Pretraining](https://arxiv.org/abs/1901.07291). Includes:
-- Monolingual language model pretraining (BERT)
-- Cross-lingual language model pretraining (XLM)
-- Applications: GLUE, XNLI, Unsupervised and Supervised MT
+- [Monolingual language model pretraining (BERT)](#i-monolingual-language-model-pretraining-bert)
+- [Cross-lingual language model pretraining (XLM)](#ii-cross-lingual-language-model-pretraining-xlm)
+- [Applications: Supervised / Unsupervised MT experiments](#iii-applications-supervised--unsupervised-mt-experiments)
+- [Applications: Cross-lingual text classification (XNLI)](#iv-applications-cross-lingual-text-classification-xnli)
+- [Product-Key Memory Layers](#v-product-key-memory-layers)
@@ -302,6 +304,7 @@ Cross-lingual language model (XLM) provides a strong pretraining method for cros
 
 ## III. Applications: Supervised / Unsupervised MT experiments
 
+
 XLMs can be used as a pretraining method for unsupervised or supervised neural machine translation.
 
 ### Pretrained XLM(MLM) models
@@ -583,6 +586,11 @@ python glue-xnli.py
 --max_vocab 95000 # max number of words in vocab
 ```
 
+## V. Product-Key Memory Layers
+
+XLM also implements the Product-Key Memory layer (PKM) described in [[4]](https://arxiv.org/abs/1907.05242). To add a memory in, for instance, layers 4 and 7 of an encoder, simply provide `--use_memory true --mem_enc_positions 4,7` as arguments to `train.py` (and similarly for `--mem_dec_positions` and the decoder). All memory layer parameters can be found [here](https://github.com/facebookresearch/XLM/blob/master/src/model/memory/memory.py#L225).
+
+
 ## Frequently Asked Questions
 
 ### How can I run experiments on multiple GPUs?
@@ -640,7 +648,19 @@ Please cite [[1]](https://arxiv.org/abs/1901.07291) if you found the resources i
 }
 ```
 
+### Large Memory Layers with Product Keys
+
+[4] G. Lample, A. Sablayrolles, MA. Ranzato, L. Denoyer, H. Jégou [*Large Memory Layers with Product Keys*](https://arxiv.org/abs/1907.05242)
+
+```
+@article{lample2019large,
+  title={Large Memory Layers with Product Keys},
+  author={Lample, Guillaume and Sablayrolles, Alexandre and Ranzato, Marc'Aurelio and Denoyer, Ludovic and J{\'e}gou, Herv{\'e}},
+  journal={arXiv preprint arXiv:1907.05242},
+  year={2019}
+}
+```
+
 ## License
 
 See the [LICENSE](LICENSE) file for more details.
-
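The product-key lookup that the new PKM section refers to can be sketched in pure Python. This is a minimal illustrative sketch of the mechanism described in [4] — split the query in two halves, score each half against a small set of sub-keys, re-rank the k×k candidate full keys (whose score is the sum of the two half scores), and return a softmax-weighted sum of the selected values. All names and dimensions below are made up for illustration and do not correspond to the repository's `memory.py` implementation:

```python
import math
import random

random.seed(0)

D_HALF, N_SUB, TOP_K, D_VAL = 4, 8, 2, 3  # half query dim, sub-keys per set, top-k, value dim

def randvec(n):
    return [random.gauss(0.0, 1.0) for _ in range(n)]

K1 = [randvec(D_HALF) for _ in range(N_SUB)]        # first sub-key set
K2 = [randvec(D_HALF) for _ in range(N_SUB)]        # second sub-key set
V = [randvec(D_VAL) for _ in range(N_SUB * N_SUB)]  # one value per (i, j) key pair

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def pkm_lookup(q):
    # Split the query in two halves and score each against its sub-key set.
    q1, q2 = q[:D_HALF], q[D_HALF:]
    s1 = [dot(k, q1) for k in K1]
    s2 = [dot(k, q2) for k in K2]
    # Top-k per half: 2 * sqrt(N) dot products instead of N full-key products.
    i1 = sorted(range(N_SUB), key=lambda i: s1[i])[-TOP_K:]
    i2 = sorted(range(N_SUB), key=lambda j: s2[j])[-TOP_K:]
    # A full key (i, j) scores s1[i] + s2[j]; re-rank the k*k candidates.
    cand = sorted(((s1[i] + s2[j], i * N_SUB + j) for i in i1 for j in i2),
                  reverse=True)[:TOP_K]
    # Softmax over the selected scores, then weighted sum of the chosen values.
    m = max(s for s, _ in cand)
    w = [math.exp(s - m) for s, _ in cand]
    z = sum(w)
    out = [0.0] * D_VAL
    for (s, idx), wi in zip(cand, w):
        for d in range(D_VAL):
            out[d] += (wi / z) * V[idx][d]
    return out

print(pkm_lookup(randvec(2 * D_HALF)))
```

The point of the product-key factorization is the candidate re-ranking step: only 2·√N sub-key scores are ever computed, yet any of the N = √N × √N full keys can be selected, which is what lets the memory scale to very large N.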