This is a PyTorch port of OpenNMT, an open-source (MIT) neural machine translation system.
OpenNMT consists of three commands (preprocess, train, translate), run after a one-time data download:
- Download the data.
`wget https://s3.amazonaws.com/pytorch/examples/opennmt/data/onmt-data.tar && tar -xf onmt-data.tar`
- Preprocess the data (a sketch of inspecting the resulting `.pt` file follows these steps).
`python preprocess.py -train_src data/src-train.txt -train_tgt data/tgt-train.txt -valid_src data/src-val.txt -valid_tgt data/tgt-val.txt -save_data data/demo`
- Train the model.
`python train.py -data data/demo-train.pt -save_model model -cuda`
- Translate sentences.
`python translate.py -cuda -model model_e13_*.pt -src data/src-test.txt -tgt data/tgt-test.txt -replace_unk -verbose`
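The file written by preprocess.py is an ordinary PyTorch checkpoint, so `torch.load` opens it. The key names probed below (`dicts`, `train`, `valid`) are assumptions for illustration; print `data.keys()` to see what your version actually saves.

```python
import torch

# Load the dataset written by preprocess.py; it is a plain PyTorch
# checkpoint. NOTE: the key names below are assumptions for
# illustration -- inspect data.keys() on your own file.
data = torch.load('data/demo-train.pt')
print(data.keys())

if 'dicts' in data:
    for side, vocab in data['dicts'].items():
        print(side, 'vocabulary:', type(vocab))
if 'train' in data:
    print('training examples:', len(data['train']['src']))
```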
The following pretrained models can be downloaded and used with translate.py.
- onmt_model_en_de_200k: An English-German translation model based on the 200k-sentence dataset at OpenNMT/IntegrationTesting. Perplexity: 21.
- onmt_model_en_fr_b1M: An English-French model trained on benchmark-1M. Perplexity: 4.85.
The following OpenNMT features are implemented:
- multi-layer bidirectional RNNs with attention and dropout (a sketch of the attention layer follows this list)
- data preprocessing
- saving and loading from checkpoints
- inference (translation) with batching and beam search
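For reference, the sketch below shows the general shape of a global (Luong-style) attention layer sitting between a decoder state and the encoder outputs. It is illustrative only, not the module from this repository; the class name, layer sizes, and the choice of the "general" score function are assumptions.

```python
import torch
import torch.nn as nn

class GlobalAttention(nn.Module):
    """Minimal Luong-style global attention sketch (illustrative names/sizes)."""

    def __init__(self, dim):
        super().__init__()
        self.linear_in = nn.Linear(dim, dim, bias=False)
        self.linear_out = nn.Linear(2 * dim, dim, bias=False)

    def forward(self, query, context):
        # query:   (batch, dim)          current decoder hidden state
        # context: (batch, src_len, dim) encoder outputs
        scores = torch.bmm(context, self.linear_in(query).unsqueeze(2)).squeeze(2)
        align = torch.softmax(scores, dim=1)                          # attention weights
        weighted = torch.bmm(align.unsqueeze(1), context).squeeze(1)  # context vector
        # Combine context vector and decoder state into the attentional hidden state.
        attn_h = torch.tanh(self.linear_out(torch.cat([weighted, query], dim=1)))
        return attn_h, align

attn = GlobalAttention(512)
h, a = attn(torch.randn(4, 512), torch.randn(4, 17, 512))  # -> (4, 512), (4, 17)
```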
Not yet implemented:
- word features
- multi-GPU
- residual connections
With default parameters on a single Maxwell GPU, this version runs about 70% faster than the Lua Torch OpenNMT. The speedup comes from two main sources (both sketched below):
- CuDNN is used for the encoder, but not for the decoder, since CuDNN's fused RNN kernels cannot interleave attention between time steps.
- The decoder softmax layer is applied in batches to trade off speed against memory use; the chunk size can be tuned with the `-max_generator_batches` parameter.
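To make the first point concrete, here is an illustrative sketch (names and sizes are assumptions, and `attention` stands in for a module like the `GlobalAttention` sketch above). The encoder can hand the whole batch to `nn.LSTM`, which dispatches to CuDNN's fused kernel on GPU, while the attentional decoder must advance one step at a time because each step's input depends on the previous step's attention output.

```python
import torch
import torch.nn as nn

hidden = 512  # illustrative size

# Encoder: one fused call over the whole source sequence; on a GPU,
# nn.LSTM dispatches to CuDNN.
encoder_rnn = nn.LSTM(hidden, hidden // 2, num_layers=2,
                      bidirectional=True, batch_first=True)

# Decoder: stepped manually with an LSTMCell, because the attention
# output ("input feeding") is concatenated onto the next step's input,
# which the fused CuDNN kernel cannot express.
decoder_cell = nn.LSTMCell(2 * hidden, hidden)

def decode(tgt_emb, context, h, c, attention):
    # tgt_emb: (batch, tgt_len, hidden); context: (batch, src_len, hidden)
    outputs = []
    attn_h = torch.zeros(tgt_emb.size(0), hidden)
    for t in range(tgt_emb.size(1)):
        h, c = decoder_cell(torch.cat([tgt_emb[:, t], attn_h], dim=1), (h, c))
        attn_h, _ = attention(h, context)
        outputs.append(attn_h)
    return torch.stack(outputs, dim=1), (h, c)
```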
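And for the second point, a minimal sketch of the batched-generator idea under assumed sizes (none of these names come from train.py): the vocabulary-sized projection and softmax are applied chunk by chunk, with each chunk backpropagated immediately, so only one chunk's logits are alive at a time.

```python
import torch
import torch.nn as nn

vocab_size, hidden, max_generator_batches = 50000, 512, 32  # illustrative

generator = nn.Sequential(nn.Linear(hidden, vocab_size), nn.LogSoftmax(dim=-1))
criterion = nn.NLLLoss(reduction='sum')

def memory_efficient_loss(dec_out, targets):
    # dec_out: (steps, hidden) decoder output, detached from the decoder
    # graph so each chunk's subgraph can be freed right after backward.
    dec_out = dec_out.detach().requires_grad_(True)
    total = 0.0
    for out_chunk, tgt_chunk in zip(dec_out.split(max_generator_batches),
                                    targets.split(max_generator_batches)):
        loss = criterion(generator(out_chunk), tgt_chunk)
        loss.backward()          # frees this chunk's vocab-sized graph
        total += loss.item()
    # The caller continues backprop into the decoder with dec_out.grad.
    return total, dec_out.grad
```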