Skip to content

Latest commit

 

History

History

scripts

Folders and files

NameName
Last commit message
Last commit date

parent directory

..
 
 
 
 
 
 
 
 
 
 
 
 

This example produces the preprocessed Europarl English corpus that can be then used for training a model.

Requires nltk

Usage example: ./run.sh

cd ..

python data.py ./example/out

python main.py ep 256 0.02

python play_with_model.py Model_ep_h256_lr0.02.pcl

The input text to play_with_model.py should be similar to the contents of the preprocessed files in ./example/out (i.e. lowercased, numeric tokens replaced with ), but should not contain punctuation tokens.

Training time on this dataset with a Nvidia Tesla K20 GPU was about 15 hours (~3500 samples per second)