Second stage of the Bias in NMT project. Objectives:
- NMT Engines with BPE: 50k operations, separate vocabularies.
- SMT Engines without copying unknown source words into the target.
- Lexical richness
- Frequency and frequency classes based on Zipf's law
- Synonyms
Collaborators:
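
As a rough illustration of the lexical richness and frequency class objectives, the sketch below computes a type/token ratio and assigns each word a frequency class. The metrics the project actually uses are not stated here, so both are assumptions: the type/token ratio stands in for "lexical richness", and the class definition floor(log2(f_max / f_word)) is one common Zipf-style binning. The function names are hypothetical.

```python
import math
from collections import Counter


def type_token_ratio(tokens):
    """Distinct tokens divided by total tokens (0.0 for empty input).

    Assumption: TTR is used here as a simple stand-in for lexical richness.
    """
    return len(set(tokens)) / len(tokens) if tokens else 0.0


def frequency_classes(tokens):
    """Map each word to floor(log2(f_max / f_word)).

    Assumption: this log2 binning is one common Zipf-style definition;
    class 0 contains the most frequent word, higher classes are rarer words.
    """
    counts = Counter(tokens)
    if not counts:
        return {}
    f_max = max(counts.values())
    return {
        word: int(math.floor(math.log2(f_max / count)))
        for word, count in counts.items()
    }


if __name__ == "__main__":
    sample = "the cat and the dog and the bird".split()
    print(type_token_ratio(sample))
    print(frequency_classes(sample))
```

Running both functions on the same tokenized file for a reference and for an engine's output gives directly comparable numbers, as long as the two files are tokenized the same way.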
- Dimitar Shterionov
- Eva Vanmassenhove
- Matt Gwilliam
Version 1.0:
- Europarl EN-FR, FR-EN, EN-ES and ES-EN NMT engines trained with BPE.
- Training (seen) data translated.
- Backtranslated engines also trained.
- Test data translated.
Included:
- Train and test translations
- TEST_SET REF: references, tokenized (no escape characters)