Language Translation - IBM word alignment

IBM word alignment

Goal

Your tasks for this assignment are to implement and train IBM Model 1 for word alignment on a parallel corpus of movie subtitles, and to find the best word alignment for a set of test sentence pairs. You need to train your model using the Expectation-Maximization (EM) algorithm.
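
As a rough orientation, EM training for IBM Model 1 can be sketched as below. This is a minimal sketch and not a reference solution. It assumes sentences are already tokenized into lists of words, that the model estimates t(f | e) (the probability of a foreign word f given an English word e), and that a NULL token is prepended to every English sentence. The names `train_ibm1`, `align`, and `NULL` are hypothetical and chosen only for illustration.

```python
NULL = "<NULL>"  # hypothetical placeholder for the empty English word


def train_ibm1(pairs, iterations=10):
    """Estimate t(f | e) with EM.

    pairs: list of (foreign_tokens, english_tokens) tuples.
    Returns a dict mapping (f, e) to the probability t(f | e).
    """
    # Prepend NULL so a foreign word may align to no English word.
    pairs = [(f_sent, [NULL] + e_sent) for f_sent, e_sent in pairs]

    # Uniform initialization over every co-occurring word pair.
    f_vocab = {f for f_sent, _ in pairs for f in f_sent}
    uniform = 1.0 / len(f_vocab)
    t = {}
    for f_sent, e_sent in pairs:
        for f in f_sent:
            for e in e_sent:
                t[(f, e)] = uniform

    for _ in range(iterations):
        count = {}  # expected count of f aligned to e
        total = {}  # expected count of e aligned to anything

        # E-step: distribute each foreign word over the English words
        # in its sentence, in proportion to the current t(f | e).
        for f_sent, e_sent in pairs:
            for f in f_sent:
                z = sum(t[(f, e)] for e in e_sent)
                for e in e_sent:
                    c = t[(f, e)] / z
                    count[(f, e)] = count.get((f, e), 0.0) + c
                    total[e] = total.get(e, 0.0) + c

        # M-step: renormalize the expected counts for each English word.
        t = {(f, e): c / total[e] for (f, e), c in count.items()}

    return t


def align(f_sent, e_sent, t):
    """For each foreign word, return the index of its most probable
    English word. Index 0 is NULL; index i >= 1 is e_sent[i - 1]."""
    e_sent = [NULL] + e_sent
    return [
        max(range(len(e_sent)), key=lambda i: t.get((f, e_sent[i]), 0.0))
        for f in f_sent
    ]
```

Calling `train_ibm1` on a list of tokenized sentence pairs and then `align` on a test pair yields one English position per foreign word. Because Model 1 ignores word order, the best alignment is found by picking the most probable English word for each foreign word independently.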

Data

The data consists of two parallel corpora, eng-spa.txt (English-Spanish) and eng-ger.txt (English-German). In both cases, the target language for translation is English.