Bidirectional Decoding for Neural Machine Translation

This repository is built on top of OpenNMT-py. For the basic usage of the codebase, please refer to the OpenNMT-py documentation.

The intuition behind this project is that, because of the autoregressive nature of sequential decoding, a standard NMT decoder translates in only one direction. As a result, such models do not make full use of the bidirectional information in the target language. We therefore propose two ways to exploit this bidirectional information.

Multi-task Learning

The first approach is multi-task learning (MTL), in which we treat forward and backward decoding as two tasks.

We then train the two tasks jointly while sharing some components. The encoder is shared by default; in addition, three decoder-side components can be shared: the attention, the word embeddings, and the generator.

We can share a single component or several components at once.

During training, the shared components are assumed to absorb information from the backward direction. At test time, we discard the backward decoder (keeping only the shared components) and predict the target with forward decoding alone.
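For illustration, here is a minimal PyTorch-style sketch of this joint objective. The class and attribute names (`BidirectionalMTL`, `fwd_decoder`, `bwd_decoder`, etc.) are hypothetical and do not correspond one-to-one to the OpenNMT-py modules used in this repository; it only shows the idea of tying selected components and summing the two task losses.

```python
import torch
import torch.nn as nn

class BidirectionalMTL(nn.Module):
    """Illustrative MTL setup: one shared encoder, two decoders."""

    def __init__(self, encoder, fwd_decoder, bwd_decoder,
                 share_embed=True, share_atten=False, share_gen=False):
        super().__init__()
        self.encoder = encoder          # shared by default
        self.fwd_decoder = fwd_decoder
        self.bwd_decoder = bwd_decoder
        # Tie the selected components so both decoders use the same parameters.
        if share_embed:
            self.bwd_decoder.embeddings = self.fwd_decoder.embeddings
        if share_atten:
            self.bwd_decoder.attention = self.fwd_decoder.attention
        if share_gen:
            self.bwd_decoder.generator = self.fwd_decoder.generator

    def forward(self, src, tgt):
        memory = self.encoder(src)
        # Forward task: predict the target left-to-right.
        # (Each decoder is assumed to return a scalar training loss.)
        loss_fwd = self.fwd_decoder(tgt, memory)
        # Backward task: predict the reversed target right-to-left.
        loss_bwd = self.bwd_decoder(torch.flip(tgt, dims=[0]), memory)
        # Joint objective; at test time only the forward path is kept.
        return loss_fwd + loss_bwd
```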

Results show that this model improves over the baseline on the WMT DE-EN task (+0.98 BLEU, nearly 1 BLEU point, on the full data) and on the ZH-EN task (tested only on the News Commentary data due to limited resources, with a +0.95 BLEU improvement).

Regularization

This idea is simple: we add a regularization term that encourages the forward and backward decoder RNN hidden states at the same time step to be close to each other.

The regularizer can take several forms; we use two. The first applies an L2 penalty directly to the difference between the two hidden states, but this is rather strict. The second, for more flexibility, first passes each hidden state through its own linear (affine) layer before applying the L2 penalty.
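A minimal sketch of the two regularizers is given below, assuming the forward and backward decoder hidden states for a batch are available as tensors of shape (time, batch, dim), with the backward states already time-aligned to the forward ones. The names and shapes are illustrative, not the actual code in this repository.

```python
import torch
import torch.nn as nn

hidden_dim = 512  # illustrative size

# Affine variant: one linear layer per direction before the L2 penalty.
proj_fwd = nn.Linear(hidden_dim, hidden_dim)
proj_bwd = nn.Linear(hidden_dim, hidden_dim)

def l2_agreement(h_fwd, h_bwd, mode="direct"):
    """h_fwd, h_bwd: (time, batch, dim) decoder hidden states,
    aligned so that step t refers to the same target position."""
    if mode == "none":
        return h_fwd.new_zeros(())
    if mode == "direct":
        # Strict version: pull the raw hidden states together.
        return ((h_fwd - h_bwd) ** 2).sum(dim=-1).mean()
    if mode == "affine":
        # Looser version: compare states after per-direction projections.
        return ((proj_fwd(h_fwd) - proj_bwd(h_bwd)) ** 2).sum(dim=-1).mean()
    raise ValueError(mode)

# total_loss = loss_fwd + loss_bwd + reg_weight * l2_agreement(h_fwd, h_bwd, mode)
```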

Quickstart

You can enable the training options above with the following flags (an example command is shown after the list):

  • -share_atten: Share the attention component.
  • -share_embed: Share the word embedding component.
  • -share_gen: Share the generator component.
  • -l2_reg: Select the L2 regularization mode; choose one of none, direct, or affine.
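For example, to train with shared attention and embeddings plus the affine L2 regularizer, a command might look like the following; the data path, model name, and any other standard OpenNMT-py options are placeholders and should be adapted to your setup.

```
python train.py -data data/demo -save_model demo-model \
    -share_atten -share_embed \
    -l2_reg affine
```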
