To run the RL algorithm implemented in this code, you first need to train the base model.
The following instructions can be used to train a Transformer model on the IWSLT'14 German to English dataset.
First download and preprocess the data:
# Download and prepare the data
cd examples/translation/
bash prepare-iwslt14.sh
cd ../..
# Preprocess/binarize the data
TEXT=examples/translation/iwslt14.tokenized.de-en
fairseq-preprocess --source-lang de --target-lang en \
--trainpref $TEXT/train --validpref $TEXT/valid --testpref $TEXT/test \
--destdir data-bin/iwslt14.tokenized.de-en \
--workers 20
Next we'll train a Transformer translation model over this data:
CUDA_VISIBLE_DEVICES=0 fairseq-train \
data-bin/iwslt14.tokenized.de-en \
--arch transformer_iwslt_de_en --share-decoder-input-output-embed \
--optimizer adam --adam-betas '(0.9, 0.98)' --clip-norm 0.0 \
--lr 5e-4 --lr-scheduler inverse_sqrt --warmup-updates 4000 \
--dropout 0.3 --weight-decay 0.0001 \
--criterion label_smoothed_cross_entropy --label-smoothing 0.1 \
--max-tokens 4096 \
--eval-bleu \
--eval-bleu-args '{"beam": 5, "max_len_a": 1.2, "max_len_b": 10}' \
--eval-bleu-detok moses \
--eval-bleu-remove-bpe \
--eval-bleu-print-samples \
--best-checkpoint-metric bleu --maximize-best-checkpoint-metric
Finally we can evaluate our trained model:
fairseq-generate data-bin/iwslt14.tokenized.de-en \
--path checkpoints/checkpoint_best.pt \
--batch-size 128 --beam 5 --remove-bpe
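Once the base model trains to a reasonable BLEU score, you can also load the checkpoint interactively to sanity-check it before moving on to the RL scripts. The snippet below is a minimal sketch using fairseq's from_pretrained hub interface; the BPE codes path (examples/translation/iwslt14.tokenized.de-en/code) is an assumption about what prepare-iwslt14.sh writes, so adjust it if your layout differs.
from fairseq.models.transformer import TransformerModel

# Load the base checkpoint trained above; paths follow the defaults used in this walkthrough.
de2en = TransformerModel.from_pretrained(
    'checkpoints/',
    checkpoint_file='checkpoint_best.pt',
    data_name_or_path='data-bin/iwslt14.tokenized.de-en',
    bpe='subword_nmt',
    bpe_codes='examples/translation/iwslt14.tokenized.de-en/code',  # assumed location of the BPE codes
    tokenizer='moses',
)
print(de2en.translate('Hallo Welt!', beam=5))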
Then, update the local checkpoint paths defined in train.py and generate.py so they point at the base model trained above (a sketch of what this might look like follows the commands below). After that, simply run:
python train.py
python generate.py
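For reference, the path fix in train.py and generate.py might look something like the lines below. The variable names are placeholders for illustration only; use whatever identifiers the scripts actually define.
# Hypothetical example -- the real variable names in train.py / generate.py may differ.
DATA_BIN = "data-bin/iwslt14.tokenized.de-en"       # binarized data produced by fairseq-preprocess
BASE_CHECKPOINT = "checkpoints/checkpoint_best.pt"  # base Transformer checkpoint trained above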