This repo is forked from indonlu-repo with several adjustments and additions. It is the implementation of the paper "Improving Low-Resource Question Answering with Cross-Lingual Data Augmentation Strategies" (accepted at ICOICT 2022) (paper).
Check requirment_file for the required dependencies.
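A minimal install sketch, assuming the dependencies are collected in a standard pip requirements file (the actual filename in this repo may differ):

```bash
# Install the project dependencies with pip (filename is an assumption)
pip install -r requirements.txt
```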
- Clone this repo (see the sketch below)
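A minimal sketch, assuming a standard GitHub clone; `<username>/<repo-name>` below is a placeholder for this repository's actual path:

```bash
# Clone the repository and enter it (URL placeholder is an assumption)
git clone https://github.com/<username>/<repo-name>.git
cd <repo-name>
```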
- Run the training script:

```bash
# Fine-tune xlm-roberta-base on the Indonesian factoid QA dataset (qa-factoid-itb)
CUDA_VISIBLE_DEVICES=6 \
python3 main.py \
--n_epochs=25 \
--train_batch_size=8 \
--model_checkpoint=xlm-roberta-base \
--step_size=1 \
--gamma=0.9 \
--device=cuda \
--experiment_name=xlm-roberta-base-2step-indo-dataset-e3 \
--lr=1e-5 \
--early_stop=12 \
--dataset=qa-factoid-itb \
--lower \
--num_layers=12 \
--max_norm=10 \
--seed=42 \
--data_type=original \
--force
```
- Or you can evaluate a model you have already trained by passing `--eval_only` and pointing `--model_checkpoint` at the saved model directory (checkpoints are written under `./save/`, as in the example below):

```bash
# Evaluate a previously trained checkpoint without retraining
CUDA_VISIBLE_DEVICES=6 \
python3 main.py \
--n_epochs=25 \
--train_batch_size=8 \
--model_checkpoint=./save/qa-factoid-itb/xlm-roberta-base-english-only-dataset-e3/xlm-roberta-pretrained \
--step_size=1 \
--gamma=0.9 \
--device=cuda \
--experiment_name=xlm-roberta-base-2step-indo-dataset-e3 \
--lr=1e-5 \
--early_stop=12 \
--dataset=qa-factoid-itb \
--lower \
--num_layers=12 \
--max_norm=10 \
--seed=42 \
--data_type=original \
--eval_only \
--force
```
You can submit a GitHub issue to ask a question or get help, or contact me directly at [email protected].