EasyProject

Frustratingly Easy Label Projection for Cross-lingual Transfer (Findings of ACL 2023)

Gradio Demo

Checkpoints

Update (May 30, 2023): Updated the checkpoints due to an issue in Hugging Face NLLB tokenization.

Data

Code

Our code base and scripts are adapted from MasakhaNER: Script

NER & EasyProject data

Google Drive: link

NER (for evaluation): data_{masakahner,wikiann}
EasyProject (for training): output_nllb_3Bft_{wikiann,conll}

EasyProject post-processing script

We use the following scripts to post-process the translated data. This step assigns labels to the entities inside the bracket markers (e.g., [ ]). The post-processed data are stored in output_nllb_3Bft_{wikiann,conll}. The original data are stored in the {conll,wikiann}_nllb_3B_ft.pkl files on Google Drive.

Wikiann:

python decode_marker_wikiann.py

Masakhaner:

python decode_marker_conll.py
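
To illustrate the idea behind the post-processing step, here is a minimal sketch of how bracket markers in a translated sentence could be turned into BIO-tagged tokens. It is not the code in decode_marker_wikiann.py or decode_marker_conll.py. The function name decode_markers, the whitespace tokenization, and the assumption that entity types are assigned to bracketed spans in order of appearance are all hypothetical simplifications made for this example.

```python
import re
from typing import List, Tuple

# Matches one bracketed span such as "[ Angela Merkel ]".
MARKER = re.compile(r"\[\s*(.*?)\s*\]")


def decode_markers(translated: str, entity_types: List[str]) -> List[Tuple[str, str]]:
    """Return (token, BIO tag) pairs for a marker-annotated translation.

    Assumption for this sketch: the i-th bracketed span receives the
    i-th entry of entity_types. Spans without a matching entry are
    tagged "O".
    """
    tagged: List[Tuple[str, str]] = []
    pos = 0
    for span_idx, match in enumerate(MARKER.finditer(translated)):
        # Tokens between the previous span and this one are outside any entity.
        tagged.extend((tok, "O") for tok in translated[pos:match.start()].split())

        span_tokens = match.group(1).split()
        if span_idx < len(entity_types):
            etype = entity_types[span_idx]
            for i, tok in enumerate(span_tokens):
                prefix = "B-" if i == 0 else "I-"
                tagged.append((tok, prefix + etype))
        else:
            # More markers than known entity types: keep tokens, drop the label.
            tagged.extend((tok, "O") for tok in span_tokens)
        pos = match.end()

    # Remaining tokens after the last bracketed span.
    tagged.extend((tok, "O") for tok in translated[pos:].split())
    return tagged


if __name__ == "__main__":
    for token, tag in decode_markers("[ Angela Merkel ] besuchte [ Paris ] .", ["PER", "LOC"]):
        print(f"{token}\t{tag}")
```

The markers themselves are dropped from the output, so the result can be written out in a CoNLL-style token/tag format for NER training.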

NER training

We run the experiments with the following scripts under Slurm. Please adjust them to your own cluster setup.

Wikiann:

bash xlmr_en_marker_transfer_3bft.sh

Masakhaner:

bash mdeberta_en_marker_transfer_3bft.sh

Citation

Please cite our paper if you use the above resources in your research:

@inproceedings{chen2023easyproject,
  title={Frustratingly Easy Label Projection for Cross-lingual Transfer},
  author={Chen, Yang and Jiang, Chao and Ritter, Alan and Xu, Wei},
  booktitle={Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Findings)},
  year={2023}
}

Funding Acknowledgment

This material is based in part on research sponsored by IARPA via the BETTER program (2019-19051600004).
