binayachaudari/Nepali-Tamang-MT-Data

Published Paper

Efforts Towards Developing a Tamang Nepali Machine Translation System (Chaudhary et al., ICON 2020)

Citation

Please use the following BibTeX entry to cite this repository:

@inproceedings{chaudhary-etal-2020-efforts,
    title = "Efforts Towards Developing a {T}amang {N}epali Machine Translation System",
    author = "Chaudhary, Binaya Kumar  and
      Bal, Bal Krishna  and
      Baidar, Rasil",
    editor = "Bhattacharyya, Pushpak  and
      Sharma, Dipti Misra  and
      Sangal, Rajeev",
    booktitle = "Proceedings of the 17th International Conference on Natural Language Processing (ICON)",
    month = dec,
    year = "2020",
    address = "Indian Institute of Technology Patna, Patna, India",
    publisher = "NLP Association of India (NLPAI)",
    url = "https://aclanthology.org/2020.icon-main.37/",
    pages = "281--286",
    abstract = "The Tamang language is spoken mainly in Nepal, Sikkim, West Bengal, some parts of Assam, and the North East region of India. As per the 2011 census conducted by the Nepal Government, there are about 1.35 million Tamang speakers in Nepal itself. In this regard, a Machine Translation System for Tamang-Nepali language pair is significant both from research and practical outcomes in terms of enabling communication between the Tamang and the Nepali communities. In this work, we train the Transformer Neural Machine Translation (NMT) architecture with attention using a small hand-labeled or aligned Tamang-Nepali corpus (15K sentence pairs). Our preliminary results show BLEU scores of 27.74 for the Nepali{\textrightarrow}Tamang direction and 23.74 in the Tamang{\textrightarrow}Nepali direction. We are currently working on increasing the datasets as well as improving the model to obtain better BLEU scores. We also plan to extend the work to add the English language to the model, thus making it a trilingual Machine Translation System for Tamang-Nepali-English languages."
}

Please also acknowledge the Information and Language Processing Research Lab.
