KM-BERT: A Pre-trained BERT for Korean Medical Natural Language Processing
KM-BERT was trained on a Korean medical corpus collected from three types of text: medical textbooks, health information news, and medical research articles.
Please visit the original BERT repo (Devlin et al.) for more information about pre-trained BERT.
Also see the repos of KR-BERT (Lee et al.) and KoBERT (SKTBrain), which are pre-trained BERT models for the Korean language.
This repo includes two types of models, each provided as both a tar and a zip archive.
KM-BERT.tar: Korean Medical BERT (tar archive)
KM-BERT.zip: Korean Medical BERT (zip archive)
KM-BERT-vocab.tar: Korean Medical BERT with additional medical vocabulary (tar archive)
KM-BERT-vocab.zip: Korean Medical BERT with additional medical vocabulary (zip archive)
##############################################################
Tip: Open the link in a new window, then refresh the page.
##############################################################
Each archive contains config.json, vocab.txt, and model.bin.
Requirements:
python 3.6
pytorch 1.2.0
pytorch-pretrained-bert 0.6.2
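
The released files can be loaded directly with pytorch-pretrained-bert. The following is a minimal sketch; the extraction path ./kmbert and the exact key layout inside model.bin are assumptions, so adjust them to your setup:

import torch
from pytorch_pretrained_bert import BertConfig, BertModel, BertTokenizer

MODEL_DIR = './kmbert'  # assumed path where the downloaded archive was extracted

# Build the model from the released config, then load the released weights.
config = BertConfig.from_json_file(MODEL_DIR + '/config.json')
model = BertModel(config)
state_dict = torch.load(MODEL_DIR + '/model.bin', map_location='cpu')
# If the checkpoint keys carry a prefix (e.g. 'bert.'), strip it before loading.
model.load_state_dict(state_dict)
model.eval()

# Korean text should not be lower-cased, hence do_lower_case=False.
tokenizer = BertTokenizer(MODEL_DIR + '/vocab.txt', do_lower_case=False)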
Example:
python KMBERT_medsts.py --pretrained kmbert #MedSTS with KM-BERT
python KMBERT_medsts.py --pretrained kmbert_vocab #MedSTS with KM-BERT-vocab
python KMBERT_ner.py --pretrained kmbert #NER with KM-BERT
python KMBERT_ner.py --pretrained kmbert_vocab #NER with KM-BERT-vocab
Arguments:
--pretrained    Pre-trained model to use: kmbert or kmbert_vocab
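
Once loaded, the model can also be used directly for feature extraction. The sketch below continues from the loading example above; the sentence and variable names are illustrative only:

text = '[CLS] 환자는 고혈압 병력이 있다 . [SEP]'  # "The patient has a history of hypertension."
tokens = tokenizer.tokenize(text)
input_ids = torch.tensor([tokenizer.convert_tokens_to_ids(tokens)])

with torch.no_grad():
    # pytorch-pretrained-bert returns (all encoder layers, pooled [CLS] output)
    encoded_layers, pooled_output = model(input_ids)

print(pooled_output.shape)  # (1, hidden_size)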
Citation:
@article{KMBERT,
title={KM-BERT: A Pre-trained BERT for Korean Medical Natural Language Processing},
author={TBD},
year={TBD},
journal={TBD},
volume={TBD}
}