CRF-BiLSTM NER

We use Python 3.6 and TensorFlow 1.12.0 to implement a BiLSTM+CRF network and apply it to sequence tagging of Chinese characters.

(1) Raw data preprocessing. Call the vocab_build() function to convert the data from .txt format to .pkl format, and initialize the word vectors (the vector dimension is set to 300).
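The preprocessing step can be sketched roughly as follows. This is a minimal illustration, not the repository's vocab_build(): the function names build_vocab_sketch and init_embeddings, the "character tag" per-line input layout, the <PAD>/<UNK> entries, and the uniform random initialization are all assumptions made for the example.

```python
import pickle
import random
from collections import Counter


def build_vocab_sketch(txt_path, pkl_path, min_count=1):
    """Build a character-to-id mapping from a text corpus and pickle it.

    Assumed input layout (hypothetical): one "char tag" pair per line,
    with blank lines separating sentences.
    """
    counts = Counter()
    with open(txt_path, encoding="utf-8") as f:
        for line in f:
            parts = line.split()
            if parts:  # skip blank sentence separators
                counts[parts[0]] += 1

    # Reserve id 0 for padding and id 1 for unknown characters (assumption).
    char2id = {"<PAD>": 0, "<UNK>": 1}
    for char, n in counts.most_common():
        if n >= min_count:
            char2id[char] = len(char2id)

    with open(pkl_path, "wb") as f:
        pickle.dump(char2id, f)
    return char2id


def init_embeddings(vocab_size, dim=300, scale=0.25, seed=0):
    """Return a vocab_size x dim table of small random vectors."""
    rng = random.Random(seed)
    return [[rng.uniform(-scale, scale) for _ in range(dim)]
            for _ in range(vocab_size)]
```

In a real pipeline the embedding table would more likely be a NumPy array or be loaded from pretrained vectors; plain lists are used here only to keep the sketch dependency-free.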

(2) Design the network structure and hyper-parameters, then run main.py to train and test the model.

batch_size=64;
epoch=25;
learning_rate=0.001;
dropout=0.5;
gradient_clipping=5.0;
LSTM_num(forward)=300;
LSTM_num(backward)=300;
optimizer=Adam;
...
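One way to keep the settings listed above together is a small configuration object. The sketch below is only an illustration: the class name TrainConfig, its field names, and the helper steps_per_epoch are hypothetical and are not taken from main.py; only the values come from the list above.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainConfig:
    """Hyper-parameters from the list above (field names are hypothetical)."""
    batch_size: int = 64
    epochs: int = 25
    learning_rate: float = 0.001
    dropout: float = 0.5
    gradient_clipping: float = 5.0
    lstm_units_forward: int = 300
    lstm_units_backward: int = 300
    optimizer: str = "Adam"


def steps_per_epoch(num_sentences, batch_size):
    """Number of mini-batches needed to cover the training set once."""
    return math.ceil(num_sentences / batch_size)


if __name__ == "__main__":
    cfg = TrainConfig()
    # 10000 is an arbitrary example size, not the size of the real data set.
    print(cfg)
    print(steps_per_epoch(10000, cfg.batch_size))
```

Freezing the dataclass keeps the values from being changed by accident during a run.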

After 25 epochs, the precision, recall, and F1 values on the test set are shown below.

Fig. precision, recall, F1 values

precision=0.951266;
recall=0.908613;
f1=0.929270. (After 25 epochs)
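For readers unfamiliar with how such scores are usually obtained for NER, here is a minimal sketch of entity-level precision, recall, and F1. It assumes BIO-style tags (for example B-LOC, I-LOC, O) and exact span matching; the README does not state which tagging scheme or evaluation script produced the numbers above, so the function names and the scheme are assumptions.

```python
def extract_entities(tags):
    """Return a set of (start, end, type) spans from a BIO tag sequence."""
    entities = set()
    start, etype = None, None
    # A trailing "O" acts as a sentinel that closes any open span.
    for i, tag in enumerate(list(tags) + ["O"]):
        begins = tag.startswith("B-")
        continues = tag.startswith("I-") and etype == tag[2:]
        if start is not None and not continues:
            entities.add((start, i, etype))
            start, etype = None, None
        if begins:
            start, etype = i, tag[2:]
    return entities


def precision_recall_f1(gold_seqs, pred_seqs):
    """Entity-level scores: a prediction counts only if span and type match."""
    tp = n_pred = n_gold = 0
    for gold, pred in zip(gold_seqs, pred_seqs):
        g, p = extract_entities(gold), extract_entities(pred)
        tp += len(g & p)
        n_pred += len(p)
        n_gold += len(g)
    precision = tp / n_pred if n_pred else 0.0
    recall = tp / n_gold if n_gold else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1
```

F1 here is the harmonic mean of precision and recall, so it is pulled toward the lower of the two.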

To learn more about the LSTM+CRF model, please read sequence labelling.md or click here.
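As a brief illustration of what the CRF layer adds on top of the BiLSTM, the sketch below decodes the best tag sequence with the Viterbi algorithm. It is a simplified, dependency-free version written for this README: the function name viterbi_decode and the input layout are assumptions, start and stop transitions are omitted, and it is not the repository's code. TensorFlow 1.x ships comparable utilities under tf.contrib.crf.

```python
def viterbi_decode(emissions, transitions):
    """Find the highest-scoring tag path.

    emissions:   T x K list, emissions[t][k] is the BiLSTM score of tag k at step t.
    transitions: K x K list, transitions[i][j] is the score of moving from tag i to tag j.
    Returns (best_path, best_score).
    """
    num_tags = len(transitions)
    score = list(emissions[0])  # best score of any path ending in each tag
    backpointers = []
    for emit in emissions[1:]:
        new_score, pointers = [], []
        for j in range(num_tags):
            # Best previous tag for landing on tag j at this step.
            best_i = max(range(num_tags),
                         key=lambda i: score[i] + transitions[i][j])
            new_score.append(score[best_i] + transitions[best_i][j] + emit[j])
            pointers.append(best_i)
        score = new_score
        backpointers.append(pointers)

    # Follow the back-pointers from the best final tag.
    best_last = max(range(num_tags), key=lambda j: score[j])
    path = [best_last]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return path, score[best_last]
```

The transition scores are what let the model penalize invalid tag sequences, such as an I- tag directly after O, which a per-character softmax cannot do.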

PrideLee/CRF-bi-LSTM-sequence-tagging-Chinese-characters-
