Commit: added metrics

graviraja committed Aug 7, 2020
1 parent 47e7e86 commit 61ae290
Showing 4 changed files with 18 additions and 1 deletion.

7 changes: 7 additions & 0 deletions README.md
@@ -470,6 +470,13 @@ Therefore, our sequence tagging model uses both

![ner](assets/images/applications/classification/char_bilstm_ner.png)

### Day 83: Evaluation metrics for NER tagging

Micro- and macro-averages (of any metric) compute slightly different things, so their interpretation differs. A macro-average computes the metric independently for each class and then takes the unweighted mean, treating all classes equally; a micro-average pools the contributions of all classes and computes a single aggregate metric. In a multi-class classification setup, the micro-average is preferable if you suspect class imbalance (i.e., many more examples of one class than of the others).
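A minimal, stdlib-only sketch of the two averages (the function name and the toy tags are illustrative, not taken from this repository):

```python
from collections import Counter

def micro_macro_f1(y_true, y_pred):
    """Compute micro- and macro-averaged F1 from parallel tag lists."""
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # p was predicted but is wrong
            fn[t] += 1  # t was the truth but was missed

    def f1(tp_, fp_, fn_):
        denom = 2 * tp_ + fp_ + fn_
        return 2 * tp_ / denom if denom else 0.0

    # Macro: per-class F1, then an unweighted mean (all classes count equally).
    macro = sum(f1(tp[c], fp[c], fn[c]) for c in labels) / len(labels)
    # Micro: pool the counts first (frequent classes dominate the result).
    micro = f1(sum(tp.values()), sum(fp.values()), sum(fn.values()))
    return micro, macro

# The frequent "O" tag dominates the micro score,
# while the single missed "PER" drags the macro score down.
micro, macro = micro_macro_f1(
    ["O", "O", "O", "O", "PER"],
    ["O", "O", "O", "O", "O"],
)
```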

![ner](assets/images/applications/classification/bilstm_crf_res.png)

![ner](assets/images/applications/classification/char_bilstm_crf_res.png)

Check out the code in the `applications/classification` folder

12 changes: 11 additions & 1 deletion applications/classification/ner_tagging/README.md
@@ -55,13 +55,20 @@ Since we're using CRFs, we're not so much predicting the right label at each wor

![ner](../../../assets/images/applications/classification/viterbi.png)
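The decoding step pictured above can be sketched in a few lines of NumPy; this is a generic Viterbi decoder under the usual BiLSTM-CRF scoring (per-token emission scores plus tag-to-tag transition scores), not the repository's exact implementation:

```python
import numpy as np

def viterbi_decode(emissions, transitions):
    """Return the highest-scoring tag sequence for one sentence.

    emissions:   (seq_len, num_tags) per-token tag scores from the BiLSTM.
    transitions: (num_tags, num_tags) CRF scores; transitions[i, j] is the
                 score of moving from tag i to tag j.
    """
    seq_len, num_tags = emissions.shape
    score = emissions[0].copy()  # best score of any path ending in each tag
    backpointers = []
    for t in range(1, seq_len):
        # score[i] + transitions[i, j] + emissions[t, j], maximised over i.
        total = score[:, None] + transitions + emissions[t][None, :]
        backpointers.append(total.argmax(axis=0))
        score = total.max(axis=0)
    # Walk the backpointers from the best final tag to recover the path.
    best_tag = int(score.argmax())
    path = [best_tag]
    for bp in reversed(backpointers):
        best_tag = int(bp[best_tag])
        path.append(best_tag)
    return path[::-1]
```

With a strongly negative transition score between two tags, the decoder prefers a globally consistent path even when per-token emissions disagree, which is exactly why CRF decoding beats independent per-word argmax.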

Results:

Micro- and macro-averages (of any metric) compute slightly different things, so their interpretation differs. A macro-average computes the metric independently for each class and then takes the unweighted mean, treating all classes equally; a micro-average pools the contributions of all classes and computes a single aggregate metric. In a multi-class classification setup, the micro-average is preferable if you suspect class imbalance (i.e., many more examples of one class than of the others).

![ner](../../../assets/images/applications/classification/bilstm_crf_res.png)
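For a quick sanity check of the two averages, scikit-learn's `f1_score` supports both directly (assuming scikit-learn is installed; the toy tags below are illustrative, not the model's actual output):

```python
from sklearn.metrics import f1_score

# Toy token-level tags; "O" dominates, as it does in real NER data.
y_true = ["O", "O", "O", "O", "O", "O", "PER", "LOC"]
y_pred = ["O", "O", "O", "O", "O", "LOC", "PER", "O"]

micro = f1_score(y_true, y_pred, average="micro")  # pooled over all tokens
macro = f1_score(y_true, y_pred, average="macro")  # unweighted mean over tags
# micro (0.75) exceeds macro (~0.61): the errors on the rare LOC tag
# barely dent the micro score but zero out one term of the macro mean.
```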

#### Resources

- [Medium Blog post on CRF (Must read)](https://towardsdatascience.com/implementing-a-linear-chain-conditional-random-field-crf-in-pytorch-16b0b9c4b4ea)
- [BiLSTM - CRF model paper](https://arxiv.org/pdf/1508.01991.pdf)
- [CRF Video Explanation](https://www.youtube.com/watch?v=GF3iSJkgPbA)
- [code reference](https://github.com/Gxzzz/BiLSTM-CRF)
- [Viterbi decoding](https://github.com/sgrvinod/a-PyTorch-Tutorial-to-Sequence-Labeling#viterbi-decoding)
- [Metric explanation](https://datascience.stackexchange.com/questions/15989/micro-average-vs-macro-average-performance-in-a-multiclass-classification-settin)

## NER tagging with Char-BiLSTM-CRF.ipynb

@@ -72,5 +79,8 @@ Therefore, our sequence tagging model uses both
- `word-level` information in the form of word embeddings.
- `character-level` information up to and including each word in both directions.

![ner](../../../assets/images/applications/classification/char_bilstm_ner.png)
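The two bullet points above can be sketched as a small PyTorch module that concatenates word embeddings with the final forward and backward states of a character-level BiLSTM (class name and sizes are illustrative, not the repository's actual ones):

```python
import torch
import torch.nn as nn

class CharWordEncoder(nn.Module):
    """Concatenate word embeddings with per-word char-BiLSTM features."""

    def __init__(self, word_vocab, char_vocab,
                 word_dim=100, char_dim=25, char_hidden=25):
        super().__init__()
        self.word_emb = nn.Embedding(word_vocab, word_dim)
        self.char_emb = nn.Embedding(char_vocab, char_dim)
        self.char_lstm = nn.LSTM(char_dim, char_hidden,
                                 bidirectional=True, batch_first=True)

    def forward(self, word_ids, char_ids):
        # word_ids: (seq_len,)  char_ids: (seq_len, max_word_len)
        words = self.word_emb(word_ids)        # (seq_len, word_dim)
        chars = self.char_emb(char_ids)        # (seq_len, max_word_len, char_dim)
        _, (h, _) = self.char_lstm(chars)      # h: (2, seq_len, char_hidden)
        # Final forward and backward hidden states summarise each word's
        # characters in both directions.
        char_feats = torch.cat([h[0], h[1]], dim=-1)        # (seq_len, 2*char_hidden)
        return torch.cat([words, char_feats], dim=-1)       # (seq_len, word_dim + 2*char_hidden)
```

The concatenated vectors then feed the word-level BiLSTM-CRF exactly as plain word embeddings would.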

Results:

![ner](../../../assets/images/applications/classification/char_bilstm_crf_res.png)
