forked from sebastianruder/NLP-progress
Adding evaluation metrics, references, related work to Entity Linking. Clarifying method description.
Johannes Hoffart committed Jul 26, 2018
1 parent 07fd929, commit 800738d
Showing 1 changed file with 29 additions and 9 deletions.
@@ -1,24 +1,44 @@
# Entity Linking

## Task

Entity Linking (EL) is the task of recognizing named entities in text (cf. [Named Entity Recognition](named_entity_recognition.md)) and disambiguating them (Named Entity Disambiguation) against a knowledge base (e.g. Wikidata, DBpedia, or YAGO). It is sometimes also simply known as Named Entity Recognition and Disambiguation.

EL approaches can be split into two classes:
* *End-to-End*: processes a piece of text to extract the entity mentions (i.e. Named Entity Recognition) and then disambiguates these mentions to the correct entry in a given knowledge base (e.g. Wikidata, DBpedia, YAGO).
* *Disambiguation-Only*: in contrast to the first approach, takes already annotated entity mentions as input and only has to disambiguate them to the correct entry in a given knowledge base.

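The difference between the two classes can be sketched in a few lines of Python. This is only an illustration under strong assumptions: the alias table `ALIASES`, the exact-match `recognize` function standing in for a real NER component, and the `"NIL"` label for unlinkable mentions are all hypothetical and not part of any system listed here.

```python
from typing import Dict, List, Tuple

Span = Tuple[int, int]  # character offsets [start, end)

# Hypothetical toy alias table: surface form -> knowledge base entry.
ALIASES: Dict[str, str] = {
    "Barack Obama": "https://en.wikipedia.org/wiki/Barack_Obama",
    "Hawaii": "https://en.wikipedia.org/wiki/Hawaii",
}


def disambiguate(text: str, mentions: List[Span]) -> List[Tuple[Span, str]]:
    """Disambiguation-Only: mention spans are given, only the link is predicted."""
    return [((s, e), ALIASES.get(text[s:e], "NIL")) for s, e in mentions]


def recognize(text: str) -> List[Span]:
    """Toy stand-in for NER: find every occurrence of a known alias."""
    spans = []
    for alias in ALIASES:
        start = text.find(alias)
        while start != -1:
            spans.append((start, start + len(alias)))
            start = text.find(alias, start + 1)
    return sorted(spans)


def link_end_to_end(text: str) -> List[Tuple[Span, str]]:
    """End-to-End: recognize the mentions first, then disambiguate them."""
    return disambiguate(text, recognize(text))


text = "Barack Obama was born in Hawaii"
end_to_end = link_end_to_end(text)
disambiguation_only = disambiguate(text, [(25, 31)])  # span supplied by the caller
```

In the end-to-end setting, errors made while recognizing mentions propagate to the linking step, whereas the disambiguation-only setting assumes the mention boundaries are correct.
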
Example:

| Barack | Obama | was | born | in | Hawaii |
| --- | --- | --- | --- | --- | --- |
| https://en.wikipedia.org/wiki/Barack_Obama | https://en.wikipedia.org/wiki/Barack_Obama | O | O | O | https://en.wikipedia.org/wiki/Hawaii |

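The token-level annotation in the example can be grouped into mention-level links. The following is a minimal sketch, assuming that `O` marks tokens outside any mention and that consecutive tokens with the same link form one mention; the function name `token_labels_to_mentions` is hypothetical.

```python
from itertools import groupby
from typing import List, Tuple

OBAMA = "https://en.wikipedia.org/wiki/Barack_Obama"
HAWAII = "https://en.wikipedia.org/wiki/Hawaii"

tokens = ["Barack", "Obama", "was", "born", "in", "Hawaii"]
labels = [OBAMA, OBAMA, "O", "O", "O", HAWAII]


def token_labels_to_mentions(tokens: List[str], labels: List[str]) -> List[Tuple[str, str]]:
    """Group runs of identical non-"O" labels into (mention text, link) pairs."""
    mentions = []
    start = 0
    for label, run in groupby(labels):
        length = len(list(run))
        if label != "O":
            mentions.append((" ".join(tokens[start:start + length]), label))
        start += length
    return mentions


mentions = token_labels_to_mentions(tokens, labels)
```

Because the example format has no begin/inside markers, two adjacent mentions of the same entity would be merged by this grouping; a real dataset format typically marks mention boundaries explicitly.
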
More details can be found in this [survey](http://dbgroup.cs.tsinghua.edu.cn/wangjy/papers/TKDE14-entitylinking.pdf).

## Evaluation

### Metrics for Disambiguation-Only Approach

* Micro-Precision: Fraction of correctly disambiguated named entities over the full corpus.
* Macro-Precision: Fraction of correctly disambiguated named entities computed per document, then averaged over all documents.

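The two metrics can be computed as in the following sketch. It assumes each document is represented as a list of `(gold, predicted)` link pairs, one pair per annotated mention; this representation and the function names are illustrative, not taken from any evaluation script.

```python
from typing import List, Tuple

Document = List[Tuple[str, str]]  # (gold link, predicted link) per mention


def micro_precision(docs: List[Document]) -> float:
    """Correct links divided by all mentions, pooled over the whole corpus."""
    correct = sum(gold == pred for doc in docs for gold, pred in doc)
    total = sum(len(doc) for doc in docs)
    return correct / total


def macro_precision(docs: List[Document]) -> float:
    """Per-document precision, averaged over documents with at least one mention."""
    per_doc = [
        sum(gold == pred for gold, pred in doc) / len(doc)
        for doc in docs
        if doc
    ]
    return sum(per_doc) / len(per_doc)


docs = [
    # 4 mentions, 3 linked correctly
    [("Q1", "Q1"), ("Q2", "Q2"), ("Q3", "Q3"), ("Q4", "Q9")],
    # 1 mention, linked incorrectly
    [("Q5", "Q8")],
]

micro = micro_precision(docs)  # 3 / 5
macro = macro_precision(docs)  # (3/4 + 0/1) / 2
```

The toy corpus shows why the two numbers can diverge: micro-precision is dominated by documents with many mentions, while macro-precision weights every document equally.
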
## Datasets

### AIDA CoNLL-YAGO Dataset

The [AIDA CoNLL-YAGO](https://www.mpi-inf.mpg.de/departments/databases-and-information-systems/research/yago-naga/aida/downloads/) Dataset [1] contains assignments of entities to the mentions of named entities annotated for the original [CoNLL 2003 NER task](http://www.aclweb.org/anthology/W03-0419.pdf) [2]. The entities are identified by [YAGO2](http://yago-knowledge.org/) entity identifier, by [Wikipedia URL](https://en.wikipedia.org/), or by [Freebase mid](http://wiki.freebase.com/wiki/Machine_ID).

| Approach | Micro-Precision | Macro-Precision | Paper / Source |
| ------------- | :-----:| :-----:| --- |
| Radhakrishnan et al. (2018) | 93.0 | 93.7 | [ELDEN: Improved Entity Linking using Densified Knowledge Graphs](http://aclweb.org/anthology/N18-1167) |
| Le et al. (2018) | 93.07 | - | [Improving Entity Linking by Modeling Latent Relations between Mentions](https://arxiv.org/abs/1804.10637) |
| Hoffart et al. (2011) | 82.29 | 82.02 | [Robust Disambiguation of Named Entities in Text](http://www.aclweb.org/anthology/D11-1072) |

## References

[1] Johannes Hoffart, Mohamed Amir Yosef, Ilaria Bordino, Hagen Fürstenau, Manfred Pinkal, Marc Spaniol, Bilyana Taneva, Stefan Thater, and Gerhard Weikum. Robust Disambiguation of Named Entities in Text. In Proceedings of the 2011 Conference on Empirical Methods in Natural Language Processing, EMNLP 2011, Edinburgh, Scotland, pages 782–792, 2011.

[2] Erik F. Tjong Kim Sang and Fien De Meulder. Introduction to the CoNLL-2003 Shared Task: Language-Independent Named Entity Recognition. In Proceedings of the 7th Conference on Natural Language Learning, CoNLL 2003, Edmonton, Canada, 2003.

[Go back to the README](README.md)