Sorted tables, moved temporal IE and timex to temporal processing
sebastianruder committed Jul 11, 2018
1 parent dcaef6a commit c07ab8c
Showing 8 changed files with 116 additions and 129 deletions.
8 changes: 3 additions & 5 deletions README.md
@@ -10,6 +10,7 @@
- [Dialog](dialog.md)
- [Domain adaptation](domain_adaptation.md)
- [Entity Linking](entity_linking.md)
- [Information Extraction](information_extraction.md)
- [Language modelling](language_modeling.md)
- [Machine translation](machine_translation.md)
- [Multi-task learning](multi-task_learning.md)
@@ -22,13 +23,10 @@
- [Sentiment analysis](sentiment_analysis.md)
- [Semantic parsing](semantic_parsing.md)
- [Semantic role labeling](semantic_role_labeling.md)
- [Stance detection](stance_detection.md)
- [Summarization](summarization.md)
- [Temporal Processing](temporal_processing.md)
- [Text classification](text_classification.md)

This document aims to track the progress in Natural Language Processing (NLP) and give an overview
of the state-of-the-art across the most common NLP tasks and their corresponding datasets.
7 changes: 4 additions & 3 deletions information_extraction.md
@@ -1,22 +1,23 @@
# Information Extraction

## Open Knowledge Graph Canonicalization
### Problem

Open Information Extraction approaches lead to the creation of large knowledge bases (KBs) from the web. The problem with such methods is that their entities and relations are not canonicalized, which leads to the storage of redundant and ambiguous facts. For example, an Open KB storing *\<Barack Obama, was born in, Honolulu\>* and *\<Obama, took birth in, Honolulu\>* does not know that *Barack Obama* and *Obama* refer to the same entity. Similarly, *took birth in* and *was born in* denote the same relation. The problem of Open KB canonicalization involves identifying groups of equivalent entities and relations in the KB.
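
As a toy illustration (not any published system's method), noun phrases can be grouped by a crude shared-head-token heuristic so that *Barack Obama* and *Obama* fall into one cluster; systems such as CESI instead use embeddings and side information:

```python
# Toy sketch of noun phrase canonicalization using a string heuristic.
# Real systems (e.g. CESI) use embeddings and side information instead.

def head_token(np):
    # crude head heuristic: last token, lowercased
    return np.lower().split()[-1]

def cluster_noun_phrases(nps):
    # group noun phrases that share a head token
    clusters = {}
    for np in nps:
        clusters.setdefault(head_token(np), set()).add(np)
    return list(clusters.values())

triples = [
    ("Barack Obama", "was born in", "Honolulu"),
    ("Obama", "took birth in", "Honolulu"),
]
subjects = [s for s, _, _ in triples]
print(cluster_noun_phrases(subjects))  # [{'Barack Obama', 'Obama'}]
```

A real canonicalizer must also cluster relation phrases (*was born in* / *took birth in*), for which this surface heuristic would fail; that is where learned representations come in.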

### Datasets

| Datasets | # Gold Entities | # NPs | # Relations | # Triples |
| ---------------------------------------- | :-------------: | ----- | ---------- | -------- |
| [Base](https://suchanek.name/work/publications/cikm2014.pdf) | 150 | 290 | 3K | 9K |
| [Ambiguous](https://suchanek.name/work/publications/cikm2014.pdf) | 446 | 717 | 11K | 37K |
| [ReVerb45K](https://github.com/malllabiisc/cesi) | 7.5K | 15.5K | 22K | 45K |

### Noun Phrase Canonicalization

| **Model** | | Base Dataset | | | Ambiguous dataset | | | ReVerb45k | | **Paper**/Source |
| :---------------------------- | :-----------: | :----------: | :----: | :-----------: | :---------------: | ------ | :-----------: | :--------: | :----: | ---------------------------------------- |
| | **Precision** | **Recall** | **F1** | **Precision** | **Recall** | **F1** | **Precision** | **Recall** | **F1** | |
| CESI (Vashishth et al., 2018) | 98.2 | 99.8 | 99.9 | 66.2 | 92.4 | 91.9 | 62.7 | 84.4 | 81.9 | [CESI: Canonicalizing Open Knowledge Bases using Embeddings and Side Information](https://github.com/malllabiisc/cesi) |
| Galárraga et al., 2014 (IDF) | 94.8 | 97.9 | 98.3 | 67.9 | 82.9 | 79.3 | 71.6 | 50.8 | 0.5 | [Canonicalizing Open Knowledge Bases](https://suchanek.name/work/publications/cikm2014.pdf) |

[Go back to the README](README.md)
7 changes: 1 addition & 6 deletions named_entity_recognition.md
@@ -30,8 +30,6 @@ The [WNUT 2017 Emerging Entities task](http://aclweb.org/anthology/W17-4418) opened up the task of NER to novel and emerging entities in noisy
text and focuses on generalisation beyond memorisation in high-variance environments. Scores are given both over
entity chunk instances, and unique entity surface forms, to normalise the biasing impact of entities that occur frequently.

#### Dataset

| Feature | Train | Dev | Test |
| --- | --- | --- | --- |
| Posts | 3,395 | 1,009 | 1,287 |
@@ -42,12 +40,9 @@ The data is annotated for six classes - person, location, group, creative work,

Links: [WNUT 2017 Emerging Entity task page](https://noisy-text.github.io/2017/emerging-rare-entities.html) (including direct download links for data and scoring script)

#### State-of-the-art

| Model | F1 | F1 (surface form) | Paper / Source |
| --- | --- | --- | --- |
| Aguilar et al. (2018) | 45.55 | | [Modeling Noisiness to Recognize Named Entities using Multitask Neural Networks on Social Media](http://aclweb.org/anthology/N18-1127.pdf) |
| SpinningBytes | 40.78 | 39.33 | [Transfer Learning and Sentence Level Features for Named Entity Recognition on Tweets](http://aclweb.org/anthology/W17-4422.pdf) |

[Go back to the README](README.md)
26 changes: 14 additions & 12 deletions part-of-speech_tagging.md
@@ -10,18 +10,6 @@ Example:
| --- | ---| --- | --- | --- |
| NNP | , | CD | NNS | JJ |

### Penn Treebank

A standard dataset for POS tagging is the Wall Street Journal (WSJ) portion of the Penn Treebank, containing 45
@@ -39,6 +27,7 @@ different POS tags. Sections 0-18 are used for training, sections 19-21 for deve
| Bi-LSTM (Ling et al., 2017) | 97.36 | [Finding Function in Form: Compositional Character Models for Open Vocabulary Word Representation](https://www.aclweb.org/anthology/D/D15/D15-1176.pdf) | |
| Bi-LSTM (Plank et al., 2016) | 97.22 | [Multilingual Part-of-Speech Tagging with Bidirectional Long Short-Term Memory Models and Auxiliary Loss](https://arxiv.org/abs/1604.05529) |


### Social media

The [Ritter (2011)](https://aclanthology.coli.uni-saarland.de/papers/D11-1141/d11-1141) dataset has become the benchmark for social media part-of-speech tagging. This is comprised of some 50K tokens of English social media sampled in late 2011, and is tagged using an extended version of the PTB tagset.
@@ -48,4 +37,17 @@ The [Ritter (2011)](https://aclanthology.coli.uni-saarland.de/papers/D11-1141/d1
| GATE | 88.69 | [Twitter Part-of-Speech Tagging for All: Overcoming Sparse and Noisy Data](https://aclanthology.coli.uni-saarland.de/papers/R13-1026/r13-1026) |
| CMU | 90.0 ± 0.5 | [Improved Part-of-Speech Tagging for Online Conversational Text with Word Clusters](http://www.cs.cmu.edu/~ark/TweetNLP/owoputi+etal.naacl13.pdf) |


### UD

[Universal Dependencies (UD)](http://universaldependencies.org/) is a framework for
cross-linguistic grammatical annotation, which contains more than 100 treebanks in over 60 languages.
Models are typically evaluated based on the average test accuracy across 28 languages.

| Model | Avg accuracy | Paper / Source |
| ------------- | :-----:| --- |
| Adversarial Bi-LSTM (Yasunaga et al., 2018) | 96.73 | [Robust Multilingual Part-of-Speech Tagging via Adversarial Training](https://arxiv.org/abs/1711.04903) |
| Bi-LSTM (Plank et al., 2016) | 96.40 | [Multilingual Part-of-Speech Tagging with Bidirectional Long Short-Term Memory Models and Auxiliary Loss](https://arxiv.org/abs/1604.05529) |
| Joint Bi-LSTM (Nguyen et al., 2017) | 95.55 | [A Novel Neural Network Model for Joint POS Tagging and Graph-based Dependency Parsing](https://arxiv.org/abs/1705.05952) |

[Go back to the README](README.md)
4 changes: 1 addition & 3 deletions stance_detection.md
@@ -20,6 +20,4 @@ This dataset subsumes the large [PHEME collection of rumors and stance](http://j
| Bahuleyan and Vechtomova 2017| 0.780 | [UWaterloo at SemEval-2017 Task 8: Detecting Stance towards Rumours with Topic Independent Features](http://www.aclweb.org/anthology/S/S17/S17-2080.pdf) |

[Go back to the README](README.md)
96 changes: 93 additions & 3 deletions temporal_processing.md
@@ -1,25 +1,115 @@
# Temporal Processing

## Document Dating (Time-stamping)
### Problem

Document Dating is the problem of automatically predicting the date of a document from its content. The date of a document, also referred to as the Document Creation Time (DCT), is at the core of many important tasks, such as information retrieval, temporal reasoning, text summarization, event detection, and the analysis of historical text.

For example, in the following document the correct creation year is 1999, which can be inferred from the terms *1995* and *Four years after*.

*Swiss adopted that form of taxation in 1995. The concession was approved by the govt last September. Four years after, the IOC….*
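
A minimal sketch of this inference (a naive heuristic baseline, not the approach of any system in the table below): extract explicit year mentions and shift the latest one by simple offset phrases such as *Four years after*:

```python
import re

def guess_creation_year(text):
    # naive document-dating baseline: latest explicit year mention,
    # shifted by a simple "<number-word> years after" offset phrase
    years = [int(y) for y in re.findall(r"\b(19\d{2}|20\d{2})\b", text)]
    if not years:
        return None
    guess = max(years)
    m = re.search(r"\b(\w+) years? after\b", text, re.IGNORECASE)
    if m:
        words = {"one": 1, "two": 2, "three": 3, "four": 4, "five": 5}
        guess += words.get(m.group(1).lower(), 0)
    return guess

doc = ("Swiss adopted that form of taxation in 1995. The concession was "
       "approved by the govt last September. Four years after, the IOC...")
print(guess_creation_year(doc))  # 1999
```

Systems like NeuralDater model exactly these interactions between time expressions and events, but with graph convolutions over the document's syntactic and temporal structure rather than hand-written rules.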

### Datasets

| Datasets | # Docs | Start Year | End Year |
| :--------------------------------------: | :----: | :--------: | :------: |
| [APW](https://drive.google.com/file/d/1tll04ZBooB3Mohm6It-v8MBcjMCC3Y1w/view) | 675k | 1995 | 2010 |
| [NYT](https://drive.google.com/file/d/1wqQRFeA1ESAOJqrwUNakfa77n_S9cmBi/view?usp=sharing) | 647k | 1987 | 1996 |

### Comparison on year-level granularity

| | APW Dataset | NYT Dataset | Paper/Source |
| -------------------------------------- | :---------: | :---------: | ---------------------------------------- |
| NeuralDater (Vashishth et al., 2018) | 64.1 | 58.9 | [Document Dating using Graph Convolution Networks](https://github.com/malllabiisc/NeuralDater) |
| Chambers (2012) | 52.5 | 42.3 | [Labeling Documents with Timestamps: Learning from their Time Expressions](https://pdfs.semanticscholar.org/87af/a0cb4f829ce861da0c721ca666d48a62c404.pdf) |
| BurstySimDater (Kotsakos et al., 2014) | 45.9 | 38.5 | [A Burstiness-aware Approach for Document Dating](https://www.idi.ntnu.no/~noervaag/papers/SIGIR2014short.pdf) |


## Temporal Information Extraction

Temporal information extraction is the identification of chunks/tokens corresponding to temporal intervals, and the extraction and determination of the temporal relations between them. The entities extracted may be temporal expressions (timexes), eventualities (events), or auxiliary signals that support the interpretation of an entity or relation. Relations may be temporal links (tlinks), which describe the order of events and times; subordinate links (slinks), which describe modality and other subordinative activity; or aspectual links (alinks), which capture the influence of aspectuality on event structure.

The markup scheme used for temporal information extraction is well described in the ISO-TimeML standard, and also on [www.timeml.org](http://www.timeml.org).

```xml
<?xml version="1.0" ?>
<TimeML xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:noNamespaceSchemaLocation="http://timeml.org/timeMLdocs/TimeML_1.2.1.xsd">
<TEXT>
PRI20001020.2000.0127
NEWS STORY
<TIMEX3 tid="t0" type="TIME" value="2000-10-20T20:02:07.85">10/20/2000 20:02:07.85</TIMEX3>
The Navy has changed its account of the attack on the USS Cole in Yemen.
Officials <TIMEX3 tid="t1" type="DATE" value="PRESENT_REF" temporalFunction="true" anchorTimeID="t0">now</TIMEX3> say the ship was hit <TIMEX3 tid="t2" type="DURATION" value="PT2H">nearly two hours </TIMEX3>after it had docked.
Initially the Navy said the explosion occurred while several boats were helping
the ship to tie up. The change raises new questions about how the attackers
were able to get past the Navy security.
<TIMEX3 tid="t3" type="TIME" value="2000-10-20T20:02:28.05">10/20/2000 20:02:28.05</TIMEX3>
<TLINK timeID="t2" relatedToTime="t0" relType="BEFORE"/>
</TEXT>
</TimeML>
```
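
For illustration, a TimeML fragment like the one above can be read with a standard XML parser to recover the annotated timexes and tlinks. This is a minimal sketch over a trimmed-down document, not a full TimeML reader:

```python
import xml.etree.ElementTree as ET

# Trimmed-down TimeML fragment for demonstration purposes
timeml = """<TimeML>
<TEXT>
Officials <TIMEX3 tid="t1" type="DATE" value="PRESENT_REF">now</TIMEX3> say
the ship was hit <TIMEX3 tid="t2" type="DURATION" value="PT2H">nearly two
hours</TIMEX3> after it had docked.
<TLINK timeID="t2" relatedToTime="t0" relType="BEFORE"/>
</TEXT>
</TimeML>"""

root = ET.fromstring(timeml)
# collect TIMEX3 annotations keyed by their tid
timexes = {t.get("tid"): (t.get("type"), t.get("value"), t.text)
           for t in root.iter("TIMEX3")}
# collect temporal links as (source, target, relation) triples
tlinks = [(l.get("timeID"), l.get("relatedToTime"), l.get("relType"))
          for l in root.iter("TLINK")]
print(timexes["t1"])  # ('DATE', 'PRESENT_REF', 'now')
print(tlinks)         # [('t2', 't0', 'BEFORE')]
```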

To avoid leaking knowledge about temporal structure, train, dev and test splits must be made at document level for temporal information extraction.
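
One simple way to obtain such a document-level split (an illustrative sketch, not a prescribed procedure) is to assign each document to a partition by hashing its identifier, so that all annotations from one document stay together:

```python
import hashlib

def doc_split(doc_id, dev_frac=0.1, test_frac=0.1):
    # deterministic document-level split: hash the document id so every
    # annotation from the same document lands in the same partition,
    # preventing temporal-structure leakage across splits
    h = int(hashlib.md5(doc_id.encode()).hexdigest(), 16) % 100
    if h < test_frac * 100:
        return "test"
    if h < (test_frac + dev_frac) * 100:
        return "dev"
    return "train"

print(doc_split("APW19981005.0001"))  # stable across runs for a given id
```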

### TimeBank

TimeBank, based on the TIMEX3 standard embedded in ISO-TimeML, is a benchmark corpus containing 64K tokens of English newswire, annotated for all aspects of ISO-TimeML, including temporal expressions. TimeBank is freely distributed by the LDC: [TimeBank 1.2](https://catalog.ldc.upenn.edu/LDC2006T08)

Evaluation is for both entity chunking and attribute annotation, as well as temporal relation accuracy, typically measured with F1 -- although this metric is not sensitive to inconsistencies or free wins from interval logic induction over the whole set.
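
The "free wins" arise from interval logic: relations derivable by transitivity from other predictions can count as correct without the system contributing new information. A small sketch of computing the transitive closure of BEFORE relations:

```python
def closure(pairs):
    # transitive closure of BEFORE relations: if a BEFORE b and
    # b BEFORE c, then a BEFORE c is derivable "for free"
    rels = set(pairs)
    changed = True
    while changed:
        changed = False
        for a, b in list(rels):
            for c, d in list(rels):
                if b == c and (a, d) not in rels:
                    rels.add((a, d))
                    changed = True
    return rels

print(closure({("e1", "e2"), ("e2", "e3")}))
# adds ("e1", "e3") by transitivity
```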

| Model | F1 score | Paper / Source |
| ------------- | :-----:| --- |
| Catena | 0.511 | [CATENA: CAusal and TEmporal relation extraction from NAtural language texts](http://www.aclweb.org/anthology/C16-1007) |
| CAEVO | 0.507 | [Dense Event Ordering with a Multi-Pass Architecture](https://www.transacl.org/ojs/index.php/tacl/article/download/255/50) |

### TempEval-3

The TempEval-3 corpus accompanied the shared [TempEval-3](http://www.aclweb.org/anthology/S13-2001) SemEval task in 2013. This uses a timelines-based metric to assess temporal relation structure. The corpus is fresh and somewhat more varied than TimeBank, though markedly smaller. [TempEval-3 data](https://www.cs.york.ac.uk/semeval-2013/task1/index.php%3Fid=data.html)

| Model | Temporal awareness | Paper / Source |
| ------------- | :-----:| --- |
| Ning et al. | 67.2 | [A Structured Learning Approach to Temporal Relation Extraction](http://www.aclweb.org/anthology/D17-1108) |
| ClearTK | 30.98 | [Cleartk-timeml: A minimalist approach to tempeval 2013](http://www.aclweb.org/anthology/S13-2002) |

## Timex normalisation

Temporal expression normalisation is the grounding of a lexicalisation of a time to a calendar date or other formal temporal representation.

Example:

```xml
<TIMEX3 tid="t0" type="TIME" value="2000-10-18T21:01:00.65">10/18/2000 21:01:00.65</TIMEX3>
Dozens of Palestinians were wounded in
scattered clashes in the West Bank and Gaza Strip, <TIMEX3 tid="t1" type="DATE" value="2000-10-18" temporalFunction="true" anchorTimeID="t0">Wednesday</TIMEX3>,
despite the Sharm el-Sheikh truce accord.

Chuck Rich reports on entertainment <TIMEX3 tid="t11" type="SET" value="XXXX-WXX-7">every Saturday</TIMEX3>
```
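
As an illustrative sketch of rule-based grounding (in the spirit of, but not taken from, taggers like HeidelTime): a bare weekday name can be resolved to the nearest such day on or before the anchor date (the DCT):

```python
from datetime import date, timedelta

WEEKDAYS = ["monday", "tuesday", "wednesday", "thursday",
            "friday", "saturday", "sunday"]

def normalise_weekday(word, anchor):
    # toy normalisation rule: ground a bare weekday name to the
    # nearest such day on or before the anchor (DCT)
    target = WEEKDAYS.index(word.lower())
    delta = (anchor.weekday() - target) % 7
    return anchor - timedelta(days=delta)

anchor = date(2000, 10, 18)  # the anchor TIMEX3 t0 in the example above
print(normalise_weekday("Wednesday", anchor).isoformat())  # 2000-10-18
```

This matches the example: *Wednesday* anchored to 2000-10-18 (itself a Wednesday) normalises to `2000-10-18`. Real normalisers need many more rules (future vs. past reference, sets like *every Saturday*, underspecified values such as `XXXX-WXX-7`).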

### TimeBank

TimeBank, based on the TIMEX3 standard embedded in ISO-TimeML, is a benchmark corpus containing 64K tokens of English newswire, annotated for all aspects of ISO-TimeML, including temporal expressions. TimeBank is freely distributed by the LDC: [TimeBank 1.2](https://catalog.ldc.upenn.edu/LDC2006T08)

| Model | F1 score | Paper / Source |
| ------------- | :-----:| --- |
| TIMEN | 0.89 | [TIMEN: An Open Temporal Expression Normalisation Resource](http://aclweb.org/anthology/L12-1015) |
| HeidelTime | 0.876 | [A baseline temporal tagger for all languages](http://aclweb.org/anthology/D15-1063) |

### PNT

The [Parsing Time Normalizations corpus](https://github.com/bethard/anafora-annotations/releases) in [SCATE](http://www.lrec-conf.org/proceedings/lrec2016/pdf/288_Paper.pdf) format allows the representation of a wider variety of time expressions than previous approaches. This corpus was released with [SemEval 2018 Task 6](http://aclweb.org/anthology/S18-1011).

| Model | F1 score | Paper / Source |
| ------------- | :-----:| --- |
| Laparra et al. 2018 | 0.764 | [From Characters to Time Intervals: New Paradigms for Evaluation and Neural Parsing of Time Normalizations](http://aclweb.org/anthology/Q18-1025) |
| HeidelTime | 0.74 | [A baseline temporal tagger for all languages](http://aclweb.org/anthology/D15-1063) |
| Chrono | 0.70 | [Chrono at SemEval-2018 task 6: A system for normalizing temporal expressions](http://aclweb.org/anthology/S18-1012) |


[Go back to the README](README.md)