Add NarrativeQA results for entire books/scripts mode (sebastianruder#598)
urikz authored Jan 5, 2022
1 parent 4957110 commit 33e6800
Showing 1 changed file with 10 additions and 2 deletions: english/question_answering.md
@@ -283,15 +283,23 @@


### NarrativeQA
[NarrativeQA](https://arxiv.org/abs/1712.07040) is a dataset built to encourage deeper comprehension of language: answering its questions requires reasoning over entire books or movie scripts. The dataset contains approximately 45K question-answer pairs with free-form text answers. It has two modes: (1) reading comprehension over summaries and (2) reading comprehension over entire books/scripts.
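
For readers who want to inspect the data, below is a minimal sketch of loading NarrativeQA with the Hugging Face `datasets` library. The dataset id `deepmind/narrativeqa` and the nested field names are assumptions about that hosted copy, not part of the official release, so adjust them to whichever distribution you use.

```python
# Minimal sketch: inspecting NarrativeQA via the Hugging Face `datasets` hub.
# The dataset id "deepmind/narrativeqa" and the nested field names below are
# assumptions about that hosted copy, not the official DeepMind release layout.
from datasets import load_dataset

narrativeqa = load_dataset("deepmind/narrativeqa", split="validation")

example = narrativeqa[0]
print(example["question"]["text"])                    # free-form question
print(example["document"]["summary"]["text"][:300])   # context for the summary mode
# example["document"]["text"] holds the full book/script used in the second mode
print([answer["text"] for answer in example["answers"]])  # reference answers
```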

The results for the first mode (reading comprehension over summaries) are below.

| Model | BLEU-1 | BLEU-4 | METEOR | ROUGE-L | Paper / Source | Code |
| ------------- | :-----: | :-----:|:-----:| :-----:|--- | --- |
|DecaProp (Tay et al., 2018) |44.35 |27.61 | 21.80 | 44.69 |[Densely Connected Attention Propagation for Reading Comprehension](https://arxiv.org/abs/1811.04210) | [official](https://github.com/vanzytay/NIPS2018_DECAPROP) |
|BiAttention + DCU-LSTM (Tay et al., 2018) |36.55 |19.79 | 17.87 | 41.44 |[Multi-Granular Sequence Encoding via Dilated Compositional Units for Reading Comprehension](http://aclweb.org/anthology/D18-1238) | |
|BiDAF (Seo et al., 2017) |33.45 |15.69 | 15.68 | 36.74 |[Bidirectional Attention Flow for Machine Comprehension](https://arxiv.org/abs/1611.01603) | |
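
All entries report surface-overlap metrics against the two reference answers. Below is a minimal sketch of computing BLEU-1/BLEU-4, METEOR, and ROUGE-L for a single prediction, using `nltk` and `rouge_score` as stand-ins; each paper's reported numbers come from its own evaluation script, which may tokenize and aggregate differently.

```python
# Minimal sketch of the metrics reported in these tables for one predicted
# answer against NarrativeQA's reference answers. nltk and rouge_score are
# stand-ins here; they are not the evaluation scripts used by the papers.
import nltk
from nltk.translate.bleu_score import sentence_bleu, SmoothingFunction
from nltk.translate.meteor_score import meteor_score
from rouge_score import rouge_scorer

nltk.download("wordnet", quiet=True)   # METEOR needs WordNet data
nltk.download("omw-1.4", quiet=True)

prediction = "mark twain"
references = ["Mark Twain", "Samuel Clemens, better known as Mark Twain"]

pred_tokens = prediction.lower().split()
ref_tokens = [r.lower().split() for r in references]

smooth = SmoothingFunction().method1
bleu1 = sentence_bleu(ref_tokens, pred_tokens, weights=(1, 0, 0, 0), smoothing_function=smooth)
bleu4 = sentence_bleu(ref_tokens, pred_tokens, weights=(0.25, 0.25, 0.25, 0.25), smoothing_function=smooth)
meteor = meteor_score(ref_tokens, pred_tokens)  # recent nltk expects pre-tokenized inputs

scorer = rouge_scorer.RougeScorer(["rougeL"], use_stemmer=True)
rouge_l = max(scorer.score(ref, prediction)["rougeL"].fmeasure for ref in references)

print(f"BLEU-1={bleu1:.3f}  BLEU-4={bleu4:.3f}  METEOR={meteor:.3f}  ROUGE-L={rouge_l:.3f}")
```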

The results for the second mode (reading comprehension over entire books or movie scripts) are below.

| Model | BLEU-1 | BLEU-4 | METEOR | ROUGE-L | Paper / Source | Code |
| ------------- | :-----: | :-----:|:-----:| :-----:|--- | --- |
|Retriever + Reader (Izacard and Grave, 2020) |35.3 |7.5 | 11.1 | 32.0 |[Distilling Knowledge from Reader to Retriever for Question Answering](https://openreview.net/forum?id=NTEz-6wysdb) | [Official](https://github.com/facebookresearch/FiD) |
|Summary + Reader (UnifiedQA) (Wu et al., 2021) |21.82 |3.87 | 10.52 | 21.03 |[Recursively Summarizing Books with Human Feedback](https://arxiv.org/abs/2109.10862) | |
|ReadTwice (Zemlyanskiy et al., 2021) |21.1 |4.0 | 7.0 | 23.2 |[ReadTwice: Reading Very Large Documents with Memories](https://aclanthology.org/2021.naacl-main.408.pdf) | [Official](https://github.com/google-research/google-research/tree/master/readtwice) |
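
The full-story systems above must cope with inputs far longer than a typical model's context window. A common pattern, roughly what retriever + reader pipelines do, is to split the book into passages, rank them against the question, and only read the top-ranked ones. The sketch below illustrates that pattern with a simple TF-IDF ranker; it is a generic illustration, not the pipeline of any system in the table.

```python
# Illustrative retrieve-then-read sketch for question answering over a full
# book or script: chunk the text, rank chunks against the question with
# TF-IDF, and hand the top chunks to a reader model (left abstract here).
# This is a generic pattern, not the pipeline of any system in the table.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity


def top_passages(story: str, question: str, window: int = 200, k: int = 5) -> list[str]:
    words = story.split()
    passages = [" ".join(words[i:i + window]) for i in range(0, len(words), window)]
    vectorizer = TfidfVectorizer(stop_words="english").fit(passages + [question])
    scores = cosine_similarity(
        vectorizer.transform([question]), vectorizer.transform(passages)
    ).ravel()
    return [passages[i] for i in scores.argsort()[::-1][:k]]


# A generative reader would then produce a free-form answer from the
# concatenated top passages; that step depends on the model and is omitted.
```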

### DuoRC
