The datasets are described and used in "Fill in the BLANC: Human-free quality estimation of document summaries" and in "Sensitivity of BLANC to human-scored qualities of text summaries". The human scores for the summaries were assigned by 10 annotators from Odetta.ai, who were provided with detailed instructions and trained on trial tasks. Each dataset preserves the IDs of the annotators together with their individual scores.
The datasets:
- CNN_DailyMail_555: 555 text-summary pairs, built from 100 texts with human summaries taken randomly from the CNN / Daily Mail dataset (Hermann et al., 2015) and complemented with generated summaries. The single human score describes the generic quality of the summary.
- DailyNews_300: 300 text-summary pairs, created from 100 texts taken randomly from daily news from different sources. Three summaries for each text were generated by extractive, abstractive and semi-abstractive models. The single human score describes the generic quality of the summary.
- DailyNews_300_aspects: The same 300 text-summary pairs as in DailyNews_300, but each assigned 5 human quality scores, according to how fluent, understandable, informative, compact and overall-good the summary is.
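
Since each dataset keeps the individual scores together with the annotator IDs, a common first step is to aggregate them per summary or per annotator. The snippet below is only a minimal sketch: the record layout, the field names (`text_id`, `summary_id`, `annotations`), the annotator IDs and the score values are all hypothetical and made up for illustration, so they should be adapted to the actual files.

```python
from statistics import mean

# Hypothetical toy records: the real field names, storage format and
# score values of the datasets may differ from this illustration.
records = [
    {"text_id": 0, "summary_id": 0, "annotations": {"a1": 3, "a2": 2, "a3": 3}},
    {"text_id": 0, "summary_id": 1, "annotations": {"a1": 1, "a2": 2, "a3": 1}},
]


def mean_score(record):
    """Average the individual annotator scores of one text-summary pair."""
    return mean(record["annotations"].values())


def scores_by_annotator(records):
    """Collect the scores given by each annotator ID across all records."""
    grouped = {}
    for record in records:
        for annotator_id, score in record["annotations"].items():
            grouped.setdefault(annotator_id, []).append(score)
    return grouped


for record in records:
    print(record["summary_id"], round(mean_score(record), 2))
```

Keeping the per-annotator view alongside the averaged score makes it possible to check annotator agreement before using the averaged scores as a reference.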