This project is based on PreSumm. I modified the code because the PyTorch version differs and to fit my development environment.
- original paper: Text Summarization with Pretrained Encoders (EMNLP 2019)
- PreSumm PyTorch version: 1.1.0; my PyTorch version: 1.11.0
Results on CNN/DailyMail (2022.04.05):

| Models | ROUGE-1 | ROUGE-2 | ROUGE-L |
|---|---|---|---|
| BertSumExt | 42.89 | 20.09 | 39.33 |
| BertSumAbs | 41.23 | 18.86 | 38.27 |
Results on XSum (2022.04.07):

| Models | ROUGE-1 | ROUGE-2 | ROUGE-L |
|---|---|---|---|
| BertSumExt | 21.74 | 4.27 | 16.99 |
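PreSumm computes the numbers above with its own evaluation pipeline (pyrouge); as a rough, unofficial sanity check you can approximate the same metric names with the `rouge-score` package. This is only an illustrative sketch with a toy sentence pair, not the script that produced the tables:

```python
# pip install rouge-score
from rouge_score import rouge_scorer

# Toy reference/prediction pair, just to show the three metrics reported above.
reference = "the cat sat on the mat"
prediction = "a cat was sitting on the mat"

scorer = rouge_scorer.RougeScorer(["rouge1", "rouge2", "rougeL"], use_stemmer=True)
scores = scorer.score(reference, prediction)
for name, result in scores.items():
    print(name, round(result.fmeasure * 100, 2))
```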
Python version: 3.8
PyTorch version: 1.11.0
The only difference from the original project is `/src`, so please create the directories below (a minimal sketch for creating them follows the list):
/bert_data
/models
/logs
/results
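A minimal way to create these directories, assuming you run it from the repository root:

```python
import os

# Create the working directories expected by the training/evaluation scripts.
# Existing directories are left untouched.
for d in ("bert_data", "models", "logs", "results"):
    os.makedirs(d, exist_ok=True)
```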
The input document and gold summary are randomly picked from the CNN/DailyMail test set. The sentences in red are the ones extracted by BertSumExt, and the sentences in green are similar to the summary generated by BertSumAbs. I find that BertSumAbs tends to copy sentences from the input document; the CNN/DailyMail dataset is fairly extractive in nature, so this behaviour appears when the model is fine-tuned on it.
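One simple way to quantify this copying behaviour is to count how many generated sentences reuse most of their tokens from the source document. The sketch below is only an illustration; the whitespace tokenization and the 0.8 threshold are arbitrary choices, not part of PreSumm:

```python
def copy_rate(summary_sentences, source_text, threshold=0.8):
    """Fraction of summary sentences whose tokens mostly occur in the source."""
    source_tokens = set(source_text.lower().split())
    copied = 0
    for sent in summary_sentences:
        tokens = sent.lower().split()
        if not tokens:
            continue
        overlap = sum(t in source_tokens for t in tokens) / len(tokens)
        if overlap >= threshold:
            copied += 1
    return copied / max(len(summary_sentences), 1)

# Example: both generated sentences reuse the source wording, so the rate is high.
source = "The quick brown fox jumps over the lazy dog . It then runs into the forest ."
summary = ["The quick brown fox jumps over the lazy dog .", "It runs into the forest ."]
print(copy_rate(summary, source))  # -> 1.0
```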