This project is based on PreSumm. I modified the code because the PyTorch version differs and to fit my development environment.
- original paper: Text Summarization with Pretrained Encoders (EMNLP 2019)
- PreSumm PyTorch version: 1.1.0; my PyTorch version: 1.11.0
Results on CNN/DailyMail (2022.04.05):

| Models | ROUGE-1 | ROUGE-2 | ROUGE-L |
|---|---|---|---|
| BertSumExt | 42.89 | 20.09 | 39.33 |
| BertSumAbs | 41.23 | 18.86 | 38.27 |
Results on XSum (2022.04.07):

| Models | ROUGE-1 | ROUGE-2 | ROUGE-L |
|---|---|---|---|
| BertSumExt | 21.74 | 4.27 | 16.99 |
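PreSumm computes the numbers above with its own evaluation pipeline (pyrouge); as a rough, unofficial sanity check you can approximate the same metric names with the `rouge-score` package. This is only an illustrative sketch with a toy sentence pair, not the script that produced the tables:

```python
# pip install rouge-score
from rouge_score import rouge_scorer

# Toy reference/prediction pair, just to show the three metrics reported above.
reference = "the cat sat on the mat"
prediction = "a cat was sitting on the mat"

scorer = rouge_scorer.RougeScorer(["rouge1", "rouge2", "rougeL"], use_stemmer=True)
scores = scorer.score(reference, prediction)
for name, result in scores.items():
    print(name, round(result.fmeasure * 100, 2))
```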
Python version: 3.8
PyTorch version: 1.11.0
The only difference from the original project is `/src`, so please create the directories below (a minimal sketch for creating them follows the list):
/bert_data
/models
/logs
/results
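A minimal way to create these directories, assuming you run it from the repository root:

```python
import os

# Create the working directories expected by the training/evaluation scripts.
# Existing directories are left untouched.
for d in ("bert_data", "models", "logs", "results"):
    os.makedirs(d, exist_ok=True)
```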
The input document and gold summary are randomly picked from the CNN/DailyMail test set. The sentences in red are the ones extracted by BertSumExt, and the sentences in green are similar to the summary generated by BertSumAbs. I find that BertSumAbs tends to copy sentences from the input document; the CNN/DailyMail dataset is fairly extractive in nature, so this behaviour appears when the model is fine-tuned on it.
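One simple way to quantify this copying behaviour is to count how many generated sentences reuse most of their tokens from the source document. The sketch below is only an illustration; the whitespace tokenization and the 0.8 threshold are arbitrary choices, not part of PreSumm:

```python
def copy_rate(summary_sentences, source_text, threshold=0.8):
    """Fraction of summary sentences whose tokens mostly occur in the source."""
    source_tokens = set(source_text.lower().split())
    copied = 0
    for sent in summary_sentences:
        tokens = sent.lower().split()
        if not tokens:
            continue
        overlap = sum(t in source_tokens for t in tokens) / len(tokens)
        if overlap >= threshold:
            copied += 1
    return copied / max(len(summary_sentences), 1)

# Example: both generated sentences reuse the source wording, so the rate is high.
source = "The quick brown fox jumps over the lazy dog . It then runs into the forest ."
summary = ["The quick brown fox jumps over the lazy dog .", "It runs into the forest ."]
print(copy_rate(summary, source))  # -> 1.0
```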