Commit

Fix a typo for the author's name in Simplification (sebastianruder#550)
The author's last name is Zhu, not Zu, as shown in the linked paper: http://aclweb.org/anthology/C10-1152.
orionHong authored Jul 5, 2021
1 parent 978e1d1 commit 76f92bf
Showing 1 changed file with 1 addition and 1 deletion.
2 changes: 1 addition & 1 deletion english/simplification.md
@@ -46,7 +46,7 @@ Since a simplification could involve text transformations beyond paraphrasing (w

#### PWKP / WikiSmall

- [Zu et al. (2010)](http://aclweb.org/anthology/C10-1152) compiled a parallel corpus with more than 108K sentence pairs from 65,133 Wikipedia articles, allowing **1-to-1 and 1-to-N alignments**. The latter type of alignments represents instances of sentence splitting. The original full corpus can be found [here](https://www.informatik.tu-darmstadt.de/ukp/research_6/data/sentence_simplification/simple_complex_sentence_pairs/index.en.jsp). The test set is composed of 100 instances, with **one simplification reference per original sentence**. [Zhang and Lapata (2017)](http://aclweb.org/anthology/D17-1062) released a more standardised split of this dataset called [*WikiSmall*](https://github.com/XingxingZhang/dress), with 89,042 instances for training, 205 for development and the same original 100 instances for testing.
+ [Zhu et al. (2010)](http://aclweb.org/anthology/C10-1152) compiled a parallel corpus with more than 108K sentence pairs from 65,133 Wikipedia articles, allowing **1-to-1 and 1-to-N alignments**. The latter type of alignments represents instances of sentence splitting. The original full corpus can be found [here](https://www.informatik.tu-darmstadt.de/ukp/research_6/data/sentence_simplification/simple_complex_sentence_pairs/index.en.jsp). The test set is composed of 100 instances, with **one simplification reference per original sentence**. [Zhang and Lapata (2017)](http://aclweb.org/anthology/D17-1062) released a more standardised split of this dataset called [*WikiSmall*](https://github.com/XingxingZhang/dress), with 89,042 instances for training, 205 for development and the same original 100 instances for testing.

We present the models tested in this dataset **ranked by BLEU score** (or SARI if BLEU is not available). SARI cannot be reliably computed in this dataset since it does not contain multiple simplification references per original sentence. In addition, there are instances of more advanced simplification transformations (e.g. splitting) which SARI does not assess by definition.
