% Transfer Learning is not a Silver Bullet: A Case Study on Medical Relation Extraction
This dataset provides a corpus of assertions in clinical discharge summaries. The task distinguishes six classes: present, possible, absent, hypothetical, conditional, and associated with someone else. However, the distribution is highly skewed: only 6% of the assertions belong to the latter three classes. Hence we use only the present, possible, and absent assertions for our evaluation, as they carry the most important information for doctors.
From [1].
This is a corpus of assertions in biomedical publications. It was specifically curated for studying the scope of negation and speculation (absent and possible, respectively, in this paper) and does not contain present annotations. The BioScope dataset does not completely match the information need of health professionals, and the i2b2 corpus lacks varied medical text types.
From [1].
provides texts from discharge summaries as well as other clinical notes (physician letters, nurse letters, and radiology reports), representing a promising source of varied medical text. Therefore, two annotators followed the annotation guidelines from the i2b2 challenge and labelled 5,000 assertions, i.e. word spans of entities and their corresponding present / possible / absent class.
From [1].
Taken from stories by Sir Arthur Conan Doyle (literary work)
A collection of product reviews (free text by human users)