This dataset contains 3,758 AMP (antimicrobial peptide) and 3,758 non-AMP datapoints. The AMPs were sourced from the DRAMP 2.0 database and from pre-existing AMP prediction research; the non-AMPs were sourced from UniProt.
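As a rough illustration of how a balanced dataset of this size could be divided 40/20/40 while keeping both classes equally represented in each subset, here is a minimal sketch. It is not the procedure used to build this dataset: the function name `stratified_split`, the per-class shuffling, the seed, and the placeholder labels are all assumptions made for the example.

```python
import random


def stratified_split(labels, fractions=(0.4, 0.2, 0.4), seed=0):
    """Hypothetical helper: return (train, val, test) index lists.

    Each class is shuffled and sliced separately, so every subset keeps
    the class balance of the full dataset.
    """
    rng = random.Random(seed)

    # Group example indices by class label.
    by_class = {}
    for idx, label in enumerate(labels):
        by_class.setdefault(label, []).append(idx)

    train, val, test = [], [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        n_train = round(fractions[0] * len(indices))
        n_val = round(fractions[1] * len(indices))
        train += indices[:n_train]
        val += indices[n_train:n_train + n_val]
        # Whatever remains goes to the test set.
        test += indices[n_train + n_val:]
    return train, val, test


# Placeholder labels only (1 = AMP, 0 = non-AMP), matching the class counts above.
labels = [1] * 3758 + [0] * 3758
train_idx, val_idx, test_idx = stratified_split(labels)
print(len(train_idx), len(val_idx), len(test_idx))
```

Because 3,758 is not evenly divisible by the split fractions, the subset sizes are rounded, so the realized proportions are only approximately 40/20/40.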
The data has been split into training (40%), validation (20%), and test (40%) sets. The distribution of target classes across these sets is shown below: