LMPred - AMP_Dataset

A dataset of 3,758 AMP (antimicrobial peptide) and 3,758 non-AMP data points. The AMPs were sourced from the DRAMP 2.0 database and from pre-existing AMP prediction research; the non-AMPs were sourced from UniProt.

The data is split into a training set (40%), a validation set (20%) and a test set (40%). The distribution of target classes across these sets is shown below:

Distribution of AMP and Non-AMP Samples in the Training, Validation and Test Sets:

[Figure: fig_13_train_test_val]

All AMPs and non-AMPs overlaid to show the overall distribution across the datasets:

[Figure: fig12_overlaid_dist]
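For readers who want to produce a comparable 40/20/40 partition themselves, the following is a minimal sketch of a class-balanced (stratified) split using only the Python standard library. It is not the procedure used to build this dataset: the README does not say how the split was generated or whether it was stratified, and the function name `stratified_split`, the seed, and the placeholder label list are all assumptions made for illustration.

```python
import random
from collections import defaultdict


def stratified_split(labels, fractions=(0.4, 0.2, 0.4), seed=0):
    """Return (train, val, test) index lists, splitting each class separately.

    Hypothetical helper: the fractions mirror the 40/20/40 split described
    in this README, but the method itself is an assumption.
    """
    rng = random.Random(seed)

    # Group sample indices by class label.
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    train, val, test = [], [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        n_train = round(fractions[0] * len(indices))
        n_val = round(fractions[1] * len(indices))
        train += indices[:n_train]
        val += indices[n_train:n_train + n_val]
        # Whatever remains goes to the test set, so no sample is dropped.
        test += indices[n_train + n_val:]
    return train, val, test


# Placeholder labels matching the stated class counts (1 = AMP, 0 = non-AMP).
labels = [1] * 3758 + [0] * 3758
train_idx, val_idx, test_idx = stratified_split(labels)
print(len(train_idx), len(val_idx), len(test_idx))
```

Splitting each class separately keeps the AMP to non-AMP ratio the same in all three sets, which suits a balanced dataset like this one. In practice the indices would be used to select rows from the actual sequence table rather than from a placeholder list.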