Morvan98/PSTP

The current version of PSTP offers residue-level and sequence-level phase separation predictions via PSTP-Scan, along with residue-level embedding matrices generated by the dual-language embedding model. Both features run on standard CPUs as well as GPUs. Additional functions will be available soon.

Python requirements

Python >= 3.7 and <= 3.9
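
As a quick check before installing, the snippet below tests whether the running interpreter falls within the range stated above. It is a minimal sketch: the helper name `python_version_supported` is hypothetical and not part of PSTP, and it assumes that "<= 3.9" covers every 3.9.x patch release.

```python
import sys

# Supported range taken from the requirement above: 3.7 <= Python <= 3.9.
# Assumption: any 3.9.x patch release counts as "<= 3.9".
MIN_VERSION = (3, 7)
MAX_VERSION = (3, 9)


def python_version_supported(version_info=None):
    """Return True if (major, minor) lies within the supported range.

    Hypothetical helper for illustration; not part of the PSTP package.
    """
    if version_info is None:
        version_info = sys.version_info
    major_minor = tuple(version_info[:2])
    return MIN_VERSION <= major_minor <= MAX_VERSION


if __name__ == "__main__":
    current = "{}.{}".format(*sys.version_info[:2])
    if python_version_supported():
        print("Python " + current + " is within the supported range.")
    else:
        print("Python " + current + " is outside the supported range (3.7 to 3.9).")
```

If the check fails, creating a dedicated virtual environment with a supported interpreter before running the install commands is one option.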

Please install the following packages in the order provided:

pip install cython  ### required by ALBATROSS when installing
pip install numpy  ### required by ALBATROSS when installing
pip install git+https://git@github.com/idptools/sparrow.git  ### ALBATROSS; more detail: https://github.com/idptools/sparrow
pip install "idptools-parrot[optimize]"  ### ALBATROSS; more detail: https://github.com/idptools/sparrow
pip install "fair-esm[esmfold]" ### ESM-2

Installation

pip install git+https://git@github.com/Morvan98/PSTP.git

Usage and examples

Single sequence prediction

Here, we provide examples of sequence-level and residue-level phase separation predictions for a single sequence input.

from pstp.pstp_collections import pstp_scan_saps_prediction # saps: self-assembly ps model
from pstp.pstp_collections import pstp_scan_pdps_prediction # pdps: partner-dependent ps model
from pstp.pstp_collections import pstp_scan_mix_prediction # mix: mix-dataset ps model
test_seq = 'GRGDSPYS'*25 ## protein sequence example
saps_residue_level_scores, saps_scan_predicted, saps_scan_kernel_predicted = pstp_scan_saps_prediction(
    test_seq)
pdps_residue_level_scores, pdps_scan_predicted, pdps_scan_kernel_predicted = pstp_scan_pdps_prediction(
    test_seq)
mix_residue_level_scores, mix_scan_predicted, mix_scan_kernel_predicted = pstp_scan_mix_prediction(
    test_seq)
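
Residue-level scores are often easier to read once they are collapsed into contiguous regions. The sketch below does that under stated assumptions: the residue-level output is a 1-D sequence with one numeric score per residue, and higher scores indicate a stronger predicted propensity. The helper `high_scoring_regions`, the 0.5 threshold, and the minimum length are all hypothetical choices for illustration and are not part of PSTP.

```python
def high_scoring_regions(scores, threshold=0.5, min_length=5):
    """Return (start, end) pairs (0-based, end exclusive) for runs of
    consecutive residues scoring >= threshold that span at least
    min_length residues.

    Hypothetical helper; assumes `scores` is 1-D with one value per residue.
    """
    values = [float(s) for s in scores]
    regions = []
    start = None  # index where the current run began, or None if no run is open
    for i, value in enumerate(values):
        if value >= threshold:
            if start is None:
                start = i
        elif start is not None:
            # The run just ended at i; keep it only if it is long enough.
            if i - start >= min_length:
                regions.append((start, i))
            start = None
    # Close a run that extends to the end of the sequence.
    if start is not None and len(values) - start >= min_length:
        regions.append((start, len(values)))
    return regions


# Toy scores standing in for real residue-level output.
toy_scores = [0.1, 0.2, 0.8, 0.9, 0.7, 0.6, 0.85, 0.3, 0.9, 0.1]
print(high_scoring_regions(toy_scores, threshold=0.5, min_length=3))  # [(2, 7)]
```

With real output, the call might look like `high_scoring_regions(saps_residue_level_scores)`, provided the returned scores match the 1-D assumption; the threshold should be chosen with the score distribution in mind.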

Batch embedding and prediction

Here, we provide examples of sequence-level and residue-level phase separation predictions for multiple sequence inputs. The pstp_embedding_by_batch function can also generate PSTP embeddings for custom sequences.

from pstp.pstp_collections import pstp_embedding_by_batch
from pstp.pstp_collections import predict_by_saps_models,predict_by_pdps_models,predict_by_mix_models # PSTP-Scan models
from pstp.pstp_collections import saps_kernel,pdps_kernel,mix_kernel # trained MLP kernels of PSTP-Scan
seqs = ['GRGDSPYS'*25, ## protein sequences example
        'ARADSPYS'*25,
        'SRSDSPYS'*25,
        'GRGDSPYS'*24,
        'GRGDSPYS'*23,
        'GRGDSPYS'*20,
        ]
seq_matrix_lst = pstp_embedding_by_batch(seqs) # residue-level embedding
for matrix in seq_matrix_lst:
    print(matrix.shape) # (sequence_length, 650)
### PSTP-Scan predictions
saps_py_lst,pdps_py_lst,mix_py_lst = [],[],[]
for matrix_ in seq_matrix_lst:
    res_p,p = predict_by_saps_models(matrix_)
    saps_py_lst.append(p)
    res_p,p = predict_by_pdps_models(matrix_)
    pdps_py_lst.append(p)
    res_p,p = predict_by_mix_models(matrix_)
    mix_py_lst.append(p)
print(saps_py_lst,pdps_py_lst,mix_py_lst)
### kernel predictions
saps_py_lst,pdps_py_lst,mix_py_lst = [],[],[]
for matrix_ in seq_matrix_lst:
    p = saps_kernel(matrix_) 
    saps_py_lst.append(p)
    p = pdps_kernel(matrix_)
    pdps_py_lst.append(p)
    p = mix_kernel(matrix_)
    mix_py_lst.append(p)

print(saps_py_lst,pdps_py_lst,mix_py_lst)
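
For larger batches, printing three lists becomes hard to read. The sketch below gathers per-sequence predictions into CSV text sorted by the SaPS score. It assumes each sequence-level prediction can be converted with `float()` (for example a Python float or a single-element array); the helper `predictions_to_csv` and the column names are hypothetical and not part of PSTP.

```python
import csv
import io


def predictions_to_csv(seqs, saps_scores, pdps_scores, mix_scores):
    """Build CSV text with one row per sequence, sorted by SaPS score (descending).

    Hypothetical helper; assumes every score converts cleanly with float().
    """
    if not (len(seqs) == len(saps_scores) == len(pdps_scores) == len(mix_scores)):
        raise ValueError("all inputs must have the same length")
    rows = [
        (i, len(seq), float(s), float(p), float(m))
        for i, (seq, s, p, m) in enumerate(
            zip(seqs, saps_scores, pdps_scores, mix_scores)
        )
    ]
    # Highest SaPS score first; the index column preserves the input order.
    rows.sort(key=lambda row: row[2], reverse=True)
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["index", "length", "saps", "pdps", "mix"])
    writer.writerows(rows)
    return buffer.getvalue()


# Toy values standing in for real predictions.
toy_seqs = ["GRGDSPYS" * 2, "ARADSPYS" * 3]
print(predictions_to_csv(toy_seqs, [0.25, 0.75], [0.5, 0.5], [0.125, 0.875]))
```

With the lists from the example above, the call might be `predictions_to_csv(seqs, saps_py_lst, pdps_py_lst, mix_py_lst)`, assuming the stored predictions satisfy the `float()` conversion.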

Paper link: https://doi.org/10.1101/2024.10.29.620820

A new version update is coming soon!
