NECARE

NEtwork-based CAncer PPI pREdiction

NECARE is a network-based algorithm that uses a Relational Graph Convolutional Network (R-GCN) to predict protein-protein interactions (PPIs) in cancer.

NECARE uses two kinds of features: 1) the knowledge-based feature OPA2Vec; 2) cancer-specific features, namely the mutation and expression profiles of each gene from TCGA.

The output is binary (1 or 0) and directional.

For example, if the input genes are 'TP53 KRAS' (tab-delimited), an output of '1' means that TP53 has an interaction with KRAS, in the direction TP53 to KRAS.
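As an illustration of the input format described above, here is a minimal sketch that writes gene pairs to a tab-delimited file. The helper `write_pairs` and the file name `my_pairs.txt` are hypothetical and not part of NECARE. Because the output is directional, the sketch assumes that the two orderings of a pair should be submitted as separate lines.

```python
import csv


def write_pairs(path, pairs):
    """Write (source, target) gene pairs as tab-delimited text, one pair per line."""
    with open(path, "w", newline="") as handle:
        writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
        for source, target in pairs:
            writer.writerow([source, target])


# Assumption: since predictions are directional, each ordering is its own query.
pairs = [("TP53", "KRAS"), ("KRAS", "TP53")]
write_pairs("my_pairs.txt", pairs)  # hypothetical file name
```

The resulting file can then be passed to NECARE with the -i option.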

How to install

NECARE is built on PyTorch with CUDA (python=3.7.4), so you need to install PyTorch first:

conda install pytorch torchvision cudatoolkit=10.1 -c pytorch

After that, you need to install:

numpy=1.16.5

dgl-cu101=0.4.1

To make this easier, you can also simply recreate my conda environment with:

conda create --name necare --file requirements.txt

or:

conda env create -f environment.yml
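After installation, a quick check that the required packages are visible to Python can save a failed run later. The sketch below uses only the standard library; the helper `missing_packages` is hypothetical and not part of NECARE. It only checks that the modules can be found, not their versions or whether CUDA works.

```python
import importlib.util


def missing_packages(names):
    """Return the module names that cannot be found in the current environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Import names of the packages listed above (dgl-cu101 is imported as `dgl`).
required = ["torch", "numpy", "dgl"]
missing = missing_packages(required)
print("missing:", ", ".join(missing) if missing else "none")
```

Run it inside the activated conda environment; any name it prints still needs to be installed.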

How to use

Example:

For prediction: NECARE.py -i ./dataset/test_pred.txt -o ./

For model training: NECARE.py -t True -i ./dataset/test_trn.txt -g ./dataset/NECARE.graph -f ./dataset/NECARE_features.txt -s 0.1 -b 10 -e 10
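If you want to call NECARE from another Python script, the two example commands above can be assembled as argument lists. This is only a sketch: the helper names are hypothetical, and it assumes that NECARE.py is run through the Python interpreter from the repository root. Only the flags shown in this README are used.

```python
import subprocess
import sys


def build_predict_command(input_path, output_dir, script="NECARE.py"):
    """Argument list for prediction with the default model."""
    return [sys.executable, script, "-i", input_path, "-o", output_dir]


def build_train_command(input_path, graph_path, feature_path,
                        batch, bases, epochs, script="NECARE.py"):
    """Argument list for training, mirroring the training example above."""
    return [sys.executable, script, "-t", "True",
            "-i", input_path, "-g", graph_path, "-f", feature_path,
            "-s", str(batch), "-b", str(bases), "-e", str(epochs)]


def run(command):
    """Run the command and raise CalledProcessError if it exits with an error."""
    return subprocess.run(command, check=True)


command = build_train_command("./dataset/test_trn.txt", "./dataset/NECARE.graph",
                              "./dataset/NECARE_features.txt", 0.1, 10, 10)
print(" ".join(command))
```

Calling `run(command)` would start the training; the sketch only prints the command so that it can be inspected first.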

Options:

-h, --help show this help message and exit

-t TRAINING Turn on the training mode of NECARE to train your own model (True/False), default is False

-i FILENAME Input file (tab-delimited text file) containing the input interaction pairs. The first column holds the source genes, the second column the target genes. If training mode is on, a third column with labels is required

-o PATH Path of the directory in which to save the predictions or the trained model, default is the current directory

-m MODEL Path of the model used for prediction, default model is the one we reported in the NECARE paper (if using default, parameters -m and -g will be ignored). -t True is incompatible with -m

-g GRAPH General gene relationship network, tab-delimited text file: the first column for source genes, the second column for target genes, the third column for interaction types. Default is the one used in the NECARE paper

-f FEATURE Features for the nodes, tab-delimited text file with gene names in the first column. Default is the feature set used in the NECARE paper (OPA2Vec+TCGA)

-e EPOCH Number of epochs (training mode only), default 100

-r LEARNING_RATE Learning rate (training mode only), default 0.01

-l HIDDEN_LAYER Number of hidden layers (training mode only), default 2

-n HIDDEN_NODE Number of hidden nodes (training mode only), default 100

-d DROPOUT Dropout rate (training mode only), default 0.2

-b BASE Number of bases (training mode only), default 1

-s BATCH Batch size (training mode only), default 0.2 (20% of the general gene relationship network)
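The file options above expect a fixed number of tab-separated columns: two for prediction input, three for training input and for the graph file. A minimal sketch of a pre-flight check follows. The helper `check_columns` is hypothetical and not part of NECARE, and skipping blank lines and using `1` as a label value are assumptions.

```python
import os
import tempfile


def check_columns(path, expected):
    """Return (line_number, column_count) for each line without `expected` tab-separated fields."""
    problems = []
    with open(path) as handle:
        for number, line in enumerate(handle, start=1):
            line = line.rstrip("\n")
            if not line:
                continue  # assumption: blank lines are ignored
            count = len(line.split("\t"))
            if count != expected:
                problems.append((number, count))
    return problems


# Demo on a throwaway training file whose second line lacks the label column.
with tempfile.TemporaryDirectory() as tmp:
    demo = os.path.join(tmp, "demo_trn.txt")
    with open(demo, "w") as handle:
        handle.write("TP53\tKRAS\t1\nKRAS\tTP53\n")
    print(check_columns(demo, 3))  # [(2, 2)]
```

For real data, `check_columns("./dataset/test_trn.txt", 3)` should return an empty list if every line has the three columns that training mode requires.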

JiajunQiu/NECARE
