Hyunkushin/OTMC_NatComm

OTMC_NatComm

Overview

This code accompanies the scientific paper 'Single test-based early diagnosis of multiple cancer types using Exosome-SERS-AI' and provides a demo of its results. Unauthorized use of this code for other purposes is prohibited. The repository contains Python code showing example operation of the cancer detector and the tissue-of-origin (TOO) detector, both of which use exosomal SERS signals. Sample data for the demo are included in the ./DataBase_demo directory. Code for drawing the main figures and calculating diagnostic performance is also included. For the weight values used in these models, commercial use through transfer learning or modulation beyond the scope of the license is prohibited. The demo SERS database, as distinct from the code, is provided only to demonstrate the operation of the model; unauthorized or commercial use is prohibited. For non-commercial or research use of the database, please contact the author of the paper.

System requirement

  • Python 3.8.8
  • TensorFlow 2.5.0
  • Pandas 1.4.2
  • Scikit-learn 0.24.1
  • MATLAB R2021a
  • Running the Python code in a Python IDE (e.g. PyCharm, Spyder) is recommended.
  • Once the IDE and the environment above are set up, no further installation is required.

Source data and models

This repository includes the source data needed to reproduce the main figures. The source data, containing the numerical values for the figures, are stored as an Excel file in the ./data_generator/source_data directory. The implemented and optimized models for cancer diagnosis and TOO discrimination, along with the multi-layer perceptron, are stored in the same directory.

Reproduction of data

All .py and .m files in the ./data_generator directory whose names start with 'fig' are scripts that reproduce the figures. Running them in the recommended Python IDE or in MATLAB lets you check the original figure data.

extract_ROC_CI.py calculates the statistics that quantify the diagnostic performance of our model. Its output includes the AUC of the ROC curve, sensitivity, specificity, accuracy, and precision, each with a 95% confidence interval (CI). These are the major results for cancer presence detection (Table 1) and TOO discrimination (Table 2). The CI can differ between runs depending on the random seed (default = 777); the number of bootstrap resamples is set to 1000.
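To illustrate how a seeded bootstrap yields a 95% CI, here is a minimal sketch using only the Python standard library. It is not the code in extract_ROC_CI.py: the function names `auc_score` and `bootstrap_ci`, the percentile method, and the synthetic labels and scores are all assumptions made for illustration. Only the seed (777) and the resample count (1000) are taken from the description above.

```python
import random

def auc_score(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.
    Ties count as half a win (Mann-Whitney formulation)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def bootstrap_ci(labels, scores, metric, n_boot=1000, seed=777, alpha=0.05):
    """Percentile bootstrap CI (assumed method) for metric(labels, scores)."""
    rng = random.Random(seed)  # fixed seed makes the interval reproducible
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]  # sample with replacement
        ys = [labels[i] for i in idx]
        if len(set(ys)) < 2:
            continue  # resample lacks one class, so the AUC is undefined
        stats.append(metric(ys, [scores[i] for i in idx]))
    stats.sort()
    lower = stats[round((alpha / 2) * n_boot)]
    upper = stats[round((1 - alpha / 2) * n_boot) - 1]
    return lower, upper

# Synthetic demo data (hypothetical, not from the SERS database):
# 20 controls scoring around 0.35 and 20 cancer samples scoring around 0.65.
_gen = random.Random(0)
demo_labels = [0] * 20 + [1] * 20
demo_scores = ([_gen.gauss(0.35, 0.15) for _ in range(20)]
               + [_gen.gauss(0.65, 0.15) for _ in range(20)])

auc = auc_score(demo_labels, demo_scores)
low, high = bootstrap_ci(demo_labels, demo_scores, auc_score)
print(f"AUC = {auc:.3f} (95% CI {low:.3f} to {high:.3f})")
```

Any other metric with the same `(labels, scores)` signature, such as sensitivity at a fixed threshold, could be passed as `metric`. Changing `seed` changes the resamples and therefore shifts the interval slightly, which is the run-to-run variation described above.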

Decision maker

decision_maker.py shows an example of how both cancer presence and the TOO are determined. The decision algorithm is based on the pretrained models and weights in the ./data_generator/source_data directory. The output is displayed in the console with the sample ID, true label, and prediction result, as in the example below. 'LUAD', 'BRCA', 'COAD', 'LIHC', 'PAAD', and 'STAD' denote lung, breast, colon, liver, pancreatic, and stomach cancer, respectively. Predicting a single sample takes roughly 0.1 to 2 seconds, depending on the PC specifications.

  • Output example
[ID: 1Col13]
True label: COAD
==> Prediction: Cancer detected.
    TOO decision: COAD (Correct)
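The two-stage flow described in this section (detect cancer first, then decide the TOO) can be sketched as follows. This is a simplified illustration, not the contents of decision_maker.py: the function `decide`, the 0.5 threshold, the probability inputs, and the "No cancer detected." and "Incorrect" strings are hypothetical. In the real script the probabilities would come from the pretrained models rather than being passed in by hand.

```python
# Cancer-type codes and the organs they denote, as listed in this README.
TOO_LABELS = {
    "LUAD": "Lung", "BRCA": "Breast", "COAD": "Colon",
    "LIHC": "Liver", "PAAD": "Pancreas", "STAD": "Stomach",
}

def decide(sample_id, true_label, cancer_prob, too_probs, threshold=0.5):
    """Return a console-style report for one sample.

    cancer_prob: hypothetical output of the cancer-presence model.
    too_probs:   hypothetical per-class outputs of the TOO model.
    threshold:   assumed cut-off; the value actually used is not stated here.
    """
    lines = [f"[ID: {sample_id}]", f"True label: {true_label}"]
    if cancer_prob < threshold:
        # Stage 1 negative: the TOO model is not consulted.
        lines.append("==> Prediction: No cancer detected.")
        return "\n".join(lines)
    lines.append("==> Prediction: Cancer detected.")
    # Stage 2: pick the most probable tissue of origin.
    too = max(too_probs, key=too_probs.get)
    verdict = "Correct" if too == true_label else "Incorrect"
    lines.append(f"    TOO decision: {too} ({verdict})")
    return "\n".join(lines)

# Placeholder probabilities chosen by hand for illustration only.
example_too_probs = {"LUAD": 0.05, "BRCA": 0.03, "COAD": 0.80,
                     "LIHC": 0.04, "PAAD": 0.05, "STAD": 0.03}
report = decide("1Col13", "COAD", 0.97, example_too_probs)
print(report)
```

With these placeholder inputs the report has the same layout as the output example above.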