Note: Requires Python 3.8+
This repository contains two main scripts for preprocessing Whole Slide Images (WSIs) as an initial step for histopathological deep learning.
- Install openslide on Fedora via:
dnf install openslide-tools
- Set up the Python environment with:
pip install -r requirements.txt
- extractTiles-ws: This script tessellates the WSIs into tiles. The main inputs for this script:
Input Variable name | Description |
---|---|
-s | Path to the WSI folder |
-o | Path to the output folder where tiles are saved |
--skipws | Skip tessellation of WSI if annotation is missing. Default value is False. |
-px | Size of image patches to analyze, in pixels |
-um | Size of image patches to analyze, in microns. |
--num_threads | Number of threads to use when tessellating. |
--augment | Augment extracted tiles with flipping/rotating. |
--ov | Size of the overlap between extracted tiles. Accepts values between 0 and 1. |
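
As a rough illustration of how `-px`, `-um`, and `--ov` could interact, the sketch below assumes that `-um` is the physical edge length of a tile, `-px` is the edge length of the saved tile, and `--ov` is the fraction of a tile shared with its neighbour. The function names and the microns-per-pixel (`mpp`) argument are hypothetical and are not taken from extractTiles-ws, which may compute its grid differently.

```python
def extraction_size_px(tile_um, mpp):
    """Edge length, in full-resolution pixels, of a tile spanning tile_um microns.

    mpp is the slide resolution in microns per pixel (hypothetical argument).
    """
    return int(round(tile_um / mpp))


def tile_origins(width, height, tile_size, overlap=0.0):
    """Top-left (x, y) corners of a regular tile grid over a width x height region.

    overlap is assumed to be the fraction of a tile shared with its neighbour,
    so the step between tiles is tile_size * (1 - overlap).
    """
    if not 0.0 <= overlap < 1.0:
        raise ValueError("overlap must be in [0, 1)")
    stride = max(1, int(round(tile_size * (1.0 - overlap))))
    xs = range(0, width - tile_size + 1, stride)
    ys = range(0, height - tile_size + 1, stride)
    return [(x, y) for y in ys for x in xs]
```

Under these assumptions, a 256 micron tile on a slide scanned at 0.5 microns per pixel covers 512 full-resolution pixels, and an overlap of 0.5 halves the step between neighbouring tiles.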
- Normalize: This script normalizes the extracted tiles using the Macenko method. The main inputs for this script:
Input Variable name | Description |
---|---|
-ip or --inputPath | Input path of the to-be-normalised tiles |
-op or --outputPath | Output path to store normalised tiles |
-si or --sampleImagePath | Reference image used to determine the colour distribution; the one from GitHub is used by default |
-nt or --threads | Number of threads used for processing, 2 by default |
-pl or --patientList | Clini table containing PATIENT and FILENAME to normalise |
usage: python Normalize.py -ip INPUTPATH -op OUTPUTPATH [-si SAMPLEIMAGEPATH] [-nt THREADS] [-pl CLINITABLE]
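
The options above map naturally onto an `argparse` parser. The following is a minimal sketch reconstructed from the table and the usage line only; the `build_parser` helper is hypothetical, and the actual parser in Normalize.py may differ, in particular in how it resolves the default sample image.

```python
import argparse


def build_parser():
    """Hypothetical parser mirroring the documented Normalize.py options."""
    parser = argparse.ArgumentParser(
        prog="Normalize.py",
        description="Normalise extracted tiles (sketch of the documented options).",
    )
    parser.add_argument("-ip", "--inputPath", required=True,
                        help="Input path of the to-be-normalised tiles")
    parser.add_argument("-op", "--outputPath", required=True,
                        help="Output path to store normalised tiles")
    # Assumption: None stands in for "use the default sample image".
    parser.add_argument("-si", "--sampleImagePath", default=None,
                        help="Image used to determine the colour distribution")
    parser.add_argument("-nt", "--threads", type=int, default=2,
                        help="Number of threads used for processing")
    parser.add_argument("-pl", "--patientList", default=None,
                        help="Clini table containing PATIENT and FILENAME")
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(args)
```

Only `-ip` and `-op` are marked as required here, matching the square brackets around the other options in the usage line.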
This script uses the Macenko normalization method from the https://github.com/wanghao14/Stain_Normalization.git repository.
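
For readers unfamiliar with the Macenko method, the sketch below shows its core step: estimating two stain vectors from the optical density of an RGB tile. It is an independent NumPy illustration, not the code from the repository above; the function name and the `beta` (background threshold) and `alpha` (percentile) defaults are assumptions chosen to match commonly used values.

```python
import numpy as np


def estimate_stain_vectors(rgb, beta=0.15, alpha=1.0):
    """Estimate two unit-length stain vectors (rows) from an RGB uint8 image."""
    pixels = rgb.reshape(-1, 3).astype(np.float64)
    # Convert intensities to optical density; the +1 avoids log(0).
    od = -np.log((pixels + 1.0) / 256.0)
    # Discard near-transparent (background) pixels.
    od = od[np.all(od >= beta, axis=1)]
    # Plane spanned by the two principal directions of the OD values
    # (eigh returns eigenvalues in ascending order).
    _, eigvecs = np.linalg.eigh(np.cov(od, rowvar=False))
    plane = eigvecs[:, [2, 1]]
    # Resolve the sign ambiguity so both directions have a non-negative red part.
    plane = plane * np.where(plane[0] < 0, -1.0, 1.0)
    # Project onto the plane and take robust extremes of the angle.
    proj = od @ plane
    phi = np.arctan2(proj[:, 1], proj[:, 0])
    lo, hi = np.percentile(phi, [alpha, 100.0 - alpha])
    v_lo = plane @ np.array([np.cos(lo), np.sin(lo)])
    v_hi = plane @ np.array([np.cos(hi), np.sin(hi)])
    stains = np.stack([v_lo, v_hi])
    # Order the rows so the vector with the larger red component comes first.
    if stains[0, 0] < stains[1, 0]:
        stains = stains[::-1]
    return stains / np.linalg.norm(stains, axis=1, keepdims=True)
```

A full normalisation would additionally solve for per-pixel stain concentrations and rescale them to match the reference image; that part is omitted here.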