Benchmarking feature selection methods for scRNA-seq and spatially resolved transcriptomics

ToryDeng/FeatureSelectionBenchmarks

FeatureSelectionBenchmarks

This repository contains the code for benchmarking feature (gene) selection methods on scRNA-seq and spatial transcriptomics data.

Software features

After a short configuration step, you can run the whole benchmark (data loading, quality control, feature selection, and cell clustering/domain detection) in a single line of code:

from benchmark.run_benchmark import run_bench


# configure the dataset information
data_cfg = {
    'your_data_name': {
        'adata_path': 'path/to/h5ad/file',
        'annot_key': 'annotation_name',
    },
}
# configure feature selection methods and numbers of selected features
fs_cfg = {'feature_selection_method': [1000, 2000]}
# configure clustering methods and numbers of runs
cl_cfg = {'clustering_method': 2}
# run the benchmark in one line of code
run_bench(data_cfg, fs_cfg, cl_cfg, modality='scrna', metrics=['ARI', 'NMI'])

The evaluation results are saved automatically as an XLSX file in the working directory, for example:

2023-02 14_54_32 scrna.xlsx
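
The `metrics` argument in the call above names ARI (adjusted Rand index) and NMI. For readers unfamiliar with ARI, the following is a minimal pure-Python sketch of how it compares predicted cluster labels with annotations. It is an illustration only, not the benchmark's own implementation; the function name `adjusted_rand_index` is hypothetical. In practice, scikit-learn (listed in the requirements) provides `sklearn.metrics.adjusted_rand_score` for the same quantity.

```python
from collections import Counter
from math import comb
from typing import Hashable, Sequence


def adjusted_rand_index(truth: Sequence[Hashable], pred: Sequence[Hashable]) -> float:
    """Adjusted Rand index between two labelings of the same cells."""
    if len(truth) != len(pred):
        raise ValueError("label sequences must have the same length")

    def pairs(counts: Counter) -> int:
        # Number of unordered cell pairs that share a group.
        return sum(comb(c, 2) for c in counts.values())

    same_both = pairs(Counter(zip(truth, pred)))  # together in both labelings
    same_truth = pairs(Counter(truth))            # together in the annotation
    same_pred = pairs(Counter(pred))              # together in the clustering
    total = comb(len(truth), 2)
    if total == 0:
        return 1.0  # fewer than two cells: nothing to disagree on

    expected = same_truth * same_pred / total     # agreement expected by chance
    maximum = (same_truth + same_pred) / 2
    if maximum == expected:
        return 1.0  # both labelings are trivial (one cluster, or all singletons)
    return (same_both - expected) / (maximum - expected)


# Cluster names do not matter, only the grouping does.
print(adjusted_rand_index([0, 0, 1, 1], ['b', 'b', 'a', 'a']))
```

A score of 1 means the clustering reproduces the annotation exactly (up to renaming of clusters), and scores near 0 mean agreement no better than chance.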

Other software features are:

  • Automatically save the results of each step (preprocessed data, selected features, and cluster labels)
  • Reload cached genes and cluster labels when the same data name is used again
  • Support custom feature selection and cell clustering/domain detection methods
  • Print detailed, nicely formatted log messages using rich and loguru

Currently supported methods

scRNA-seq

Feature selection

Name Language Reference
GeneClust Python paper
vst Python paper
mvp Python paper
triku Python paper
GiniClust3 Python paper
SC3 Python paper
scran R paper
FEAST R paper
M3Drop R paper
scmap R paper
deviance R paper
sctransform R paper

Cell clustering

Name Language Reference
SC3s Python paper
Seurat R paper
SHARP R paper
TSCAN R paper
CIDR R paper

Spatial transcriptomics

Feature selection

Name Language Reference
SpatialDE Python paper
SPARK-X R paper

Domain detection

Name Language Reference
SpaGCN Python paper
stLearn Python paper

Requirements

R packages

This benchmark is written in Python and calls R functions through rpy2. To use a method implemented in R, install the corresponding R package first.
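
Before a long run it can help to check which R packages are missing. The helper below is a minimal sketch, not part of this repository: `missing_r_packages` is a hypothetical name, and the package names in the example are only assumed to match the method names in the tables above. When no predicate is passed, it falls back to `isinstalled` from `rpy2.robjects.packages`, which requires a working R installation.

```python
from typing import Callable, Iterable, List, Optional


def missing_r_packages(
    names: Iterable[str],
    is_installed: Optional[Callable[[str], bool]] = None,
) -> List[str]:
    """Return the R package names that are not installed."""
    if is_installed is None:
        # Imported lazily so this helper can be defined without R being available.
        from rpy2.robjects.packages import isinstalled
        is_installed = isinstalled
    return [name for name in names if not is_installed(name)]


# Demonstration with a stand-in predicate (no R needed):
pretend_installed = {'scran', 'Seurat'}
print(missing_r_packages(['scran', 'M3Drop', 'Seurat'],
                         is_installed=lambda name: name in pretend_installed))
```

With R and rpy2 set up, calling `missing_r_packages(['scran', 'M3Drop', 'Seurat'])` without the second argument queries the real R library.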

Python packages

  • anndata>=0.8.0
  • numpy>=1.21.6
  • scanpy>=1.9.1
  • loguru>=0.6.0
  • anndata2ri>=1.1
  • sc3s>=0.1.1
  • rpy2>=3.5.6
  • scikit-learn>=1.2.0
  • SpaGCN>=1.2.5
  • torch>=1.13.1
  • stlearn>=0.4.11
  • pandas>=1.5.2
  • opencv-python>=4.6.0
  • scipy>=1.9.3
  • rich>=13.0.0
  • triku>=2.1.4
  • statsmodels>=0.13.5
  • SpatialDE>=1.1.3
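
The minimum versions above can be checked against the current environment with the standard library alone. This is a rough sketch under stated assumptions: `parse_version`, `at_least`, and `check_minimums` are hypothetical helpers, only the leading numeric components of a version are compared (pre-release tags are ignored), and only a few of the listed packages are shown. For strict comparisons, `packaging.version.Version` is the more robust choice.

```python
import re
from importlib import metadata
from typing import Dict, Tuple


def parse_version(text: str) -> Tuple[int, ...]:
    """Turn '1.21.6' into (1, 21, 6); stop at the first non-numeric component."""
    parts = []
    for piece in text.split('.'):
        match = re.match(r'\d+', piece)
        if match is None:
            break
        parts.append(int(match.group()))
    return tuple(parts)


def at_least(found: str, minimum: str) -> bool:
    """True if `found` >= `minimum`, padding with zeros so '1.2' equals '1.2.0'."""
    a, b = parse_version(found), parse_version(minimum)
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


def check_minimums(minimums: Dict[str, str]) -> Dict[str, str]:
    """Map each package to 'ok', 'missing', or 'too old (found X)'."""
    report = {}
    for name, minimum in minimums.items():
        try:
            found = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = 'missing'
            continue
        report[name] = 'ok' if at_least(found, minimum) else f'too old (found {found})'
    return report


# A few entries copied from the list above; extend as needed.
print(check_minimums({'anndata': '0.8.0', 'numpy': '1.21.6', 'scanpy': '1.9.1'}))
```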

Installation

git clone https://github.com/ToryDeng/FeatureSelectionBenchmarks
cd FeatureSelectionBenchmarks/
python3 setup.py install --user

Tutorial

Coming soon.
