The scDHA software package can perform cell segregation through unsupervised learning, dimension reduction and visualization, cell classification, and time-trajectory inference on single-cell RNA sequencing data.
- The package is now available on CRAN.
- The machine learning framework is changed from Tensorflow to Torch. Torch can be called directly from R. Python environment is no longer required.
- The package can be installed from CRAN or this repository.
- Using CRAN:
install.packages('scDHA')
- Using devtools:
- Install devtools:
utils::install.packages('devtools')
- Install the package using:
devtools::install_github('duct317/scDHA')
Or, install with manual and vignette:devtools::install_github('duct317/scDHA', build_manual = T, build_vignettes = T)
- Install devtools:
- When the package is loaded, it will check for the necessary
libtorch
:library(scDHA)
libtorch
can be installed using:torch::install_torch()
- Load the package:
library(scDHA)
- Load Goolam dataset:
data('Goolam'); data <- t(Goolam$data); label <- as.character(Goolam$label)
- Log transform the data:
data <- log2(data + 1)
- Generating clustering result:
result <- scDHA(data, seed = 1)
- The clustering result can be found here:
cluster <- result$cluster
- Calculating adjusted Rand Index using mclust package:
mclust::adjustedRandIndex(cluster,label)
- A detailed tutorial on how to use scDHA package is available at http://scdha.tinnguyen-lab.com/
Or, a vignette in R Notebook format is available here
To use our package for new data, the package includes these functions:
- scDHA: main function, doing dimension reuction and clustering. The input is a matrix with rows as samples and columns as genes.
- scDHA.vis: visualization. The input is demension reduction output.
- scDHA.pt: generating pseudotime. The input is demension reduction output.
- scDHA.class: classification new data using available one. The inputs consist of train data matrix, train data label and new data matrix.
- The result is reproducible by setting seed for these functions.
- More detail about parameters for each function could be found in the manual.
Duc Tran, Hung Nguyen, Bang Tran, Carlo La Vecchia, Hung N. Luu, Tin Nguyen (2021). Fast and precise single-cell data analysis using a hierarchical autoencoder. Nature Communications, 12, 1029. doi: 10.1038/s41467-021-21312-2 (link)