STATOR: A Nextflow pipeline to infer cell types, subtypes, and states from gene expression data.

Stator

Preprint now available at https://www.biorxiv.org/content/10.1101/2023.12.18.572232v1

NOTE: this repo will soon move to the account of the Edinburgh Biomedical AI Lab

NOTE: the latest version of Stator (V1.2) is now available on the develop branch and should be more stable and easier to use. Pending some final tests, it will be merged into main soon, but you can already use it by pulling the develop branch instead (nextflow pull AJnsm/Stator -r develop).

Introduction

The Stator pipeline takes in single-cell RNA-seq count matrices and estimates gene-gene interactions at up to seventh order. Up to fifth-order interactions are then used to find characteristic, multi-type states present in the cell population.

In our research, Stator found cell identities that were invisible to clustering and NMF methods. Stator found sub-phases of the cell cycle, future neuronal fate states, and liver cancer states predictive of patient survival.

The pipeline can be run directly from the command line, either locally or on an HPC cluster. It pulls all code from GitHub, and the required containers from Docker Hub.
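
As a rough sketch of what a command-line run might look like, the snippet below wraps the two Nextflow commands involved (pull, then run). The repository name and the develop revision come from the note above; the profile name `docker` and the parameter file `params.json` are assumptions made for illustration and should be checked against the documentation. By default the wrapper only prints the commands; set `DRY_RUN=0` to actually execute them.

```shell
# Values taken from this README
repo="AJnsm/Stator"
revision="develop"

# Assumptions for illustration only: check the docs for the real names
profile="docker"            # hypothetical profile name for local Docker runs
params_file="params.json"   # hypothetical file holding the run parameters

# Print the nextflow command by default; execute it only when DRY_RUN=0
stator() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "nextflow $*"
  else
    nextflow "$@"
  fi
}

# Fetch (or update) the pipeline code at the chosen revision
stator pull "$repo" -r "$revision"

# Launch the pipeline with the chosen profile and parameter file
stator run "$repo" -r "$revision" -profile "$profile" -params-file "$params_file"
```

Keeping the revision in a variable makes it easy to switch back to main once V1.2 has been merged.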

Subsequent analysis can be done with our bespoke Stator Shiny app, available at https://shiny.igc.ed.ac.uk/MFIs/

Docs

Documentation on installation and usage is available here.

A small tutorial/vignette is available here.

Changes in this version (V1.1)

  • Switched to Nextflow DSL2 and the latest compatible Nextflow version (23.04)
  • Changed file handling inside Stator.nf to fix read permission bug on some clusters
  • Renamed multiple scripts, files, and directories
  • Updated and tested Docker profile for local runs
  • Removed 1-point calculation
  • Simplified Nextflow config files
  • Removed conda yamls
  • Switched back to numpy (was numba in V1.0) for interaction estimation (might be reverted in the future, depending on performance)

To do

  • Switch to using confidence intervals (CIs) for significance estimation and abandon the F-value
  • Update vignette
  • Improve documentation and tutorial
  • Create more unit tests for state inference
  • Add option to also use pairwise interactions for state inference
  • If pairwise interactions are not used, optionally skip their computation
