A scalable SCENIC workflow for single-cell gene regulatory network analysis

Overview

- Quick start
- Requirements
- Installation
- Full pipeline documentation
- Case studies
  - PBMC 10k dataset (10x Genomics)
    - Full SCENIC analysis, plus filtering, clustering, visualization, and SCope-ready loom file creation:
      - Jupyter notebook | HTML render
    - Extended analysis post-SCENIC:
      - Jupyter notebook | HTML render
  - Cancer data sets
    - Jupyter notebook | HTML render
  - Mouse brain data set
    - Jupyter notebook | HTML render
- References and more information

Quick start

Running the pipeline in a Jupyter notebook

We recommend using this notebook as a template for running an interactive analysis in Jupyter. See the installation instructions for information on setting up a kernel with pySCENIC and other required packages.

Running the Nextflow pipeline on the example dataset

Requirements (Nextflow/containers)

The following tools are required to run the steps in this Nextflow pipeline:

- Nextflow
- A container system, either of:
  - Docker
  - Singularity

Nextflow pulls the required container image as needed (currently aertslab/pyscenic:0.9.19, set through the --pyscenic_container parameter).

Using the test profile

For a quick test, use the test profile, which automatically pulls the testing dataset (described in full below):

nextflow run aertslab/SCENICprotocol \
    -profile docker,test

This small test dataset takes approximately 70 seconds to run using 6 threads on a standard desktop computer.

Download testing dataset

Alternatively, the same data can be run step by step, which better illustrates how to substitute other data into the pipeline. First, download a minimal set of SCENIC database files for a human dataset (approximately 78 MB):

mkdir example && cd example/
# Transcription factors:
wget https://raw.githubusercontent.com/aertslab/SCENICprotocol/master/example/test_TFs_tiny.txt
# Motif to TF annotation database:
wget https://raw.githubusercontent.com/aertslab/SCENICprotocol/master/example/motifs.tbl
# Ranking databases:
wget https://raw.githubusercontent.com/aertslab/SCENICprotocol/master/example/genome-ranking.feather
# Finally, get a tiny sample expression matrix (loom format):
wget https://raw.githubusercontent.com/aertslab/SCENICprotocol/master/example/expr_mat_tiny.loom
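
Before launching the pipeline, it can help to confirm that all four downloads completed. The following is a minimal sketch, not part of the pipeline itself: `check_inputs` is a hypothetical helper name, and the check only tests that each file exists and is non-empty.

```shell
# Hypothetical helper: report every expected input file that is missing or empty.
# Returns 0 if all files are present and non-empty, 1 otherwise.
check_inputs() {
    rc=0
    for f in "$@"; do
        if [ ! -s "$f" ]; then
            echo "missing or empty: $f" >&2
            rc=1
        fi
    done
    return "$rc"
}

# Check the four files downloaded above (run from inside the example/ directory).
if check_inputs test_TFs_tiny.txt motifs.tbl genome-ranking.feather expr_mat_tiny.loom; then
    echo "all input files present"
else
    echo "some input files need to be downloaded again" >&2
fi
```

The same helper can be pointed at other TF lists, motif annotations, ranking databases, and expression matrices when substituting your own data.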

Running the example pipeline

Either Docker or Singularity images can be used by specifying the appropriate profile (-profile docker or -profile singularity). Note that the tiny test dataset only runs successfully if the default filtering thresholds are lowered, which is why the command below sets --thr_min_genes 1.
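
On machines where the available container runtime is not known in advance, the profile can be chosen at run time. The sketch below is an illustration only: `pick_profile` is a hypothetical helper, preferring Docker over Singularity is an arbitrary choice, and combining the singularity profile with the test profile is assumed to work the same way as the docker,test combination shown earlier.

```shell
# Hypothetical helper: choose the Nextflow profile from the container runtime on PATH.
# Docker is checked first; this ordering is an arbitrary tie-break.
pick_profile() {
    if command -v docker >/dev/null 2>&1; then
        echo docker
    elif command -v singularity >/dev/null 2>&1; then
        echo singularity
    else
        echo "neither docker nor singularity found on PATH" >&2
        return 1
    fi
}

# Print (rather than run) the test-profile command for the detected runtime.
if profile=$(pick_profile); then
    echo "nextflow run aertslab/SCENICprotocol -profile ${profile},test"
fi
```

Printing the command first makes it easy to review before running it by hand.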

Using loom input

nextflow run aertslab/SCENICprotocol \
    -profile docker \
    --loom_input expr_mat_tiny.loom \
    --loom_output pyscenic_integrated-output.loom \
    --TFs test_TFs_tiny.txt \
    --motifs motifs.tbl \
    --db *feather \
    --thr_min_genes 1

By default, this pipeline uses the container specified by the --pyscenic_container parameter. This is currently set to aertslab/pyscenic:0.9.19, an image with both pySCENIC and Scanpy 1.4.4.post1 installed. To use a custom container (e.g. one built on a local machine), pass its name to the --pyscenic_container parameter.

Expected output

The output of this pipeline is a loom-formatted file (by default: output/pyscenic_integrated-output.loom) containing:

- The original expression matrix
- The pySCENIC-specific results:
  - Regulons (TFs and their target genes)
  - AUCell matrix (cell enrichment scores for each regulon)
  - Dimensionality reduction embeddings based on the AUCell matrix (t-SNE, UMAP)
- Results from the parallel best-practices analysis using highly variable genes:
  - Dimensionality reduction embeddings (t-SNE, UMAP)
  - Louvain clustering annotations
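
Because loom files are HDF5 containers, the contents listed above can be checked from the command line without loading the matrix. The following is a rough sketch that assumes the `h5ls` utility from the HDF5 command-line tools is installed. `inspect_loom` is a hypothetical helper, and the names of the individual attributes holding regulons, embeddings, and clusterings are not stated here, so the listing itself is the reference.

```shell
# Hypothetical helper: list the top-level groups and the per-cell / per-gene
# attributes stored in a loom file. Requires h5ls (HDF5 command-line tools).
inspect_loom() {
    loom=$1
    if [ ! -f "$loom" ]; then
        echo "no such file: $loom" >&2
        return 1
    fi
    if ! command -v h5ls >/dev/null 2>&1; then
        echo "h5ls not found; install the HDF5 command-line tools" >&2
        return 2
    fi
    h5ls "$loom"              # top-level groups (e.g. matrix, row_attrs, col_attrs)
    h5ls "$loom/col_attrs"    # per-cell annotations
    h5ls "$loom/row_attrs"    # per-gene annotations
}

# Inspect the default output file; a missing file or missing h5ls is reported, not fatal.
inspect_loom output/pyscenic_integrated-output.loom || true
```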

General requirements for this workflow

- Python version 3.6 or greater
- Tested on various Unix-like systems (Ubuntu 18.04, CentOS 7.6.1810, macOS 10.14.5)
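
The Python requirement can be verified before installing anything else. This is a minimal sketch: `check_python` is a hypothetical helper, and it assumes the interpreter is available as `python3` unless another name is passed.

```shell
# Hypothetical helper: succeed only if the given interpreter is Python 3.6 or newer.
# Returns 2 if the interpreter is not on PATH, 1 if it is too old, 0 otherwise.
check_python() {
    py=${1:-python3}
    if ! command -v "$py" >/dev/null 2>&1; then
        echo "$py not found on PATH" >&2
        return 2
    fi
    "$py" -c 'import sys; sys.exit(0 if sys.version_info >= (3, 6) else 1)'
}

if check_python; then
    echo "Python version is sufficient"
else
    echo "Python 3.6 or greater is required" >&2
fi
```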

