Updated software package here
Since the initial release of this software package, we have worked with collaborators to develop an improved codebase for this project, which is publicly available on GitHub here.
The pipeline is built using Nextflow, a workflow tool to run tasks across multiple compute infrastructures in a very portable manner. It uses Docker/Singularity containers, making installation trivial and results highly reproducible. The Nextflow DSL2 implementation uses one container per process, making it much easier to maintain and update software dependencies. Where possible, these processes have been submitted to and installed from nf-core/modules to make them available to all nf-core pipelines, and to everyone within the Nextflow community!
A statistical, reference-free algorithm subsumes myriad problems in genome science and enables novel discovery
- Install Java.
- Install `nextflow` (`>=20.04.0`).
- Depending on your use case, install `conda`, `docker`, or `singularity`. By using the `docker` or `singularity` nextflow profile, the pipeline can be run within the SPLASH docker container (also available on dockerhub), which contains all the required dependencies.
To test this pipeline, use the command below. The `test` profile will launch a pipeline run with a small dataset.
How to run with singularity:
nextflow run salzmanlab/nomad \
-profile test,singularity \
-r main \
-latest
How to run with docker:
nextflow run salzmanlab/nomad \
-profile test,docker \
-r main \
-latest
How to run with conda:
nextflow run salzmanlab/nomad \
-profile test,conda \
-r main \
-latest
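The three commands above differ only in the profile name, so the choice can be scripted. The sketch below is an illustration rather than part of the pipeline: `pick_profile` is a hypothetical helper, and the preference order (singularity, then docker, then conda) is an assumption you may want to change. It prints the command instead of running it, so you can inspect it first.

```shell
#!/usr/bin/env bash
# pick_profile is a hypothetical helper (not part of SPLASH or Nextflow).
# It echoes "test,<engine>" for the first engine found on PATH.
# The search order below is an assumption, not a recommendation from the authors.
pick_profile() {
    for engine in singularity docker conda; do
        if command -v "$engine" >/dev/null 2>&1; then
            echo "test,$engine"
            return 0
        fi
    done
    return 1
}

if profile=$(pick_profile); then
    # Print the test command with the detected profile; copy and run it yourself.
    echo "nextflow run salzmanlab/nomad -profile $profile -r main -latest"
else
    echo "No supported engine (singularity, docker, conda) found on PATH" >&2
fi
```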
Please see this document for descriptions of SPLASH inputs and parameters.
Please see this document for descriptions of SPLASH output.
Kaitlin Chaung*, Tavor Baharav*, George Henderson, Ivan Zheludev, Peter Wang, Julia Salzman. SPLASH: a statistical, reference-free genomic algorithm unifies biological discovery, Cell (2023)
Tavor Baharav, David Tse, and Julia Salzman. An Interpretable, Finite Sample Valid Alternative to Pearson’s X² for Scientific Discovery, bioRxiv (2023)
This pipeline uses code and infrastructure developed and maintained by the nf-core initiative, and reused here under the MIT license.
The nf-core framework for community-curated bioinformatics pipelines.
Philip Ewels, Alexander Peltzer, Sven Fillinger, Harshil Patel, Johannes Alneberg, Andreas Wilm, Maxime Ulysse Garcia, Paolo Di Tommaso & Sven Nahnsen.
Nat Biotechnol. 2020 Feb 13. doi: 10.1038/s41587-020-0439-x.