rmappet

Introduction

rmappet is a Nextflow pipeline for parallel alternative splicing analysis of bulk, short-read RNA sequencing data using both rMATS and Whippet. Splicing events reported by each tool are then overlapped by genomic location to identify shared events, which gives additional confidence when interpreting the results. Both single- and paired-end data are supported.

Pipeline summary

Raw read quality control and trimming (Fastp)
rMATS - Alternative splicing analysis
1. Build STAR genome index (STAR)
2. Align trimmed reads (STAR)
3. Sort and index alignments (Samtools)
4. Alternative splicing analysis using rMATS (rMATS)
5. Standardize rMATS results
Whippet - Alternative splicing analysis
1. Build whippet genome index (Whippet)
2. Quantify splicing events (Whippet)
3. Alternative splicing analysis using Whippet (Whippet)
Overlap splicing coordinates
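
The overlap step can be pictured with a small sketch. The exact matching rule rmappet applies is not described in this README, so the code below assumes that two events match when they lie on the same chromosome and strand and their coordinate ranges intersect. The Event type and both function names are hypothetical and exist only for illustration.

```python
from typing import NamedTuple


class Event(NamedTuple):
    """Hypothetical, simplified splicing event; not rmappet's data model."""
    chrom: str
    strand: str
    start: int  # inclusive
    end: int    # inclusive


def overlaps(a, b):
    """Assumed rule: same chromosome, same strand, intersecting ranges."""
    return (
        a.chrom == b.chrom
        and a.strand == b.strand
        and a.start <= b.end
        and b.start <= a.end
    )


def split_by_overlap(rmats_events, whippet_events):
    """Return (shared pairs, rMATS-only events, Whippet-only events)."""
    # Pairwise comparison is quadratic; fine for a sketch, slow for real data.
    shared = [
        (r, w) for r in rmats_events for w in whippet_events if overlaps(r, w)
    ]
    shared_rmats = {r for r, _ in shared}
    shared_whippet = {w for _, w in shared}
    rmats_only = [r for r in rmats_events if r not in shared_rmats]
    whippet_only = [w for w in whippet_events if w not in shared_whippet]
    return shared, rmats_only, whippet_only
```

The three returned groups correspond loosely to the rw_overlap, rmats_only and whippet_only files listed under Pipeline output.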

Get started

Install Nextflow (>=22.10.3)
Install Docker for local execution
Install Singularity for cluster execution

Download and test rmappet in stub mode:

nextflow run didrikolofsson/rmappet -profile test,docker -stub

Pipeline execution

The pipeline can currently be executed locally using Docker or on distributed computing clusters using SLURM and Singularity. Software dependencies are resolved using pre-built Docker and Singularity images, so users do not need to manage dependencies themselves. This section describes the required pipeline inputs and how to execute the pipeline in each supported environment.

Required inputs

Parameter file

A parameter file with necessary settings and file paths must be supplied when executing the pipeline. The parameter file should be in YAML format and contain the following information:

dev - Run the pipeline in development mode using a single sample for testing
samplesheet - Path to sample sheet in CSV format
genome - Path to genome fasta
annotation - Path to genome annotation in GTF format
outputdir - Path to output directory
readlen - Read length
libtype - Library type

Example parameter files for both single- and paired-end experiments can be found in the /data folder.
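
Before launching, it can help to confirm that a parameter file names every setting listed above. The following is a minimal sketch that assumes the file is a flat list of key: value lines; it is not a full YAML parser and is not part of rmappet. The values in EXAMPLE are placeholders chosen for illustration (the accepted values for dev, readlen and libtype are assumptions here), so consult the example files in /data for real ones.

```python
REQUIRED_KEYS = [
    "dev", "samplesheet", "genome", "annotation",
    "outputdir", "readlen", "libtype",
]

# Placeholder values only; they are not taken from the rmappet documentation.
EXAMPLE = """\
dev: false
samplesheet: path/to/samplesheet.csv
genome: path/to/genome.fa
annotation: path/to/annotation.gtf
outputdir: path/to/results
readlen: 100
libtype: fr-unstranded
"""


def read_flat_yaml(text):
    """Parse flat 'key: value' lines into a dict of strings."""
    params = {}
    for raw_line in text.splitlines():
        # Drop trailing comments; values containing '#' are not supported.
        line = raw_line.split("#", 1)[0].strip()
        if not line or ":" not in line:
            continue
        key, value = line.split(":", 1)
        params[key.strip()] = value.strip().strip("'\"")
    return params


def missing_keys(params):
    """Return required keys that are absent or have an empty value."""
    return [k for k in REQUIRED_KEYS if not params.get(k)]


if __name__ == "__main__":
    print(missing_keys(read_flat_yaml(EXAMPLE)))
```

An empty list means every required key is present; it says nothing about whether the paths exist or the values are valid.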

Sample sheet

A sample sheet with information about the experimental design should be included together with the parameter file. The sample sheet should be in CSV format and contain the following columns and information:

sample_id	read1	read2	condition
sample1	path/to/sample1_1.fastq.gz	path/to/sample1_2.fastq.gz	condition_a
sample2	path/to/sample2_1.fastq.gz	path/to/sample2_2.fastq.gz	condition_a
sample3	path/to/sample3_1.fastq.gz	path/to/sample3_2.fastq.gz	condition_a
sample4	path/to/sample4_1.fastq.gz	path/to/sample4_2.fastq.gz	condition_b
sample5	path/to/sample5_1.fastq.gz	path/to/sample5_2.fastq.gz	condition_b
sample6	path/to/sample6_1.fastq.gz	path/to/sample6_2.fastq.gz	condition_b

Examples of sample sheets can be found in the /data folder.
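
A quick sanity check of the sample sheet can catch common mistakes before a long run. The sketch below assumes a comma-delimited file with the paired-end columns shown above; single-end sheets may be laid out differently, so check the examples in /data. The rule that at least two conditions must be present is also an assumption, based on the pipeline comparing one condition against another. The function name is hypothetical.

```python
import csv
import io
from collections import Counter

REQUIRED_COLUMNS = ["sample_id", "read1", "read2", "condition"]


def check_samplesheet(handle):
    """Return a list of human-readable problems; empty if none were found."""
    reader = csv.DictReader(handle)
    header = reader.fieldnames or []
    missing = [c for c in REQUIRED_COLUMNS if c not in header]
    if missing:
        return ["missing column(s): " + ", ".join(missing)]

    rows = list(reader)
    problems = []

    # Each sample_id should appear exactly once.
    counts = Counter((row["sample_id"] or "").strip() for row in rows)
    for sample_id, n in counts.items():
        if n > 1:
            problems.append(f"duplicate sample_id: {sample_id}")

    # Every row needs at least a first read file.
    for row in rows:
        if not (row["read1"] or "").strip():
            problems.append(f"{row['sample_id']}: read1 is empty")

    # Assumed requirement: a comparison needs two or more conditions.
    conditions = {(row["condition"] or "").strip() for row in rows}
    conditions.discard("")
    if len(conditions) < 2:
        problems.append("fewer than two conditions")

    return problems


if __name__ == "__main__":
    sheet = (
        "sample_id,read1,read2,condition\n"
        "sample1,path/to/sample1_1.fastq.gz,path/to/sample1_2.fastq.gz,condition_a\n"
        "sample4,path/to/sample4_1.fastq.gz,path/to/sample4_2.fastq.gz,condition_b\n"
    )
    print(check_samplesheet(io.StringIO(sheet)))
```

For a file on disk, pass an open handle instead, for example open("path/to/samplesheet.csv", newline="").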

Local execution

Execute the pipeline on a local computer using Docker by running the following command. Make sure that the Docker daemon is running before launch to avoid errors.

nextflow run didrikolofsson/rmappet -profile docker -params-file path/to/params.yaml

Cluster execution

Execute the pipeline on a distributed computing cluster by running the following command. Make sure that the singularity command is accessible on the head node before launch to avoid errors, e.g. by calling module load singularity on clusters with a module system.

nextflow run didrikolofsson/rmappet -profile slurm,singularity -params-file path/to/params.yaml

Pipeline output

The rmappet pipeline generates a set of output folders and files containing results from the various processing steps. The pipeline's output is structured as follows:

outputdir
├── fastp
│   ├── sample1.fastp.html
│   └── sample1.fastp.json
├── overlap
│   ├── condition_a_vs_condition_b.rmats_only.csv
│   ├── condition_a_vs_condition_b.rw_overlap.csv
│   ├── condition_a_vs_condition_b.whippet_only.csv
│   ├── condition_a_vs_condition_b.significant.rmats_only.csv
│   ├── condition_a_vs_condition_b.significant.rw_overlap.csv
│   └── condition_a_vs_condition_b.significant.whippet_only.csv
├── rmats
│   ├── results
│   │   ├── condition_a_vs_condition_b.jc.tsv
│   │   ├── condition_a_vs_condition_b.jcec.tsv
│   │   ├── condition_a_vs_condition_b.significant.jc.tsv
│   │   └── condition_a_vs_condition_b.significant.jcec.tsv
│   └── run
│       └── condition_a_vs_condition_b
│           └── condition_a_vs_condition_b.txt
├── samtools
│   └── sort
│       ├── sample1.sortedByCoord.bam
│       └── sample1.sortedByCoord.bam.bai
├── star
│   └── alignments
│       ├── sample1.Log.final.out
│       ├── sample1.ReadsPerGene.out.tab
│       └── sample1.SJ.out.tab
└── whippet
    ├── delta
    │   ├── condition_a_vs_condition_b.diff.gz
    │   └── condition_a_vs_condition_b.significant.tsv
    └── quant
        ├── sample1.gene.tpm.gz
        ├── sample1.isoform.tpm.gz
        ├── sample1.jnc.gz
        ├── sample1.map.gz
        └── sample1.psi.gz
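
For downstream scripts, the overlap file names can be derived from the naming pattern visible in the tree above. This is a small sketch with a hypothetical helper name; it assumes the comparison is always named first condition, then _vs_, then second condition, exactly as in the example, and it does not check that the files exist.

```python
from pathlib import Path


def overlap_files(outputdir, condition_a, condition_b, significant=False):
    """Build the expected paths of the three overlap CSV files."""
    comparison = f"{condition_a}_vs_{condition_b}"
    # Significant-only results carry an extra ".significant" name component.
    middle = ".significant" if significant else ""
    overlap_dir = Path(outputdir) / "overlap"
    return {
        kind: overlap_dir / f"{comparison}{middle}.{kind}.csv"
        for kind in ("rmats_only", "rw_overlap", "whippet_only")
    }


if __name__ == "__main__":
    paths = overlap_files("outputdir", "condition_a", "condition_b")
    for kind, path in paths.items():
        print(kind, path)
```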

Troubleshooting

Please note that rmappet is currently under active development, and we are still working to fix bugs and add features. If you have any questions, suggestions, or issues, please feel free to contact us or open an issue.

Contact

Didrik Olofsson ([email protected])

Dr. Alexander Neumann ([email protected])

Prof. Dr. Florian Heyd ([email protected])
