This workflow takes either FASTQs or gVCFs as input and emits a joint-called multi-sample VCF. Please see Read the Docs for additional documentation.
You can find a small test dataset and pre-configured files for this pipeline here.
- Esha Joshi
- Cameron Palmer
- Bari Jane Ballew (@bballew)
If you use this workflow in a paper, don't forget to give credit to the authors by citing the URL of this (original) repository.
Clone this repository to your local system, into the place where you want to perform the data analysis.
git clone [email protected]:data-analysis5/54gene-wgs-germline.git
The pipeline inputs include:
- A configuration file
- A manifest file
- A list of intervals
- A sex linker file
- A MultiQC config file (provided)
If needed, install miniconda by following the steps here.
- Create a conda environment with, minimally, the dependencies defined in environment.yaml.
# create the env
conda env create -f environment.yaml
Activate the conda environment:
conda activate 54gene-wgs-germline
Test your configuration by performing a dry-run via
snakemake --use-conda -n
Execute the workflow locally via
snakemake --use-conda --cores $N
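The command above leaves $N for you to set. A minimal sketch of one way to pick a value, assuming you want to use every online core on the local machine; the helper name detect_cores is hypothetical and not part of the pipeline, and the snippet only prints the resulting command so you can inspect it before running it:

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the pipeline): print the number of
# online CPU cores, falling back to 1 if detection fails.
detect_cores() {
    local n
    n=$(getconf _NPROCESSORS_ONLN 2>/dev/null) || n=1
    # Guard against empty or non-numeric output.
    case "$n" in
        ''|*[!0-9]*) n=1 ;;
    esac
    echo "$n"
}

N=$(detect_cores)
# Print the command rather than running it.
echo "snakemake --use-conda --cores $N"
```

On a shared machine you may prefer a smaller value than the full core count.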
To run the pipeline in a cluster environment, edit wrapper.sh as needed for your system, and then run via
bash run.sh
Alternatively, you may run snakemake pipelines on a cluster via something like this
snakemake --use-conda --cluster sbatch --jobs 100
Upon pipeline completion, verify that all steps have completed without error by checking the top-level Snakemake log. The bottom few lines of the log should contain something like nnn of nnn steps (100%) done. Additional job logs (when run on a cluster) are stored in the logs/ directory.
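The completion check described above can be scripted. A minimal sketch, assuming you pass the path of the top-level Snakemake log as an argument (its location depends on how the run was launched); the function name check_snakemake_done and the choice to inspect the last 10 lines are illustrative assumptions, not part of the pipeline:

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the pipeline): report whether the tail of
# a Snakemake log contains the "N of N steps (100%) done" line.
check_snakemake_done() {
    local log_file="$1"
    # Return 2 if the log file does not exist.
    [ -f "$log_file" ] || { echo "log not found: $log_file" >&2; return 2; }
    # Search only the last 10 lines (an arbitrary window for this sketch).
    if tail -n 10 "$log_file" | grep -Eq '[0-9]+ of [0-9]+ steps \(100%\) done'; then
        echo "complete"
    else
        echo "incomplete"
        return 1
    fi
}
```

Call it as check_snakemake_done path/to/your.log; the exit status is 0 when the completion line is found, 1 when it is not, and 2 when the file is missing. A passing check does not replace reading the log if anything looks wrong.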
All pipeline results are stored in the results/ directory.
The hard-filtered, joint-called VCF can be found in results/HaplotypeCaller/filtered/HC_variants.hardfiltered.vcf.gz.
For future joint-calling, the gVCFs are located at results/HaplotypeCaller/called/<sample>_all_chroms.g.vcf.gz.
Deduplicated and post-BQSR bams are found at results/bqsr/<sample>.bam.
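As a quick sanity check, the output locations listed above can be tested for existence. A minimal sketch, assuming it is run from the pipeline's working directory and that you supply the sample names yourself as arguments; check_file and check_outputs are hypothetical helper names, not part of the pipeline:

```shell
#!/usr/bin/env bash
# Hypothetical helpers (not part of the pipeline).

# Print a message and fail if the file is missing or empty.
check_file() {
    [ -s "$1" ] || { echo "missing or empty: $1"; return 1; }
}

# Check the joint-called VCF plus the per-sample gVCF and BAM for each
# sample name given as an argument. Returns 1 if anything is missing.
check_outputs() {
    local status=0 sample
    check_file "results/HaplotypeCaller/filtered/HC_variants.hardfiltered.vcf.gz" || status=1
    for sample in "$@"; do
        check_file "results/HaplotypeCaller/called/${sample}_all_chroms.g.vcf.gz" || status=1
        check_file "results/bqsr/${sample}.bam" || status=1
    done
    return "$status"
}
```

For example, check_outputs sampleA sampleB prints one line per missing or empty file and nothing when all files are present. This only confirms that the files exist and are non-empty; it does not validate their contents.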