Parliament2 identifies structural variants in a given sample relative to a reference genome. These structural variants are large events that are called as Deletions of a region, Insertions of a sequence into a region, Duplications of a region, Inversions of a region, or Translocations between two regions in the genome.
Parliament2 runs a combination of tools to generate structural variant calls on whole-genome sequencing data. It can run the following callers: Breakdancer, Breakseq2, CNVnator, Delly2, Manta, and Lumpy. Because of synergies in how the programs use computational resources, these are all run in parallel. Parliament2 will produce the outputs of each of the tools for subsequent investigation.
If the option to genotype candidates is selected, this tool will run SVTyper to genotype the events and will merge the calls from the individual tools with the program SURVIVOR. SV events that receive genotype calls have significantly higher specificity.
If the option to visualize candidates is selected, a .tar.gz file containing PDF images for each SV call will be produced in order to summarize the supporting information behind the SV.
You must have Docker installed in order to run Parliament2. No other dependencies are required. Please see the Docker documentation for installation information.
This software is released under the Apache 2.0 license, a copy of which is contained within this GitHub repo.
Parliament2 is meant to be run as a Docker image and is available on DockerHub. The simplest way to install Parliament2 on your machine is to run the command:
docker pull dnanexus/parliament2:<TAG>
where <TAG> is the version of Parliament2 you wish to use. You can see the full list of Parliament2 versions available on DockerHub at the Parliament2 DockerHub repository.
You can verify the successful installation using the command docker images, which will list every Docker image stored locally on your machine, or by running docker run dnanexus/parliament2:<TAG> -h, which will print the help string (printed in full at the bottom of this README).
If you wish to build Parliament2 from source, you can find the most up-to-date production version of Parliament2 on our release page or pull from this repository (warning: this version may be unstable; the official releases will be more stable).
Once you have made your desired modifications (if applicable), run:
cd parliament2
tar -czvf resources.tar.gz resources/
docker build . -t parliament2
There should now be a Docker image named parliament2 on your computer.
Again, you can verify the successful installation using the command docker images, which will list every Docker image stored locally on your machine, or by running docker run parliament2 -h, which will print the help string (printed in full at the bottom of this README).
To run this Docker image, you must have:
- A local file with Illumina reads on which you want to call structural variants (*.bam)
- A reference genome file to which the BAM file was mapped (*.fa.gz or *.fasta.gz)
You may also have:
- The index for the BAM file (*.bai)
- The index for the reference genome file (*.fai)
This app is intended for whole-genome sequencing. Providing exome or panel sequencing will not produce good results. It is intended to be run on a single germline sample.
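Before mounting a directory into the container, it can help to confirm that the required files are actually present. The following is a minimal sketch of such a check, not part of Parliament2 itself; the function name find_inputs and the returned dictionary layout are assumptions made for illustration, and matching is done on file extension only.

```python
from pathlib import Path

# Extensions taken from the input list above.
REFERENCE_SUFFIXES = (".fa.gz", ".fasta.gz")


def find_inputs(input_dir):
    """Group files in input_dir by their role as a Parliament2 input.

    Hypothetical helper: raises FileNotFoundError if a required input
    (BAM or reference genome) is absent. The index files are optional,
    so empty lists are acceptable for them.
    """
    names = sorted(p.name for p in Path(input_dir).iterdir() if p.is_file())
    found = {
        "bam": [n for n in names if n.endswith(".bam")],
        "ref": [n for n in names if n.endswith(REFERENCE_SUFFIXES)],
        "bai": [n for n in names if n.endswith(".bai")],
        "fai": [n for n in names if n.endswith(".fai")],
    }
    missing = [kind for kind in ("bam", "ref") if not found[kind]]
    if missing:
        raise FileNotFoundError(
            "missing required input(s): " + ", ".join(missing)
        )
    return found
```

This only checks that files with the expected extensions exist; it does not verify that the BAM file was mapped to the given reference.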
To run Parliament2, use the following command-line call:
docker run -v <LOCAL_DIR_WITH_INPUTS>:/home/dnanexus/in -v <LOCAL_DIR_FOR_OUTPUTS>:/home/dnanexus/out dnanexus/parliament2:<TAG> --bam <BAM_NAME> --bai <INDEX_NAME> --fai <REFERENCE_INDEX> -r <REFERENCE_NAME> <OPTIONAL_ARGUMENTS>
You must mount two local volumes: one that contains the inputs (*.bam and *.fa.gz) and one in which you wish to place the files generated by Parliament2. The --bai and --fai flags, along with the BAM and FASTA index files, are optional, but including either or both will speed up runtime considerably. At least one optional argument must be specified; see below for more information on the optional arguments.
For example, if your current working directory is named /home/dnanexus/ and looks like this:
.
├── input.bam
├── input.bai
├── ref_genome.fa.gz
└── outputs
where outputs is the directory in which you wish Parliament2 to place its results files, the command you would use to run Parliament2 could be:
docker run -v /home/dnanexus/:/home/dnanexus/in -v /home/dnanexus/outputs/:/home/dnanexus/out dnanexus/parliament2:<TAG> --bam input.bam --bai input.bai -r ref_genome.fa.gz --breakdancer --cnvnator --manta --genotype
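If you launch Parliament2 from a script, assembling the command as an argument list avoids quoting mistakes. The sketch below reproduces the command-line template shown above; the helper name build_parliament2_command and its parameters are hypothetical, and it only constructs the command without running Docker.

```python
import shlex


def build_parliament2_command(input_dir, output_dir, tag, bam, ref_genome,
                              bai=None, fai=None, options=()):
    """Return the docker run argument list for one Parliament2 run.

    Hypothetical helper: the mount points and flags mirror the template
    in this README; nothing is executed here.
    """
    if not options:
        # The README states that at least one optional argument is required.
        raise ValueError("at least one optional argument must be specified")
    cmd = [
        "docker", "run",
        "-v", f"{input_dir}:/home/dnanexus/in",
        "-v", f"{output_dir}:/home/dnanexus/out",
        f"dnanexus/parliament2:{tag}",
        "--bam", bam,
    ]
    if bai:
        cmd += ["--bai", bai]
    if fai:
        cmd += ["--fai", fai]
    cmd += ["-r", ref_genome]
    cmd += list(options)
    return cmd


# Rebuild the example above; <TAG> is left as a placeholder on purpose.
cmd = build_parliament2_command(
    "/home/dnanexus/", "/home/dnanexus/outputs/", "<TAG>",
    bam="input.bam", ref_genome="ref_genome.fa.gz", bai="input.bai",
    options=["--breakdancer", "--cnvnator", "--manta", "--genotype"],
)
print(shlex.join(cmd))
```

The resulting list can be passed to subprocess.run(cmd, check=True) once <TAG> has been replaced with a real version.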
This app will output a number of files, representing the outputs of each of the structural variant callers. If the option to run a given step is unselected, then those outputs will not be provided.
- lumpy.vcf: representing the structural variant calls from Lumpy in VCF format.
- lumpy.discordant.bam: representing reads that Lumpy identified as discordant and used as evidence for its calls.
- lumpy.splitters.bam: representing reads that Lumpy identified as split-read mapped and used as evidence for its calls.
- manta.diploidSV.vcf: representing genotyped structural variant calls from Manta in VCF format.
- manta.alignmentStats.txt: representing statistics about the alignment from Manta.
- breakdancer.ctx: representing structural variant calls in Breakdancer's format.
- cnvnator.output: representing structural variant calls in CNVnator's format.
- cnvnator.vcf: representing structural variant calls from CNVnator in VCF format (this represents the conversion of CNVnator output to VCF format).
- breakseq.gff: representing structural variant calls from Breakseq2 in GFF format.
- breakseq.vcf: representing structural variant calls from Breakseq2 in VCF format.
- breakseq.bam: representing the reads mapping used as evidence for the calls generated by Breakseq2.
- delly.deletion.vcf: representing deletion calls made by Delly2.
- delly.inversion.vcf: representing inversion calls made by Delly2.
- delly.duplication.vcf: representing duplication calls made by Delly2.
- delly.insertion.vcf: representing insertion calls made by Delly2.
- delly.translocation.vcf: representing translocation calls made by Delly2.
- <caller>.svtyped.vcf: If the option to genotype candidates is selected, a genotype VCF produced by SVTyper will be generated for each caller output.
- combined.genotyped.vcf: If the option to genotype candidates is selected, a merged VCF file of all of the callers will be produced by SURVIVOR.
- svviz_outputs.tar.gz: If the option to visualize events is selected, a tarball containing a set of PDFs documenting the genomic regions for calls will be generated.
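To know which files to look for after a run, the list above can be expressed as a lookup from command-line flag to output names. This is a sketch under assumptions: the pairing of each flag with its files is inferred from the file names, the name expected_outputs is hypothetical, and the actual files may carry the output prefix described under --prefix in the help string.

```python
# Inferred mapping from flags (see the help string) to the outputs listed above.
# delly.translocation.vcf is omitted because the help string shows no
# corresponding flag, and <caller>.svtyped.vcf is omitted because the
# exact caller names used in those file names are not documented here.
FLAG_OUTPUTS = {
    "--lumpy": ["lumpy.vcf", "lumpy.discordant.bam", "lumpy.splitters.bam"],
    "--manta": ["manta.diploidSV.vcf", "manta.alignmentStats.txt"],
    "--breakdancer": ["breakdancer.ctx"],
    "--cnvnator": ["cnvnator.output", "cnvnator.vcf"],
    "--breakseq": ["breakseq.gff", "breakseq.vcf", "breakseq.bam"],
    "--delly_deletion": ["delly.deletion.vcf"],
    "--delly_inversion": ["delly.inversion.vcf"],
    "--delly_duplication": ["delly.duplication.vcf"],
    "--delly_insertion": ["delly.insertion.vcf"],
    "--genotype": ["combined.genotyped.vcf"],
    "--svviz": ["svviz_outputs.tar.gz"],
}


def expected_outputs(flags):
    """Return the output names expected for the given flags, in flag order."""
    outputs = []
    for flag in flags:
        outputs.extend(FLAG_OUTPUTS.get(flag, []))
    return outputs
```

For instance, expected_outputs(["--manta", "--genotype"]) yields the two Manta files followed by combined.genotyped.vcf. Because a failed caller does not fail the whole run, a missing file from this list is a hint to inspect the log rather than proof of an error.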
Parliament2 is available as an app on DNAnexus at https://platform.dnanexus.com/app/parliament2 (note: a DNAnexus account is required to access this link; you can create one at https://platform.dnanexus.com/login). The documentation for the app is included both on the DNAnexus platform and in the dx_app_code directory of this repository.
To run Parliament2 on DNAnexus, your input BAM file must already be on the DNAnexus platform. To run Parliament2 using the graphical interface, simply click the "Run" button from the app page and select your inputs. To run Parliament2 using the command-line interface, run the command dx run parliament2 -h and follow the guide generated. For more information on running executables on DNAnexus, see the guide to running apps and applets.
General information on using DNAnexus can be found in the official documentation.
To build Parliament2 on your own on DNAnexus, you must first build the Docker image locally (see Installing). Then:
- Run dx-docker create-asset parliament2. This will take approximately 45 minutes and will generate a string that you can copy-paste into the dxapp.json file found in the dx_app_code/parliament2 directory under the "Regional Options" section for your region.
- Run dx build parliament2 from within the dx_app_code directory to build the applet.
To modify Parliament2 and run it on DNAnexus, please see the developer README in the dx_app_code/parliament2 directory of this repository.
The tool runs but I am getting a samtools error:
samtools view: writing to standard output failed: Broken pipe
samtools view: error closing standard output: -1
What's going on?
This is a known error message caused by splitting the BAM file. It doesn't affect the results in any way.
The tool runs but I am getting the following warning:
breakseq2 -2.2- has requirement pysam==0.7.7, but you'll have pysam 0.15.1 which is incompatible
What's going on?
This is a known warning caused by how we currently manage the conflicting pysam versions required for BreakSeq and SVTyper. This issue should be resolved in v0.1.10.
The tool fails or runs for a long time and I see an error message in the log:
Out of memory: Kill process XXXX (svviz) score 112 or sacrifice child
What's going on?
The svviz tool can at times consume a large amount of memory, causing the application to be killed. If you see this message in the log, you should kill the job if it is still running and retry on a machine with more memory or without enabling svviz. This issue should be resolved in v0.1.11.
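When reviewing a long log, the three messages discussed above can be picked out automatically. The following is a minimal sketch, assuming the log is available as a single string; the name triage_log and the status labels are hypothetical, and the advice strings paraphrase the answers above.

```python
import re

# (pattern, status, advice) entries, one per message discussed above.
KNOWN_LOG_MESSAGES = [
    (re.compile(r"samtools view: writing to standard output failed: Broken pipe"),
     "ignore",
     "Caused by splitting the BAM file; results are not affected."),
    (re.compile(r"has requirement pysam==0\.7\.7, but you'll have pysam"),
     "known",
     "Conflicting pysam versions required for BreakSeq and SVTyper."),
    (re.compile(r"Out of memory: Kill process \d+ \(svviz\)"),
     "retry",
     "svviz ran out of memory; retry with more memory or without svviz."),
]


def triage_log(log_text):
    """Return (status, advice) pairs for each known message found in log_text."""
    findings = []
    for pattern, status, advice in KNOWN_LOG_MESSAGES:
        if pattern.search(log_text):
            findings.append((status, advice))
    return findings
```

Only a "retry" finding calls for action; the other two statuses correspond to messages that this README describes as known.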
Because the field of structural variation is relatively new and complex, we considered it too strict to require every individual tool to complete successfully for a run to count as successful. In other words, if one of these tools fails while the others succeed, the app will output the results of the tools that completed and will not itself fail.
Breakseq2 may only work with the 1000 Genomes reference genome (hs37d5). For other reference genomes, you may not get Breakseq2 results.
For additional information, please see the following papers:
- Lumpy: Ryan M Layer, Colby Chiang, Aaron R Quinlan, and Ira M Hall. 2014. "LUMPY: a Probabilistic Framework for Structural Variant Discovery." Genome Biology 15 (6): R84. doi:10.1186/gb-2014-15-6-r84
- Manta: Chen X, Schulz-Trieglaff O, Shaw R, Barnes B, Schlesinger F, Källberg M, Cox AJ, Kruglyak S, Saunders CT. 2015. "Manta: rapid detection of structural variants and indels for germline and cancer sequencing applications." Bioinformatics. doi: 10.1093/bioinformatics/btv710
- Breakdancer: Chen K, Wallis JW, McLellan MD, Larson DE, Kalicki JM, Pohl CS, McGrath SD, Wendl MC, Zhang Q, Locke DP, Shi X, Fulton RS, Ley TJ, Wilson RK, Ding L, Mardis ER. 2009. "BreakDancer: an algorithm for high-resolution mapping of genomic structural variation". Nature Methods 6. doi:10.1038/nmeth.1363
- Breakseq2: Abyzov A, Li S, Kim DR, Mohiyuddin M, Stütz AM, Parrish NF, Mu XJ, Clark W, Chen K, Hurles M, Korbel JO, Lam HYK, Lee C, Gerstein MB. 2015. "Analysis of deletion breakpoints from 1,092 humans reveals details of mutation mechanisms". Nature Communications 6. doi:10.1038/ncomms8256
- Delly2: Tobias Rausch, Thomas Zichner, Andreas Schlattl, Adrian M. Stuetz, Vladimir Benes, Jan O. Korbel. 2012. "Delly: structural variant discovery by integrated paired-end and split-read analysis". Bioinformatics 28:i333-i339. doi: 10.1093/bioinformatics/bts378
- CNVnator: Abyzov A, Urban AE, Snyder M, Gerstein M. 2011. "CNVnator: an approach to discover, genotype, and characterize typical and atypical CNVs from family and population genome sequencing". Genome Research 21 (6):974. doi: 10.1101/gr.114876.110
- Parliament: Adam C English, William J Salerno, Oliver A Hampton, Claudia Gonzaga-Jauregui, Shruthi Ambreth, Deborah I Ritter, Christine R Beck, Caleb F Davis, Mahmoud Dahdouli, Singer Ma, Andrew Carroll, Narayanan Veeraraghavan, Jeremy Bruestle, Becky Drees, Alex Hastie, Ernest T Lam, Simon White, Pamela Mishra, Min Wang, Yi Han, Feng Zhang, Pawel Stankiewicz, David A Wheeler, Jeffrey G Reid, Donna M Muzny, Jeffrey Rogers, Aniko Sabo, Kim C Worley, James R Lupski, Eric Boerwinkle and Richard A Gibbs. Assessing structural variation in a personal genome—towards a human reference diploid genome. BMC Genomics 2015, 16:286 doi:10.1186/s12864-015-1479-3.
usage: parliament2.py [-h] --bam BAM [--bai BAI] -r REF_GENOME [--fai FAI]
                      [--prefix PREFIX] [--filter_short_contigs]
                      [--breakdancer] [--breakseq] [--manta] [--cnvnator]
                      [--lumpy] [--delly_deletion] [--delly_insertion]
                      [--delly_inversion] [--delly_duplication] [--genotype]
                      [--svviz] [--svviz_only_validated_candidates]
Parliament2
optional arguments:
  -h, --help            show this help message and exit
  --bam BAM             The name of the Illumina BAM file for which to call
                        structural variants containing mapped reads.
  --bai BAI             (Optional) The name of the corresponding index for the
                        Illumina BAM file.
  -r REF_GENOME, --ref_genome REF_GENOME
                        The name of the reference file that matches the
                        reference used to map the Illumina inputs.
  --fai FAI             (Optional) The name of the corresponding index for the
                        reference genome file.
  --prefix PREFIX       (Optional) If provided, all output files will start
                        with this. If absent, the base of the BAM file name
                        will be used.
  --filter_short_contigs
                        If selected, SV calls will not be generated on contigs
                        shorter than 1 MB.
  --breakdancer         If selected, the program Breakdancer will be one of
                        the SV callers run.
  --breakseq            If selected, the program BreakSeq2 will be one of the
                        SV callers run.
  --manta               If selected, the program Manta will be one of the SV
                        callers run.
  --cnvnator            If selected, the program CNVnator will be one of the
                        SV callers run.
  --lumpy               If selected, the program Lumpy will be one of the SV
                        callers run.
  --delly_deletion      If selected, the deletion module of the program Delly2
                        will be one of the SV callers run.
  --delly_insertion     If selected, the insertion module of the program
                        Delly2 will be one of the SV callers run.
  --delly_inversion     If selected, the inversion module of the program
                        Delly2 will be one of the SV callers run.
  --delly_duplication   If selected, the duplication module of the program
                        Delly2 will be one of the SV callers run.
  --genotype            If selected, candidate events determined from the
                        individual callers will be genotyped and merged to
                        create a consensus output.
  --svviz               If selected, visualizations of genotyped SV events
                        will be produced with SVVIZ, one screenshot of support
                        per event. For this option to take effect, Genotype
                        must be selected.
  --svviz_only_validated_candidates
                        Run SVVIZ only on validated candidates? For this
                        option to be relevant, SVVIZ must be selected. NOT
                        selecting this will make the SVVIZ component run
                        longer.
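The help string describes two dependencies between options: --svviz only takes effect together with --genotype, and --svviz_only_validated_candidates is only relevant together with --svviz. A small pre-flight check can catch these before a long run starts. This is a sketch only; check_flags is a hypothetical name, and Parliament2 itself may or may not enforce the same rules.

```python
def check_flags(flags):
    """Return a list of problems with a set of optional Parliament2 flags.

    Hypothetical pre-flight check based on the dependencies stated in
    this README; an empty list means no problem was detected.
    """
    selected = set(flags)
    problems = []
    if not selected:
        problems.append("at least one optional argument must be specified")
    if "--svviz" in selected and "--genotype" not in selected:
        problems.append("--svviz has no effect unless --genotype is selected")
    if ("--svviz_only_validated_candidates" in selected
            and "--svviz" not in selected):
        problems.append(
            "--svviz_only_validated_candidates is only relevant with --svviz"
        )
    return problems
```

For example, check_flags(["--manta", "--svviz"]) reports that --genotype is missing, while check_flags(["--manta", "--genotype", "--svviz"]) returns an empty list.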