Variant quality-checking scripts for discovering and filtering complex indel variants from Pindel-C output. Referenced in "Systematic discovery of complex insertions and deletions in human cancers" (doi:10.1038/nm.4002).
The main QC workflow is launched with bsub_qc.sh, which in turn runs qc_pipeline.sh. The inputs to bsub_qc.sh are described in the script itself.
# Steps
- Extract complex insertions and deletions from Pindel output.
- Identify somatic, germline, and loss of heterozygosity (LOH) events.
- Filter out low-coverage sites (minimum of 20 reads).
- Make unfiltered VCF output for germline, somatic, and LOH events.
- Run the readcount tool on the tumor sample. The readcount analysis checks whether somatic and LOH events are appropriately classified (not run for germline events).
- Run the readcount tool on the normal sample, for the same purpose (again, not run for germline events).
- Reclassify germline, somatic, and LOH events based on the read count data of somatic events.
- Make VCFs from the filtered Pindel output for VEP input, then annotate the final filtered VCF using VEP.
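
As an illustration of the low-coverage filtering step above, here is a minimal Python sketch. It is not taken from qc_pipeline.sh: the `Site` record, its field names, and the choice to require the 20-read minimum in both the tumor and the normal sample are assumptions made for this example only.

```python
from dataclasses import dataclass

# 20-read minimum, as stated in the steps above.
MIN_COVERAGE = 20


@dataclass
class Site:
    """Hypothetical record for one candidate indel site (not the pipeline's format)."""
    chrom: str
    pos: int
    tumor_depth: int
    normal_depth: int


def passes_coverage(site, min_coverage=MIN_COVERAGE):
    # Assumption: the threshold must be met in both samples.
    return site.tumor_depth >= min_coverage and site.normal_depth >= min_coverage


def filter_low_coverage(sites, min_coverage=MIN_COVERAGE):
    """Return only the sites that meet the minimum read depth."""
    return [site for site in sites if passes_coverage(site, min_coverage)]


if __name__ == "__main__":
    candidates = [
        Site("1", 100, 35, 40),  # kept: both depths are at least 20
        Site("1", 200, 19, 40),  # dropped: tumor depth is below 20
        Site("2", 300, 20, 20),  # kept: exactly at the threshold
    ]
    for site in filter_low_coverage(candidates):
        print(site.chrom, site.pos)
```

Treating a depth of exactly 20 as passing is also an assumption; the actual scripts may draw the boundary differently.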
Reyka Jayasinghe ([email protected]) and Steven Foltz ([email protected]).