https://gitpod.io/#https://github.com/renatopuga/somatico
GATK 4 Mutect2 Somatic
- GATK4 - Mutect2
- Gene JAK2
- Reference: chr9
- About the human genome versions: https://gatk.broadinstitute.org/hc/en-us/articles/360035890711-GRCh37-hg19-b37-humanG1Kv37-Human-Reference-Discrepancies#grch37
- Samples:
- WP043 (tumor)
- WP044 (normal)
- https://gatk.broadinstitute.org/hc/en-us/articles/360035894731-Somatic-short-variant-discovery-SNVs-Indels-
- BAM to FASTQ
samtools view -h -b /Volumes/Seagate\ Expansion\ Drive/data-lpfap10/projects/proadi/exome/bam/WP043/WP043.bam 9:5021937-5126899 | samtools bam2fq -1 tumor_R1.fq -2 tumor_R2.fq -
samtools view -h -b /Volumes/Seagate\ Expansion\ Drive/data-lpfap10/projects/proadi/exome/bam/WP044/WP044.bam 9:5021937-5126899 | samtools bam2fq -1 normal_R1.fq -2 normal_R2.fq -
- BAM to BAM
samtools view -h -b /Volumes/Seagate\ Expansion\ Drive/data-lpfap10/projects/proadi/exome/bam/WP043/WP043.bam 9:5021937-5126899 > tumor_JAK2.bam
samtools view -h -b /Volumes/Seagate\ Expansion\ Drive/data-lpfap10/projects/proadi/exome/bam/WP044/WP044.bam 9:5021937-5126899 > normal_JAK2.bam
- Generate the BAM index (.bai)
samtools index tumor_JAK2.bam
samtools index normal_JAK2.bam
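The tumor and normal commands above differ only in the sample ID and the output label, so they can be generated from one loop. The sketch below is a dry run that only prints the commands; `BAM_DIR` is a placeholder introduced for illustration, not a path from the original setup.

```shell
# Dry run: print the per-sample extraction and indexing commands instead of executing them.
# BAM_DIR is a placeholder; point it at the directory that holds WP043/ and WP044/.
BAM_DIR="/path/to/bam"
REGION="9:5021937-5126899"   # JAK2 region used above (contig named "9" in these BAMs)

print_extract_cmds() {
  local pair sample label
  for pair in "WP043:tumor" "WP044:normal"; do
    sample="${pair%%:*}"   # text before the colon, e.g. WP043
    label="${pair##*:}"    # text after the colon, e.g. tumor
    # Quoting the input path keeps directories with spaces in one argument.
    printf 'samtools view -h -b "%s/%s/%s.bam" %s > %s_JAK2.bam\n' \
      "$BAM_DIR" "$sample" "$sample" "$REGION" "$label"
    printf 'samtools index %s_JAK2.bam\n' "$label"
  done
}

print_extract_cmds
```

Once the printed commands look right, they can be piped to `bash` or pasted into the terminal.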
- VCF header
zgrep "^#" af-only-gnomad.raw.sites.chr.vcf.gz > header
- JAK2 gene region only
zgrep -w "^chr9" af-only-gnomad.raw.sites.chr.vcf.gz | awk '$2>=5021937 && $2<=5126899' > JAK2.region
- Concatenate header + VCF
cat header JAK2.region > af-only-gnomad-chr9.vcf
- Compress
bgzip af-only-gnomad-chr9.vcf
- Index the VCF
tabix -p vcf af-only-gnomad-chr9.vcf.gz
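The header and region steps above can be tried on a tiny synthetic VCF before touching the full gnomAD file. This is a sketch under assumptions: the records are made up, and plain `gzip` stands in for `bgzip` so it runs without HTSlib installed (the result is therefore not tabix-indexable).

```shell
# Build a tiny gzip-compressed VCF (records are made up for illustration).
workdir="$(mktemp -d)"
{
  printf '##fileformat=VCFv4.2\n'
  printf '#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n'
  printf 'chr9\t5000000\t.\tA\tG\t.\t.\tAF=0.01\n'
  printf 'chr9\t5073770\t.\tG\tT\t.\t.\tAF=0.0003\n'
  printf 'chr10\t5073770\t.\tC\tT\t.\t.\tAF=0.2\n'
} | gzip -c > "$workdir/demo.vcf.gz"

# Header lines start with "#"; anchoring the pattern avoids matching "#" elsewhere in a line.
gzip -dc "$workdir/demo.vcf.gz" | grep '^#' > "$workdir/header"

# Keep chr9 records whose POS (column 2) falls inside the JAK2 interval.
gzip -dc "$workdir/demo.vcf.gz" \
  | awk -F'\t' '$1 == "chr9" && $2 >= 5021937 && $2 <= 5126899' > "$workdir/JAK2.region"

# Header first, then the selected records.
cat "$workdir/header" "$workdir/JAK2.region" > "$workdir/subset.vcf"
cat "$workdir/subset.vcf"
```

Only the second chr9 record survives: the first lies outside the interval and the third is on another chromosome.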
- Download
wget -c https://hgdownload.soe.ucsc.edu/goldenPath/hg19/chromosomes/chr9.fa.gz
- Decompress
gunzip chr9.fa.gz
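The region in the BAM commands is written as `9:...`, while the UCSC hg19 FASTA names the same chromosome `chr9`; this is the kind of naming difference the article on genome versions linked above describes. A minimal sketch for listing the contig names a FASTA declares, shown on a synthetic file (the sequence is made up):

```shell
# List the contig names declared in a FASTA file
# (text after ">" up to the first whitespace on each header line).
fasta_contigs() {
  awk '/^>/ { sub(/^>/, "", $1); print $1 }' "$1"
}

# Tiny synthetic FASTA standing in for chr9.fa.
demo_fa="$(mktemp)"
printf '>chr9 demo\nACGTACGT\nNNNNACGT\n' > "$demo_fa"

fasta_contigs "$demo_fa"
```

On the real data, the output of `fasta_contigs chr9.fa` can be compared with the `@SQ` lines printed by `samtools view -H tumor_JAK2.bam`.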
Alignment algorithm.
- BWA install (Mac)
brew install bwa
- BWA install (Ubuntu)
sudo apt-get install bwa
- BWA install (Docker)
docker pull comics/bwa
- Index chr9
bwa index chr9.fa
Samtools is a suite of programs for interacting with high-throughput sequencing data. It consists of three separate repositories: Samtools, BCFtools, and HTSlib.
- samtools install (Mac)
brew install samtools
- samtools install (Ubuntu)
sudo apt-get install samtools
- samtools install (Docker)
docker pull biocontainers/samtools
- samtools faidx
samtools faidx chr9.fa
- samtools index
samtools index tumor_JAK2.bam
samtools index normal_JAK2.bam
Version: 4.2.2.0
Genome Analysis Toolkit - Variant Discovery in High-Throughput Sequencing Data. https://gatk.broadinstitute.org/
GATK4 install Docker
docker pull broadinstitute/gatk:4.2.2.0
GATK4 install Mac and Linux
- Download
wget -c https://github.com/broadinstitute/gatk/releases/download/4.2.2.0/gatk-4.2.2.0.zip
- Decompress
unzip gatk-4.2.2.0.zip
- Testing gatk
./gatk-4.2.2.0/gatk
./gatk-4.2.2.0/gatk CreateSequenceDictionary -R chr9.fa -O chr9.dict
./gatk-4.2.2.0/gatk ScatterIntervalsByNs -R chr9.fa -O chr9.interval_list -OT ACGT
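The steps so far leave several companion files next to the inputs (`.fai`, `.dict`, `.bai`, `.tbi`), and the GATK commands that follow expect them to be present. A preflight sketch, assuming the default output names of the commands above and that everything sits in one directory:

```shell
# Report which of the files produced by the earlier steps are missing or empty.
# Usage: check_inputs [directory]   (defaults to the current directory)
# Returns non-zero if anything is absent.
check_inputs() {
  local dir="${1:-.}" missing=0 f
  for f in chr9.fa chr9.fa.fai chr9.dict chr9.interval_list \
           tumor_JAK2.bam tumor_JAK2.bam.bai \
           normal_JAK2.bam normal_JAK2.bam.bai \
           af-only-gnomad-chr9.vcf.gz af-only-gnomad-chr9.vcf.gz.tbi; do
    if [ ! -s "$dir/$f" ]; then
      echo "missing or empty: $f"
      missing=1
    fi
  done
  return "$missing"
}

check_inputs . || echo "fix the files listed above before running Mutect2"
```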
Call somatic SNVs and indels via local assembly of haplotypes
./gatk-4.2.2.0/gatk Mutect2 -R chr9.fa -I tumor_JAK2.bam -I normal_JAK2.bam -normal WP044 --germline-resource af-only-gnomad-chr9.vcf.gz -O somatic.vcf.gz -L chr9.interval_list
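The value given to `-normal` is a sample name, which has to match the `SM` field in the `@RG` lines of the normal BAM header rather than the file name. The sketch below pulls `SM` values out of SAM header text; the header fed to it is synthetic, and `rg_samples` is a helper name introduced here.

```shell
# Print the unique SM (sample) values found in @RG lines of a SAM header read from stdin.
rg_samples() {
  awk -F'\t' '/^@RG/ { for (i = 2; i <= NF; i++) if ($i ~ /^SM:/) print substr($i, 4) }' | sort -u
}

# Synthetic header for illustration only.
printf '@HD\tVN:1.6\tSO:coordinate\n@SQ\tSN:9\tLN:141213431\n@RG\tID:demo\tSM:WP044\tPL:ILLUMINA\n' \
  | rg_samples
```

With samtools installed, the real header can be checked with `samtools view -H normal_JAK2.bam | rg_samples`.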
Tabulates pileup metrics for inferring contamination
- GetPileupSummaries Tumor
./gatk-4.2.2.0/gatk GetPileupSummaries -I tumor_JAK2.bam -V af-only-gnomad-chr9.vcf.gz -L chr9.interval_list -O tumor_JAK2.table
- GetPileupSummaries Normal
./gatk-4.2.2.0/gatk GetPileupSummaries -I normal_JAK2.bam -V af-only-gnomad-chr9.vcf.gz -L chr9.interval_list -O normal_JAK2.table
Calculate the fraction of reads coming from cross-sample contamination
./gatk-4.2.2.0/gatk CalculateContamination -I tumor_JAK2.table -matched normal_JAK2.table -O contamination.table
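Before passing the table to the filtering step it is worth looking at the estimate itself. The sketch below assumes `contamination.table` is tab-separated with a header row containing `sample` and `contamination` columns; the demo values are made up, and `contamination_summary` is a helper name introduced here.

```shell
# Print "sample<TAB>contamination as a percentage", looking columns up by header name
# so the sketch does not depend on column order.
contamination_summary() {
  awk -F'\t' '
    NR == 1 { for (i = 1; i <= NF; i++) col[$i] = i; next }
    { printf "%s\t%.2f%%\n", $(col["sample"]), $(col["contamination"]) * 100 }
  ' "$1"
}

# Synthetic table standing in for contamination.table (values are made up).
demo_table="$(mktemp)"
printf 'sample\tcontamination\terror\nWP043\t0.0125\t0.0031\n' > "$demo_table"

contamination_summary "$demo_table"
```

On the real output the call would be `contamination_summary contamination.table`.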
Filter somatic SNVs and indels called by Mutect2
./gatk-4.2.2.0/gatk FilterMutectCalls -R chr9.fa -V somatic.vcf.gz --contamination-table contamination.table -O filtered.vcf.gz
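FilterMutectCalls keeps every record and writes its verdict to the FILTER column, so a quick tally shows how many calls passed. A minimal sketch on a synthetic VCF (records are made up; `filter_counts` is a helper name introduced here):

```shell
# Count records per FILTER value (column 7) in a gzip- or bgzip-compressed VCF.
filter_counts() {
  gzip -dc < "$1" \
    | awk -F'\t' '!/^#/ { n[$7]++ } END { for (k in n) print k "\t" n[k] }' \
    | sort
}

# Synthetic stand-in for filtered.vcf.gz.
demo_vcf="$(mktemp)"
{
  printf '##fileformat=VCFv4.2\n'
  printf '#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n'
  printf 'chr9\t5050000\t.\tA\tG\t.\tPASS\t.\n'
  printf 'chr9\t5060000\t.\tC\tT\t.\tweak_evidence\t.\n'
  printf 'chr9\t5070000\t.\tG\tA\t.\tPASS\t.\n'
} | gzip -c > "$demo_vcf"

filter_counts "$demo_vcf"
```

On the real output the call would be `filter_counts filtered.vcf.gz`.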
Functional Annotator
- 30G Source Somatic (1s)
- ftp://[email protected]/bundle/funcotator/funcotator_dataSources.v1.7.20200521s.tar.gz
- Download Funcotator
wget -c ftp://[email protected]/bundle/funcotator/funcotator_dataSources.v1.7.20200521s.tar.gz
- Decompress
tar -zxvf funcotator_dataSources.v1.7.20200521s.tar.gz
- Annotate
./gatk-4.2.2.0/gatk Funcotator --data-sources-path funcotator_dataSources.v1.7.20200521s -O funcotator.maf --output-file-format MAF --ref-version hg19 --reference chr9.fa --variant filtered.vcf.gz
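A MAF file is a wide tab-separated table, so it helps to pull out a few columns by name. The sketch below assumes comment lines start with `#` and the first remaining line is the column header; `maf_columns` is a helper name introduced here, and the demo row is made up rather than taken from `funcotator.maf`.

```shell
# Print the requested columns of a MAF file, selected by header name.
# Usage: maf_columns FILE COLUMN [COLUMN...]
# Note: column names are assumed to exist in the header; no validation is done.
maf_columns() {
  local file="$1"; shift
  awk -F'\t' -v want="$*" '
    /^#/ { next }                      # skip comment lines
    !seen {                            # first non-comment line is the header
      seen = 1
      n = split(want, names, " ")
      for (i = 1; i <= NF; i++) col[$i] = i
      next
    }
    {
      line = ""
      for (j = 1; j <= n; j++) line = line (j > 1 ? "\t" : "") $(col[names[j]])
      print line
    }
  ' "$file"
}

# Synthetic MAF with a made-up record.
demo_maf="$(mktemp)"
printf '#version 2.4\nHugo_Symbol\tChromosome\tStart_Position\tVariant_Classification\nJAK2\tchr9\t5073770\tMissense_Mutation\n' > "$demo_maf"

maf_columns "$demo_maf" Hugo_Symbol Variant_Classification
```

On the real output the call would be `maf_columns funcotator.maf Hugo_Symbol Variant_Classification`.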