forked from etal/cnvkit
test: fix bit rot so unit tests pass cleanly
- remove the trailing "FORMAT" column from nosample.vcf
- clarify a comment -- tabio reads no-sample VCFs as empty dataframes
- convert a deprecated pandas slice-by-integer to its numpy equivalent
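The changed pandas line is not visible in this view, so the following is only an illustrative sketch of the kind of conversion the last bullet describes. The `depths` Series and its labels are hypothetical; the assumption is that the original code indexed a Series by integer position through `[]`, which newer pandas releases deprecate when the index is not integer-based.

```python
import pandas as pd

# Hypothetical data: a Series whose index labels are strings, not positions.
depths = pd.Series([10.0, 12.5, 9.0, 11.0],
                   index=["chr1", "chr2", "chr3", "chr4"])

# Go through the underlying NumPy array for positional access, so there is
# no ambiguity between labels and positions in Series.__getitem__.
values = depths.to_numpy()
first_value = values[0]   # first element by position
head = values[:2]         # first two elements by position
```

`Series.iloc` would be the pandas-native alternative; indexing the NumPy array matches the wording of the commit message.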
Showing 4 changed files with 16 additions and 261 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,140 +1,17 @@ | ||
##fileformat=VCFv4.2 | ||
##fileDate=20150729 | ||
##source=SNV-Unifier | ||
##reference=file:///home/dnanexus/genome.fa | ||
##source=custom | ||
##reference=genome.fa | ||
##phasing=none | ||
##INFO=<ID=SOMATIC,Number=0,Type=Flag,Description="Somatic event"> | ||
##INFO=<ID=NS,Number=1,Type=Integer,Description="Number of samples with data"> | ||
##INFO=<ID=DP,Number=1,Type=Integer,Description="Total read depth at the locus"> | ||
##INFO=<ID=DPB,Number=1,Type=Float,Description="Total read depth per bp at the locus; bases in reads overlapping / bases in haplotype"> | ||
##INFO=<ID=AC,Number=A,Type=Integer,Description="Total number of alternate alleles in called genotypes"> | ||
##INFO=<ID=AN,Number=1,Type=Integer,Description="Total number of alleles in called genotypes"> | ||
##INFO=<ID=AF,Number=A,Type=Float,Description="Estimated allele frequency in the range (0,1]"> | ||
##INFO=<ID=RO,Number=1,Type=Integer,Description="Reference allele observation count, with partial observations recorded fractionally"> | ||
##INFO=<ID=AO,Number=A,Type=Integer,Description="Alternate allele observations, with partial observations recorded fractionally"> | ||
##INFO=<ID=PRO,Number=1,Type=Float,Description="Reference allele observation count, with partial observations recorded fractionally"> | ||
##INFO=<ID=PAO,Number=A,Type=Float,Description="Alternate allele observations, with partial observations recorded fractionally"> | ||
##INFO=<ID=QR,Number=1,Type=Integer,Description="Reference allele quality sum in phred"> | ||
##INFO=<ID=QA,Number=A,Type=Integer,Description="Alternate allele quality sum in phred"> | ||
##INFO=<ID=PQR,Number=1,Type=Float,Description="Reference allele quality sum in phred for partial observations"> | ||
##INFO=<ID=PQA,Number=A,Type=Float,Description="Alternate allele quality sum in phred for partial observations"> | ||
##INFO=<ID=SRF,Number=1,Type=Integer,Description="Number of reference observations on the forward strand"> | ||
##INFO=<ID=SRR,Number=1,Type=Integer,Description="Number of reference observations on the reverse strand"> | ||
##INFO=<ID=SAF,Number=A,Type=Integer,Description="Number of alternate observations on the forward strand"> | ||
##INFO=<ID=SAR,Number=A,Type=Integer,Description="Number of alternate observations on the reverse strand"> | ||
##INFO=<ID=SRP,Number=1,Type=Float,Description="Strand balance probability for the reference allele: Phred-scaled upper-bounds estimate of the probability of observing the deviation between SRF and SRR given E(SRF/SRR) ~ 0.5, derived using Hoeffding's inequality"> | ||
##INFO=<ID=SAP,Number=A,Type=Float,Description="Strand balance probability for the alternate allele: Phred-scaled upper-bounds estimate of the probability of observing the deviation between SAF and SAR given E(SAF/SAR) ~ 0.5, derived using Hoeffding's inequality"> | ||
##INFO=<ID=AB,Number=A,Type=Float,Description="Allele balance at heterozygous sites: a number between 0 and 1 representing the ratio of reads showing the reference allele to all reads, considering only reads from individuals called as heterozygous"> | ||
##INFO=<ID=ABP,Number=A,Type=Float,Description="Allele balance probability at heterozygous sites: Phred-scaled upper-bounds estimate of the probability of observing the deviation between ABR and ABA given E(ABR/ABA) ~ 0.5, derived using Hoeffding's inequality"> | ||
##INFO=<ID=RUN,Number=A,Type=Integer,Description="Run length: the number of consecutive repeats of the alternate allele in the reference genome"> | ||
##INFO=<ID=RPP,Number=A,Type=Float,Description="Read Placement Probability: Phred-scaled upper-bounds estimate of the probability of observing the deviation between RPL and RPR given E(RPL/RPR) ~ 0.5, derived using Hoeffding's inequality"> | ||
##INFO=<ID=RPPR,Number=1,Type=Float,Description="Read Placement Probability for reference observations: Phred-scaled upper-bounds estimate of the probability of observing the deviation between RPL and RPR given E(RPL/RPR) ~ 0.5, derived using Hoeffding's inequality"> | ||
##INFO=<ID=RPL,Number=A,Type=Float,Description="Reads Placed Left: number of reads supporting the alternate balanced to the left (5') of the alternate allele"> | ||
##INFO=<ID=RPR,Number=A,Type=Float,Description="Reads Placed Right: number of reads supporting the alternate balanced to the right (3') of the alternate allele"> | ||
##INFO=<ID=EPP,Number=A,Type=Float,Description="End Placement Probability: Phred-scaled upper-bounds estimate of the probability of observing the deviation between EL and ER given E(EL/ER) ~ 0.5, derived using Hoeffding's inequality"> | ||
##INFO=<ID=EPPR,Number=1,Type=Float,Description="End Placement Probability for reference observations: Phred-scaled upper-bounds estimate of the probability of observing the deviation between EL and ER given E(EL/ER) ~ 0.5, derived using Hoeffding's inequality"> | ||
##INFO=<ID=DPRA,Number=A,Type=Float,Description="Alternate allele depth ratio. Ratio between depth in samples with each called alternate allele and those without."> | ||
##INFO=<ID=ODDS,Number=1,Type=Float,Description="The log odds ratio of the best genotype combination to the second-best."> | ||
##INFO=<ID=GTI,Number=1,Type=Integer,Description="Number of genotyping iterations required to reach convergence or bailout."> | ||
##INFO=<ID=TYPE,Number=A,Type=String,Description="The type of allele, either snp, mnp, ins, del, or complex."> | ||
##INFO=<ID=CIGAR,Number=A,Type=String,Description="The extended CIGAR representation of each alternate allele, with the exception that '=' is replaced by 'M' to ease VCF parsing. Note that INDEL alleles do not have the first matched base (which is provided by default, per the spec) referred to by the CIGAR."> | ||
##INFO=<ID=NUMALT,Number=1,Type=Integer,Description="Number of unique non-reference alleles in called genotypes at this position."> | ||
##INFO=<ID=MEANALT,Number=A,Type=Float,Description="Mean number of unique non-reference allele observations per sample with the corresponding alternate alleles."> | ||
##INFO=<ID=LEN,Number=A,Type=Integer,Description="allele length"> | ||
##INFO=<ID=MQM,Number=A,Type=Float,Description="Mean mapping quality of observed alternate alleles"> | ||
##INFO=<ID=MQMR,Number=1,Type=Float,Description="Mean mapping quality of observed reference alleles"> | ||
##INFO=<ID=PAIRED,Number=A,Type=Float,Description="Proportion of observed alternate alleles which are supported by properly paired read fragments"> | ||
##INFO=<ID=PAIREDR,Number=1,Type=Float,Description="Proportion of observed reference alleles which are supported by properly paired read fragments"> | ||
##INFO=<ID=technology.ILLUMINA,Number=A,Type=Float,Description="Fraction of observations supporting the alternate observed in reads from ILLUMINA"> | ||
##INFO=<ID=SPLITMULTIALLELIC,Number=0,Type=Flag,Description="Variant represents one alternate allele from a multiallelic variant call."> | ||
##INFO=<ID=DB,Number=0,Type=Flag,Description="dbSNP Membership"> | ||
##INFO=<ID=MQ0,Number=1,Type=Integer,Description="Total Mapping Quality Zero Reads"> | ||
##INFO=<ID=VT,Number=1,Type=String,Description="Variant type, can be SNP, MNP, INS or DEL"> | ||
##INFO=<ID=JOINED,Number=0,Type=Flag,Description="Variant is the result of joining adjacent SNPs/MNPs. Please use caution when interpreting sample-specific values."> | ||
##INFO=<ID=ALERT,Number=0,Type=Flag,Description="Variant is adjacent to other variants, but case was too complex for joinAdjacentSNPs to handle. Please review."> | ||
##INFO=<ID=END,Number=1,Type=Integer,Description="End position of the variant described in this record"> | ||
##INFO=<ID=HOMLEN,Number=1,Type=Integer,Description="Length of base pair identical micro-homology at event breakpoints"> | ||
##INFO=<ID=PF,Number=1,Type=Integer,Description="The number of samples carry the variant"> | ||
##INFO=<ID=HOMSEQ,Number=.,Type=String,Description="Sequence of base pair identical micro-homology at event breakpoints"> | ||
##INFO=<ID=SVLEN,Number=1,Type=Integer,Description="Difference in length between REF and ALT alleles"> | ||
##INFO=<ID=SVTYPE,Number=1,Type=String,Description="Type of structural variant"> | ||
##INFO=<ID=NTLEN,Number=.,Type=Integer,Description="Number of bases inserted in place of deleted code"> | ||
##INFO=<ID=BaseQRankSum,Number=1,Type=Float,Description="Z-score from Wilcoxon rank sum test of Alt Vs. Ref base qualities"> | ||
##INFO=<ID=DS,Number=0,Type=Flag,Description="Were any of the samples downsampled?"> | ||
##INFO=<ID=Dels,Number=1,Type=Float,Description="Fraction of Reads Containing Spanning Deletions"> | ||
##INFO=<ID=FS,Number=1,Type=Float,Description="Phred-scaled p-value using Fisher's exact test to detect strand bias"> | ||
##INFO=<ID=HaplotypeScore,Number=1,Type=Float,Description="Consistency of the site with at most two segregating haplotypes"> | ||
##INFO=<ID=InbreedingCoeff,Number=1,Type=Float,Description="Inbreeding coefficient as estimated from the genotype likelihoods per-sample when compared against the Hardy-Weinberg expectation"> | ||
##INFO=<ID=MLEAC,Number=A,Type=Integer,Description="Maximum likelihood expectation (MLE) for the allele counts (not necessarily the same as the AC), for each ALT allele, in the same order as listed"> | ||
##INFO=<ID=MLEAF,Number=A,Type=Float,Description="Maximum likelihood expectation (MLE) for the allele frequency (not necessarily the same as the AF), for each ALT allele, in the same order as listed"> | ||
##INFO=<ID=DP,Number=1,Type=Integer,Description="Total read depth at the locus"> | ||
##INFO=<ID=MQ,Number=1,Type=Float,Description="RMS Mapping Quality"> | ||
##INFO=<ID=MQRankSum,Number=1,Type=Float,Description="Z-score From Wilcoxon rank sum test of Alt vs. Ref read mapping qualities"> | ||
##INFO=<ID=QD,Number=1,Type=Float,Description="Variant Confidence/Quality by Depth"> | ||
##INFO=<ID=RPA,Number=.,Type=Integer,Description="Number of times tandem repeat unit is repeated, for each allele (including reference)"> | ||
##INFO=<ID=RU,Number=1,Type=String,Description="Tandem repeat unit (bases)"> | ||
##INFO=<ID=ReadPosRankSum,Number=1,Type=Float,Description="Z-score from Wilcoxon rank sum test of Alt vs. Ref read position bias"> | ||
##INFO=<ID=SOR,Number=1,Type=Float,Description="Symmetric Odds Ratio of 2x2 contingency table to detect strand bias"> | ||
##INFO=<ID=STR,Number=0,Type=Flag,Description="Variant is a short tandem repeat"> | ||
##INFO=<ID=ClippingRankSum,Number=1,Type=Float,Description="Z-score From Wilcoxon rank sum test of Alt vs. Ref number of hard clipped bases"> | ||
##INFO=<ID=SF,Number=.,Type=String,Description="Source File (index to sourceFiles, f when filtered)"> | ||
##INFO=<ID=ADERROR,Number=0,Type=Flag,Description="The AD field from the original variant caller VCF has an error at this locus. Please use caution when interpreting AD, GMIMAF, and GMICOV."> | ||
##FORMAT=<ID=GQ,Number=1,Type=Float,Description="Genotype Quality, the Phred-scaled marginal (or unconditional) probability of the called genotype"> | ||
##FORMAT=<ID=GL,Number=G,Type=Float,Description="Genotype Likelihood, log10-scaled likelihoods of the data given the called genotype for each possible genotype generated from the reference and alternate alleles given the sample ploidy"> | ||
##FORMAT=<ID=RO,Number=1,Type=Integer,Description="Reference allele observation count"> | ||
##FORMAT=<ID=QR,Number=1,Type=Integer,Description="Sum of quality of the reference observations"> | ||
##FORMAT=<ID=AO,Number=A,Type=Integer,Description="Alternate allele observation count"> | ||
##FORMAT=<ID=QA,Number=A,Type=Integer,Description="Sum of quality of the alternate observations"> | ||
##FORMAT=<ID=BQ,Number=A,Type=Float,Description="Average base quality for reads supporting alleles"> | ||
##FORMAT=<ID=FA,Number=A,Type=Float,Description="Allele fraction of the alternate allele with regard to reference"> | ||
##FORMAT=<ID=PL,Number=G,Type=Integer,Description="Normalized, Phred-scaled likelihoods for genotypes as defined in the VCF specification"> | ||
##FORMAT=<ID=SS,Number=1,Type=Integer,Description="Variant status relative to non-adjacent Normal,0=wildtype,1=germline,2=somatic,3=LOH,4=post-transcriptional modification,5=unknown"> | ||
##FORMAT=<ID=RD,Number=1,Type=Integer,Description="Reference depth, how many reads support the reference"> | ||
##FORMAT=<ID=GMIMUT,Number=A,Type=Integer,Description="GMI-derived mutant read count: the number of reads observed at the variant locus that support each ALT allele."> | ||
##FORMAT=<ID=GMIMAF,Number=A,Type=Integer,Description="GMI-derived MAF: the proportion of reads observed at the variant locus that support each ALT allele."> | ||
##FORMAT=<ID=GMICOV,Number=1,Type=Integer,Description="GMI-derived coverage: total read depth at the variant locus."> | ||
##FORMAT=<ID=GT,Number=1,Type=String,Description="Genotype"> | ||
##FORMAT=<ID=AD,Number=2,Type=Integer,Description="# of reads supporting consensus reference/indel at the site"> | ||
##FORMAT=<ID=DP,Number=1,Type=Integer,Description="Total coverage at the site"> | ||
##FORMAT=<ID=GT,Number=1,Type=String,Description="Genotype"> | ||
##FORMAT=<ID=MM,Number=2,Type=Float,Description="Average # of mismatches per ref-/consensus indel-supporting read"> | ||
##FORMAT=<ID=MQS,Number=2,Type=Float,Description="Average mapping qualities of ref-/consensus indel-supporting reads"> | ||
##FORMAT=<ID=NQSBQ,Number=2,Type=Float,Description="Within NQS window: average quality of bases in ref-/consensus indel-supporting reads"> | ||
##FORMAT=<ID=NQSMM,Number=2,Type=Float,Description="Within NQS window: fraction of mismatching bases in ref/consensus indel-supporting reads"> | ||
##FORMAT=<ID=REnd,Number=2,Type=Integer,Description="Median/mad of indel offsets from the ends of the reads"> | ||
##FORMAT=<ID=RStart,Number=2,Type=Integer,Description="Median/mad of indel offsets from the starts of the reads"> | ||
##FORMAT=<ID=SC,Number=4,Type=Integer,Description="Strandness: counts of forward-/reverse-aligned reference and indel-supporting reads (FwdRef,RevRef,FwdIndel,RevIndel)"> | ||
##FORMAT=<ID=CallHC,Number=1,Type=Integer,Description="Variant was called by HaplotypeCaller"> | ||
##FORMAT=<ID=CallUG,Number=1,Type=Integer,Description="Variant was called by UnifiedGenotyper"> | ||
##FORMAT=<ID=CallFB,Number=1,Type=Integer,Description="Variant was called by freeBayes"> | ||
##FORMAT=<ID=CallPI,Number=1,Type=Integer,Description="Variant was called by pindel"> | ||
##FORMAT=<ID=CallSID,Number=1,Type=Integer,Description="Variant was called by SomaticIndelDetector"> | ||
##FORMAT=<ID=CallMU,Number=1,Type=Integer,Description="Variant was called by MuTect"> | ||
##FORMAT=<ID=LR,Number=1,Type=Float,Description="CNV log2 ratio"> | ||
##FORMAT=<ID=MMQ,Number=A,Type=Integer,Description="median mapping quality"> | ||
##FILTER=<ID=LowQual,Description="Low quality"> | ||
##FILTER=<ID=PASS,Description="All filters passed"> | ||
##FILTER=<ID=REJECT,Description="Not somatic due to normal call frequency or phred likelihoods: tumor: 35, normal 35."> | ||
##contig=<ID=chrM,length=16571> | ||
##contig=<ID=chr1,length=249250621> | ||
##contig=<ID=chr2,length=243199373> | ||
##contig=<ID=chr3,length=198022430> | ||
##contig=<ID=chr4,length=191154276> | ||
##contig=<ID=chr5,length=180915260> | ||
##contig=<ID=chr6,length=171115067> | ||
##contig=<ID=chr7,length=159138663> | ||
##contig=<ID=chr8,length=146364022> | ||
##contig=<ID=chr9,length=141213431> | ||
##contig=<ID=chr10,length=135534747> | ||
##contig=<ID=chr11,length=135006516> | ||
##contig=<ID=chr12,length=133851895> | ||
##contig=<ID=chr13,length=115169878> | ||
##contig=<ID=chr14,length=107349540> | ||
##contig=<ID=chr15,length=102531392> | ||
##contig=<ID=chr16,length=90354753> | ||
##contig=<ID=chr17,length=81195210> | ||
##contig=<ID=chr18,length=78077248> | ||
##contig=<ID=chr19,length=59128983> | ||
##contig=<ID=chr20,length=63025520> | ||
##contig=<ID=chr21,length=48129895> | ||
##contig=<ID=chr22,length=51304566> | ||
##contig=<ID=chrX,length=155270560> | ||
##contig=<ID=chrY,length=59373566> | ||
#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT Blank |
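For context on the header fix: in the VCF format the column header line has eight fixed columns, and `FORMAT` appears only when sample columns follow it. A minimal sketch of that check is below; the helper name `sample_names` is hypothetical and is not part of cnvkit or tabio. It assumes a tab-delimited header line, as in a real VCF file (the rendering above shows the separators as spaces).

```python
# The eight mandatory VCF columns, in order.
FIXED_COLUMNS = ["#CHROM", "POS", "ID", "REF", "ALT", "QUAL", "FILTER", "INFO"]


def sample_names(header_line):
    """Return the sample IDs named on a tab-delimited '#CHROM' header line.

    Hypothetical helper for illustration only.
    """
    fields = header_line.rstrip("\n").split("\t")
    if fields[:8] != FIXED_COLUMNS:
        raise ValueError("not a VCF column header line")
    extra = fields[8:]
    if not extra:
        return []  # no-sample VCF: only the eight fixed columns
    if extra[0] != "FORMAT":
        raise ValueError("ninth column must be FORMAT when samples are present")
    return extra[1:]  # everything after FORMAT is a sample ID


no_sample = "\t".join(FIXED_COLUMNS)
one_sample = "\t".join(FIXED_COLUMNS + ["FORMAT", "Blank"])
```

Here `sample_names(no_sample)` returns an empty list and `sample_names(one_sample)` returns `["Blank"]`.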