Call regions of homozygosity and make tentative UPD calls
Usage: rhocall [OPTIONS] COMMAND [ARGS]...
Options:
--help Show this message and exit.
Commands:
aggregate Aggregate runs of autozygosity from rhofile...
annotate Markup VCF file using rho-calls.
call Call runs of autozygosity.
tally Tally runs of autozygosity from rhofile.
viz Plot binned zygosity and RHO-regions.
Usage: rhocall call [OPTIONS] VCF
Call runs of autozygosity.
Options:
-m, --max_hets FLOAT Max heterozygotes per Mb in a homozygous
block
-f, --max_het_fraction FLOAT Max heterozygotes over homozygotes fraction
in a homozygous block
-n, --minimum_homs INTEGER Minimum absolute number of homozygotes to
report a block
-s, --shortest_block INTEGER Shortest block
-u, --flag_upd_at_fraction FLOAT
Flag UPD if homozygous blocks span this
fraction of total chr size
-k, --individual INTEGER Index of individual in vcf/bcf, 0-based.
-s, --block_constant INTEGER Should be adapted to type of analysis(exome
or genome)
-v, --verbose
--help Show this message and exit.
Usage: rhocall aggregate [OPTIONS] ROH
Aggregate runs of autozygosity from rhofile into windowed rho BED file.
Accepts a bcftools roh style TSV-file with CHR,POS,AZ,QUAL.
Options:
-q, --quality_threshold FLOAT Minimum quality trusted to start or end ROH-
windows.
-v, --verbose
-o, --output FILENAME
--help Show this message and exit.
Usage: rhocall tally [OPTIONS] ROH
Tally runs of autozygosity from rhofile. Accepts a bcftools roh style TSV-
file with CHR,POS,AZ,QUAL.
Options:
-q, --quality_threshold FLOAT Minimum quality that counts towards region
totals.
-u, --flag_upd_at_fraction FLOAT
Flag UPD if this fraction of chr quality
positions called AZ.
-v, --verbose
-o, --output FILENAME
--help Show this message and exit.
Usage: rhocall annotate [OPTIONS] VCF
Markup VCF file using rho-calls. Use BED file to mark all variants in AZ
windows. Alternatively, use a bcftools v>=1.4 file with RG entries to mark
all vars. With the --no-v14 flag, use an older bcftools v<=1.2 style roh
TSV to mark only selected AZ variants. Roh is broken in bcftools v1.3 -
do not use.
Options:
-r FILENAME Bcftools roh style TSV file with
CHR,POS,AZ,QUAL.
--v14 / --no-v14 Bcftools v1.4 or newer roh file including RG
calls.
-b FILENAME BED file with AZ windows.
-q, --quality_threshold FLOAT Minimum quality calls that are imported in
region totals.
-u, --flag_upd_at_fraction FLOAT
Flag UPD if this fraction of chr quality
positions called AZ.
-v, --verbose
-o, --output FILENAME
--help Show this message and exit.
Plot binned zygosity and RHO-regions.
Options:
--out_dir PATH Output directory. The files will be named
out_dir/chr.png. One picture is drawn per
chromosome. [required]
--wig / --no-wig Produce wig file.
-p, --pointsize INTEGER Size of the points (pixels)
-r, --rho FILENAME Input RHO file produced from rhocall [required]
-m, --minsnv INTEGER Minimum number of snvs for each plotted bin
-M, --maxsnv INTEGER Maximum number of snvs for each plotted bin
--minaf FLOAT Minimum allele frequency. This variable must be
set to 0 if the allele frequency is not
annotated.
--maxaf FLOAT Maximum allele frequency
--aftag TEXT The allele frequency tag to use.
-q, --minqual INTEGER Do not add SNVs to a bin if their quality is
less than this value.
--mnv / --no-mnv Include MNV
-w, --window INTEGER Window size(bases)
-s, --rsid / --no-rsid Skip variants not containing an rsid
-n, --filter / --no-filter include variants, even if they are not labeled
PASS
-v, --verbose
--help Show this message and exit.
Usage: rhoviz [OPTIONS] -i input.vcf -r rho.tab -d output_dir
Visualise the rhocall output file. Genomic regions labeled RHO are visualised as red lines.
Additionally, the fraction of homozygous snps are visualised as black dots.
Options:
-r FILENAME Input RHO file produced from rhocall
--help Show this message and exit.
-i Input vcf file
-d output directory, the files will be named dir/chr.png,
-w window size(bases) (default = 10 000)
-m minimum number of snvs for each plotted bin (default =2)
-M maximum number of snvs for each plotted bin (default = 20)
--minaf minimum allele frequency (default = 0.1)
(this variable must be set to 0 if the allele frequency is not annotated)
--maxaf maximum allele frequency (default = 0.9)
--aftag AFTAG the allele frequency tag (default = 1000GAF)
-q do not add snvs to a bin if there quality is less than this value (default = 60)
-p Size of the points (pixels)
-n include variants, even if they are not labeled PASS
bcftools query -f'%CHROM\t%POS\t%REF,%ALT\t%INFO/AF\n' popfreq.vcf.gz | bgzip -c > popfreq.tab.gz
Please see the samtools project for installation instructions, and please refer to Narasimhan et al, 2016 regarding method details.
bcftools roh --AF-file popfreq.tab.gz -I sample.bcf > sample.roh
rhocall annotate --v14 -r sample.roh -o sample.14.rho.vcf sample.bcf
rhocall aggregate sample.roh -o sample.roh.bed
rhocall annotate -b sample.roh.bed -o sample.rho.vcf sample.bcf
rhocall tally sample.roh -o sample.roh.tally.tsv
rhocall viz --rho sample.roh --wig --out_dir rhocall sample.vcf
bcftools query -f'%CHROM\t%POS\t%REF,%ALT\t%INFO/AF\n' anon-SweGen_STR_NSPHS_1000samples_snp_freq_hg19.vcf.gz | bgzip -c > anon_SweGen_161019_snp_freq_hg19.tab.gz
bcftools roh --AF-file anon_SweGen_161019_snp_freq_hg19.tab.gz -I 2016-14676_sorted_md_rreal_brecal_gvcf_vrecal_comb_BOTH.bcf > 2016-14676_sorted_md_rreal_brecal_gvcf_vrecal_comb_BOTH.roh
# bcftools <=1.2
rhocall tally 2016-14676_sorted_md_rreal_brecal_gvcf_vrecal_comb_BOTH.roh -o 2016-14676_sorted_md_rreal_brecal_gvcf_vrecal_comb_BOTH.roh.tally.tsv
rhocall annotate --no-v14 -r 2016-14676_sorted_md_rreal_brecal_gvcf_vrecal_comb_BOTH.roh 2016-14676_sorted_md_rreal_brecal_gvcf_vrecal_comb_BOTH.bcf -o 2016-14676_sorted_md_rreal_brecal_gvcf_vrecal_comb_BOTH.roh.vcf
rhocall aggregate 2016-14676_sorted_md_rreal_brecal_gvcf_vrecal_comb_BOTH.roh -o 2016-14676_sorted_md_rreal_brecal_gvcf_vrecal_comb_BOTH.roh.bed
rhocall annotate -b 2016-14676_sorted_md_rreal_brecal_gvcf_vrecal_comb_BOTH.roh.bed -o 2016-14676_sorted_md_rreal_brecal_gvcf_vrecal_comb_BOTH.rho.vcf 2016-14676_sorted_md_rreal_brecal_gvcf_vrecal_comb_BOTH.bcf
# bcftools >=1.4
rhocall annotate --v14 -r 2016-14676_sorted_md_rreal_brecal_gvcf_vrecal_comb_BOTH.14.roh -o 2016-14676_sorted_md_rreal_brecal_gvcf_vrecal_comb_BOTH.14.rho.vcf 2016-14676_sorted_md_rreal_brecal_gvcf_vrecal_comb_BOTH.bcf
# visualize: (using bcftools v1.9)
bcftools roh --AF-file /home/proj/development/rare-disease/references/grch37_anon_swegen_snp_-2016-10-19-.tab.gz -I F0010931_sorted_md_brecal_haptc_vrecal_comb_BOTH.bcf > F0010931_sorted_md_brecal_haptc_vrecal_comb_BOTH.roh
rhocall viz --wig --out_dir rhocall --aftag GNOMADAF --rho F0010931_sorted_md_brecal_haptc_vrecal_comb_BOTH.roh F0010931_sorted_md_brecal_haptc_vrecal_comb_rhocall_vt_frqf_vep_parsed_snpeff_ranked_BOTH.vcf
The test directory contains test files from the BCFtools/RoH project.
The cyvcf2 install process appears to be jinxed on certain systems/setups. In practice this means that a chained pip install on a naive system may fail. Installation of each requirement for cyvcf2 prior to installing it appears to work unconditionally.
pip install numpy; pip install Cython
pip install -r requirements.txt
pip install -e .