Here we demonstrate a denovo assembly for an empirical RAD data set to give a general idea of the results you might expect to recover. This example was run on a 20-core workstation with 64GB RAM, and takes about 20 minutes to run completely.
We will use the 13-taxon Pedicularis data set from Eaton and Ree (2013) (open access link). This data set is composed of single-end 75bp reads from a RAD-seq library prepared with the PstI enzyme. It also serves as an example for several of our analysis cookbooks that demonstrate methods for analyzing RAD-seq results, so after you finish this assembly, head over there to check out fun ways to analyze the data.
These data are archived on the NCBI sequence read archive (SRA) under
accession id SRP021469. We've written a convenient wrapper around
sra-tools that lets ipyrad easily download data using SRA, SRP, ERA, and
similar IDs. (For more information on this see this post.)
Run the code below to download and decompress the fastq data files,
which will be saved into a directory called example_empirical_data/,
or whatever you wish to name it. The directory will be created if it doesn't
already exist. The compressed file size is approximately 1.1GB.
## first we need to download two additional tools
>>> conda install -c bioconda sra-tools entrez-direct
## then, download the fastq data from the SRA database
>>> ipyrad --download SRP021469 example_empirical_data/
Always start by using the -n {name} argument to create a new named Assembly.
I'll use the name pedicularis to indicate the taxa being assembled.
>>> ipyrad -n pedicularis
This will print the message:
New file 'params-pedicularis.txt' created in /home/deren/Documents/ipyrad/tests
In this case, the data come to us already demultiplexed, so we simply set the sorted_fastq_path to tell ipyrad the location of our data files. You can select multiple files at once using a wildcard pattern; in this example we use an asterisk (*.gz) to select all files in the directory ending in .gz. We also set a project_dir, which is useful for grouping all our results into a single directory. We'll name the project directory "analysis-ipyrad". If this folder doesn't exist then ipyrad will create it. When entering the values below into your params file, take care that they correspond to parameters 1 and 4, respectively.
## Use your text editor to enter the following values:
## The wildcard (*) tells ipyrad to select all files ending in .gz
analysis-ipyrad                    ## [1] [project_dir] ...
example_empirical_data/*.gz        ## [4] [sorted_fastq_path] ...
We'll add a few additional options as well to: filter for adapters (param 16); trim the 3' edge of R1 aligned loci by 5bp (param 26; this is optional, but helps to remove poorly aligned 3' edges); and produce all output formats (param 27).
## enter the following params as well
2                                  ## [16] [filter_adapters] ...
0, 5, 0, 0                         ## [26] [trim_loci] ...
*                                  ## [27] [output_formats] ...
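If you prefer to script these edits rather than use a text editor, the idea can be sketched in a few lines of Python. This is only an illustration, not part of ipyrad: the helper `set_param` is hypothetical, and it assumes every line of the params file has the form `value ## [N] [name] ...` as in the snippets above. The example lines are stand-ins for a real params file.

```python
def set_param(lines, index, value):
    """Return a copy of `lines` with the value of parameter `index` replaced.

    Assumes (hypothetically) that each params line looks like
    'value   ## [N] [name] ...'.
    """
    tag = f"## [{index}] "
    updated = []
    for line in lines:
        if tag in line:
            # Keep the '## [N] [name] ...' comment and swap only the value.
            comment = line[line.index(tag):]
            line = f"{value:<34} {comment}"
        updated.append(line)
    return updated


# Stand-in lines imitating a params file (not copied from a real one).
example = [
    "pedicularis                        ## [0] [assembly_name] ...",
    "./                                 ## [1] [project_dir] ...",
    "                                   ## [4] [sorted_fastq_path] ...",
    "0                                  ## [16] [filter_adapters] ...",
]

new = set_param(example, 1, "analysis-ipyrad")
new = set_param(new, 4, "example_empirical_data/*.gz")
new = set_param(new, 16, "2")
print("\n".join(new))
```

Matching on the full tag `## [1] ` (including the closing bracket) keeps parameter 1 from being confused with parameter 16.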
We'll leave the remaining parameters at their default values.
Start an ipyrad assembly by running step 1. When the data location is entered
as a sorted_fastq_path (param 4), as opposed to the raw_fastq_path
(param 2), step 1 simply counts the number of reads for each Sample and
parses the file names to extract names for each Sample. For example, the
file 29154_superba.fastq.gz will be assigned to Sample 29154_superba.
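The file-name-to-Sample mapping described above can be pictured with a small Python sketch. This is not ipyrad's own parser, which may apply additional rules; the function `sample_name` is a hypothetical helper that simply strips the directory and the `.fastq.gz` (or `.fq.gz`) extension.

```python
import os


def sample_name(path):
    """Derive a Sample name from a fastq file path (illustrative only)."""
    name = os.path.basename(path)
    # Drop the compression suffix first, then the fastq suffix.
    if name.endswith(".gz"):
        name = name[:-3]
    for ext in (".fastq", ".fq"):
        if name.endswith(ext):
            name = name[: -len(ext)]
            break
    return name


print(sample_name("example_empirical_data/29154_superba.fastq.gz"))
```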
We use the -s argument followed by 1 to tell ipyrad to run step 1. We also
pass it the -r argument so that it will print a results summary when finished.
>>> ipyrad -p params-pedicularis.txt -s 1 -r
 -------------------------------------------------------------
  ipyrad [v.0.7.28]
  Interactive assembly and analysis of RAD-seq data
 -------------------------------------------------------------
  New Assembly: pedicularis
  host compute node: [20 cores] on tinus

  Step 1: Loading sorted fastq data to Samples
  [####################] 100%  loading reads         | 0:00:11
  13 fastq files loaded to 13 Samples.

  Summary stats of Assembly pedicularis
  ------------------------------------------------
                          state  reads_raw
  29154_superba               1     696994
  30556_thamno                1    1452316
  30686_cyathophylla          1    1253109
  32082_przewalskii           1     964244
  33413_thamno                1     636625
  33588_przewalskii           1    1002923
  35236_rex                   1    1803858
  35855_rex                   1    1409843
  38362_rex                   1    1391175
  39618_rex                   1     822263
  40578_rex                   1    1707942
  41478_cyathophylloides      1    2199740
  41954_cyathophylloides      1    2199613
Because the point of this tutorial is to demonstrate run times and statistics, I will leave the rest of the parameters at their defaults and simply run all remaining steps. Further below I will explain in more detail the stats files for each step and what the values mean. To fully assemble this data set on a 4-core laptop takes about 2.25 hours. The example here was run on a 20-core workstation and can finish in ~20 minutes.
## run steps 2-7
>>> ipyrad -p params-pedicularis.txt -s 234567 -r
 -------------------------------------------------------------
  ipyrad [v.0.7.28]
  Interactive assembly and analysis of RAD-seq data
 -------------------------------------------------------------
  loading Assembly: pedicularis
  from saved path: ~/Documents/ipyrad/tests/analysis-ipyrad/pedicularis.json
  host compute node: [20 cores] on tinus

  Step 2: Filtering reads
  [####################] 100%  processing reads      | 0:01:21

  Step 3: Clustering/Mapping reads
  [####################] 100%  dereplicating         | 0:00:09
  [####################] 100%  clustering            | 0:05:02
  [####################] 100%  building clusters     | 0:00:30
  [####################] 100%  chunking              | 0:00:05
  [####################] 100%  aligning              | 0:03:27
  [####################] 100%  concatenating         | 0:00:17

  Step 4: Joint estimation of error rate and heterozygosity
  [####################] 100%  inferring [H, E]      | 0:01:17

  Step 5: Consensus base calling
  Mean error  [0.00283 sd=0.00081]
  Mean hetero [0.01563 sd=0.00238]
  [####################] 100%  calculating depths    | 0:00:05
  [####################] 100%  chunking clusters     | 0:00:07
  [####################] 100%  consens calling       | 0:03:12

  Step 6: Clustering at 0.85 similarity across 13 samples
  [####################] 100%  concat/shuffle input  | 0:00:06
  [####################] 100%  clustering across     | 0:03:16
  [####################] 100%  building clusters     | 0:00:06
  [####################] 100%  aligning clusters     | 0:01:14
  [####################] 100%  database indels       | 0:00:15
  [####################] 100%  indexing clusters     | 0:00:09
  [####################] 100%  building database     | 0:00:30

  Step 7: Filter and write output files for 13 Samples
  [####################] 100%  filtering loci        | 0:00:06
  [####################] 100%  building loci/stats   | 0:00:01
  [####################] 100%  building vcf file     | 0:00:08
  [####################] 100%  writing vcf file      | 0:00:00
  [####################] 100%  building arrays       | 0:00:04
  [####################] 100%  writing outfiles      | 0:01:48
  Outfiles written to: ~/Documents/ipyrad/tests/analysis-ipyrad/pedicularis_outfiles

  Summary stats of Assembly pedicularis
  ------------------------------------------------
                          state  reads_raw  reads_passed_filter  clusters_total
  29154_superba               6     696994               689996          130735
  30556_thamno                6    1452316              1440314          199587
  30686_cyathophylla          6    1253109              1206947          233183
  32082_przewalskii           6     964244               955480          146566
  33413_thamno                6     636625               626084          169514
  33588_przewalskii           6    1002923               993873          153089
  35236_rex                   6    1803858              1787366          410136
  35855_rex                   6    1409843              1397068          169357
  38362_rex                   6    1391175              1379626          128389
  39618_rex                   6     822263               813990          142844
  40578_rex                   6    1707942              1695523          215721
  41478_cyathophylloides      6    2199740              2185364          166229
  41954_cyathophylloides      6    2199613              2176210          293120

                          clusters_hidepth  hetero_est  error_est  reads_consens
  29154_superba                      34539    0.015084   0.002612          32913
  30556_thamno                       51736    0.016421   0.003716          48957
  30686_cyathophylla                 53357    0.014842   0.003001          50649
  32082_przewalskii                  41518    0.018446   0.002874          39315
  33413_thamno                       30913    0.017537   0.002662          29417
  33588_przewalskii                  45282    0.018394   0.002772          42987
  35236_rex                          53678    0.015655   0.001939          51485
  35855_rex                          55421    0.019357   0.003986          52107
  38362_rex                          51863    0.012369   0.002065          49989
  39618_rex                          43044    0.014691   0.002916          41122
  40578_rex                          55350    0.015747   0.002098          53177
  41478_cyathophylloides             53965    0.012430   0.001714          51816
  41954_cyathophylloides             73857    0.012264   0.004415          70662

  Full stats files
  ------------------------------------------------
  step 1: ./analysis-ipyrad/pedicularis_s1_demultiplex_stats.txt
  step 2: ./analysis-ipyrad/pedicularis_edits/s2_rawedit_stats.txt
  step 3: ./analysis-ipyrad/pedicularis_clust_0.85/s3_cluster_stats.txt
  step 4: ./analysis-ipyrad/pedicularis_clust_0.85/s4_joint_estimate.txt
  step 5: ./analysis-ipyrad/pedicularis_consens/s5_consens_stats.txt
  step 6: ./analysis-ipyrad/pedicularis_consens/s6_cluster_stats.txt
  step 7: ./analysis-ipyrad/pedicularis_outfiles/pedicularis_stats.txt
Each assembly that finishes step 7 will create a stats.txt output summary in the 'assembly_name'_outfiles/ directory. This includes information about which filters removed data from the assembly, how many loci were recovered per sample, how many samples had data for each locus, and how many variable sites are in the assembled data.
cat ./analysis-ipyrad/pedicularis_outfiles/pedicularis_stats.txt
## The number of loci caught by each filter.
## ipyrad API location: [assembly].statsfiles.s7_filters

                            total_filters  applied_order  retained_loci
total_prefiltered_loci              88341              0          88341
filtered_by_rm_duplicates            2566           2566          85775
filtered_by_max_indels                518            518          85257
filtered_by_max_snps                  212            121          85136
filtered_by_max_shared_het            946            908          84228
filtered_by_min_sample              39170          38942          45286
filtered_by_max_alleles             10196           5101          40185
total_filtered_loci                 40185              0          40185


## The number of loci recovered for each Sample.
## ipyrad API location: [assembly].stats_dfs.s7_samples

                        sample_coverage
29154_superba                     20755
30556_thamno                      30996
30686_cyathophylla                26288
32082_przewalskii                 14496
33413_thamno                      18214
33588_przewalskii                 16846
35236_rex                         32353
35855_rex                         32397
38362_rex                         32795
39618_rex                         27194
40578_rex                         33154
41478_cyathophylloides            30667
41954_cyathophylloides            27961


## The number of loci for which N taxa have data.
## ipyrad API location: [assembly].stats_dfs.s7_loci

    locus_coverage  sum_coverage
1                0             0
2                0             0
3                0             0
4             5136          5136
5             3702          8838
6             3311         12149
7             2942         15091
8             3028         18119
9             4014         22133
10            4904         27037
11            5486         32523
12            4740         37263
13            2922         40185


## The distribution of SNPs (var and pis) per locus.
## var = Number of loci with n variable sites (pis + autapomorphies)
## pis = Number of loci with n parsimony informative site (minor allele in >1 sample)
## ipyrad API location: [assembly].stats_dfs.s7_snps

     var  sum_var    pis  sum_pis
0   2107        0  10483        0
1   3878     3878   9695     9695
2   5048    13974   7088    23871
3   5365    30069   4765    38166
4   4921    49753   3084    50502
5   4330    71403   1960    60302
6   3532    92595   1260    67862
7   2975   113420    819    73595
8   2253   131444    489    77507
9   1743   147131    270    79937
10  1331   160441    144    81377
11   948   170869     73    82180
12   665   178849     26    82492
13   388   183893     19    82739
14   271   187687      9    82865
15   178   190357      0    82865
16   111   192133      1    82881
17    65   193238      0    82881
18    39   193940      0    82881
19    27   194453      0    82881
20    10   194653      0    82881
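A quick way to read the filter table is to check its arithmetic: subtracting each applied_order count in turn from the prefiltered total reproduces the retained_loci column. The Python sketch below copies the numbers from the stats file above. My reading that total_filters counts the loci each filter would catch on its own, while applied_order counts those removed when filters run in sequence, is an assumption.

```python
# Values copied from the s7 filter table above.
total_prefiltered = 88341
applied_order = [
    ("rm_duplicates", 2566),
    ("max_indels", 518),
    ("max_snps", 121),
    ("max_shared_het", 908),
    ("min_sample", 38942),
    ("max_alleles", 5101),
]

retained = total_prefiltered
for name, removed in applied_order:
    # Each filter removes loci from whatever survived the previous filters.
    retained -= removed
    print(f"{name:<16} removed {removed:>6}   retained {retained:>6}")

# Share of the final loci that have data for all 13 samples
# (2922 is the locus_coverage value for N=13 in the table above).
full_coverage = 2922 / retained
print(f"loci shared by all 13 samples: {full_coverage:.1%}")
```

In this run the min_sample filter accounts for most of the removed loci, so min_samples_locus is the first parameter to revisit if you branch the assembly.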
This is the first place I look when an assembly finishes. It provides a clean view of the data with variable sites (-) and parsimony informative SNPs (*) highlighted. Use the unix commands less or head to look at this file briefly. Each locus is labelled with a number corresponding to the locus order before filters are applied in step 7. If you branch this assembly and run step 7 again with a different set of parameters you may recover fewer or more total loci.
## head -n 50 prints just the first 50 lines of the file to stdout
head -n 50 analysis-ipyrad/pedicularis_outfiles/pedicularis.loci
29154_superba TCTGGTCCCGCGGGTGATCAAGGCCCCACCACCGCGTCTCACATTTTCGATCTCAGGCG
30556_thamno TCCGGTCCCGCGGGTGATCAAGGCCCCACCACCGCGTCTCACATTCTAGATCTCAGGCG
30686_cyathophylla TCCAGTCCCGCGGGTGATCAAGGCCCCACCACCGCATCTCACATTCTCGATCTCAGGCG
33413_thamno TCCGGTCCTTCGGGTGATCAAGGCCCCACCACCGCGTCTCACATTCTAGATCTCAGGCG
35236_rex TCCGGTCCCGCGGGTGATCAAGGCCCCACCACCGCGTCTCACATTCTMGATCTCAGGCG
35855_rex TCCGGTCCCGCGGGTGATCAAGGCCCCACCACCGCGTCTCACATTCTAGATCTCAGGCG
38362_rex TCCGGTCCTTCGGGTGATCAAGGCCCCACCACCGCGTCTCACATTCTAGATCTCAGGCG
40578_rex TCCGGTCCYKCGGGTGATCAAGGCCCCACCACCGCGTCTCACATTCTCGATCTCAGGCG
41478_cyathophylloides TCCGGTCCCGCGGGTGATCAAGGCCCCACCACCGCGTCTCACATTATCGATCTCAGGCG
41954_cyathophylloides TCCGGTCCCGCGGGTGATCAAGGCCCCACCACCGCGTCTCACATTATCGATCTCAGGCG
// -- ** - * * |1|
29154_superba TAAAAGCGAGTCACATCTAATGATCTAAAATCTGTAGTATTGTGAAATATATGCTTAAA
30556_thamno TAAAAGCGAGTCACATCTAATGATCTAAAATCTGTGGTATTGTGAAATATATGCTTAAA
30686_cyathophylla TAAAAGCGAGTCACATCTAATGATCTANAATCTGTGGTATTGTGAAATATATGCTTAAA
33413_thamno TAAAAGCAAGTCACATCTAATGATCTAAAATCTGTGGTATTGTGAAATATATGCTTAAA
35236_rex TAAAAGCGAGTCACATCTAATGATCTAAAATCTGTGGTATTGTGAAATATATGCTTAAA
35855_rex TAAAAGCGAGTCACATCTAATGATCTAAAATCTGTGGTATTGTGAAATATATGCTTAAA
38362_rex TAAAAGCGAGTCACATCTAATGATCTAAAATCTGTGGTATTGTGAAATATATGCTCAAA
39618_rex TAAAAGCGAGTCACATCTAATGATCTAAAATCTGTGGTATTGTGAAATATATGCTCAAA
40578_rex TAAAAGCGAGTCACATCTAATGATCTAAAMTCTGTGGTATTGTGAAATATATGCTTAAA
41478_cyathophylloides TAAAAGCGAGTCACATCTAATGATCTAAAATCTGTGGTATTGTGAAATATATGCTTAAA
41954_cyathophylloides TAAAAGCGAGTCACATCTAATGATCTAAAATCTGTGGTATTGTGAAATATATGCTTAAA
// - - - * |3|
29154_superba AATGGGTTGTTCCATGGATAACAACTCCGTTTTATRCCAAATACTGTGACACGCACRCA
32082_przewalskii AATGGGTTGTTCCATGGTTAACAACTCCGTTTTATGCCAACTACTGCGACACACACGCA
33588_przewalskii AATGGGTTGTTCCATGGTTAACAACTCCGTTTTATGCCAACTACTGCGACACGCACGCA
41478_cyathophylloides AATGGGTTGTTCCATGGATAACAACTCCGTTTTATGCCAAATACTGTGACACGCACGCA
41954_cyathophylloides AATGGGTTGTTCCATGGATAACAACTCCGTTTTATGCCAAATACTGTGACACGCACGCA
// * - * * - - |5|
29154_superba AGCCGATTCGGTCGCGAGCAGCGATATTTTGTTTCCCCTCAAAATCTTCACAATCTCTA
30686_cyathophylla AGCCGATTTGGTTGCGAGCAGCGATATTTTGTTTCCCCTCAAAATCTTCACAATCTCCG
35236_rex AGCYGATTTGGTCGCGAGCAGCGATGTTTTGTTTCCCCTCAAAATCTTCATAATCTCTA
38362_rex AGCYGATTTGGTYGCGAGCAGCGATRTTTTGYTTCCCCTCAAAATCTTCAYAATCTCYR
41478_cyathophylloides AGCCGATTTGGTTGCGAGCAGCGATATTTTGTTTCCCCTCAAAATCTTCACAATCTCCA
41954_cyathophylloides AGCCGATTTGGTTGCGAGCAGCGATATTTTGTTTCCCCTCAAAATCTTCACAATCTCCA
// * - * * - * **|7|
29154_superba TCGACGCCATGTATGACTGTTCAAAATATCAAATGTACT-ATTACNACCACCCTTTTTT
30686_cyathophylla TCGACGCCATGTATGACTGTTCAAAATATCAAATGTACTAATTACCACCACCCTTTTTT
38362_rex TCGACGCCATGTATGACTGTTCAAAATATCAAATGTACT-ATTACCACCACCCTTTTTT
40578_rex TCGACGCCATNTATGACTGTTCAAAATATCAAACGTACT-ATTACCACCACCCTTTTTT
// - |14|
29154_superba ATCGATCATTTCGCCTCACAGTTGCTGGGTGCAGAAAAANNTCTTCATCTGATTCAGGT
30556_thamno ATCGATCATTTCTTCTCACAGTTGCTGGGTGCAGAAAAAATTCTTCATCTGATTCAGGT
30686_cyathophylla ATCGATCATTTCGCCTCACAGTTGCTGGGTGCAGAAAAAATTCTTCATCTGATTCAGGT
32082_przewalskii ATCGATCATTTCGCCTCACAGTTGCTGGATGCAGAAAAAATTCTTCATCTGATTCAGGT
33413_thamno ATCGATCATTTCTNCTCACAGTTGCTGGGTNCAGAAAA---------------------
33588_przewalskii ATCGATCATTTCGCCTCACAGTTGCTGGATGCAGAAAAAATTCTTCATCTGATTCAGGT
35236_rex ATCGATCATTTCTCCTCACAGTTGCTGGGTGCAGAAAAAATTCTTCATCTGATTCAGGT
35855_rex ATCGATCATTTCTCCTCACAGTTGCTGGGTGCAGAAAAAATTCTTCATCTGATTCAGGT
38362_rex ATCGATCATTTCTCCTCACAGTTGCTGGGTGCAAAAAAAATTCTTCATCTGATTCAGGT
39618_rex ATCGATCATTTCTCCTCACAGTTGCTGGGTGCAAAAAAAATTCTTCATCTGATTCAGGT
40578_rex ATCGATCATTTCTCCTCACAGTTGCTGGGTGCAGAAAAAATTCTTCATCTGATTCAGGT
41478_cyathophylloides ATCGATCATTTCGCCTCACAGTTGCTGGGTGCAGAAAAAATTCTTCATCTGATTCAGGT
41954_cyathophylloides ATCGATCATTTCGCCTCACAGTTGCTGGGTGCAGAAAAAATTCTTCATCTGATTCAGGT
// *- * * |16|
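Because the .loci format is plain text, it is easy to parse yourself. The sketch below is a minimal, unofficial reader: it assumes each data line is `name sequence` and that each locus ends with a line starting with `//` whose trailing `|N|` holds the locus number, as in the excerpt above. The function name `parse_loci` is hypothetical, and the two loci in the example are copied from that excerpt.

```python
def parse_loci(text):
    """Split .loci text into a list of (locus_id, {sample: sequence}) pairs."""
    loci = []
    current = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        if line.startswith("//"):
            # The locus number sits between the last two '|' characters.
            locus_id = line.rstrip().rstrip("|").rsplit("|", 1)[-1]
            loci.append((locus_id, current))
            current = {}
        else:
            name, seq = line.split()
            current[name] = seq
    return loci


example = """
29154_superba AATGGGTTGTTCCATGGATAACAACTCCGTTTTATRCCAAATACTGTGACACGCACRCA
32082_przewalskii AATGGGTTGTTCCATGGTTAACAACTCCGTTTTATGCCAACTACTGCGACACACACGCA
33588_przewalskii AATGGGTTGTTCCATGGTTAACAACTCCGTTTTATGCCAACTACTGCGACACGCACGCA
41478_cyathophylloides AATGGGTTGTTCCATGGATAACAACTCCGTTTTATGCCAAATACTGTGACACGCACGCA
41954_cyathophylloides AATGGGTTGTTCCATGGATAACAACTCCGTTTTATGCCAAATACTGTGACACGCACGCA
// * - * * - - |5|
29154_superba TCGACGCCATGTATGACTGTTCAAAATATCAAATGTACT-ATTACNACCACCCTTTTTT
30686_cyathophylla TCGACGCCATGTATGACTGTTCAAAATATCAAATGTACTAATTACCACCACCCTTTTTT
38362_rex TCGACGCCATGTATGACTGTTCAAAATATCAAATGTACT-ATTACCACCACCCTTTTTT
40578_rex TCGACGCCATNTATGACTGTTCAAAATATCAAACGTACT-ATTACCACCACCCTTTTTT
// - |14|
"""

loci = parse_loci(example)
for locus_id, samples in loci:
    print(f"locus {locus_id}: {len(samples)} samples")
```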
This is the concatenated sequence file of all loci in the data set. It is typically used in phylogenetic analyses, such as in the program raxml. As the header line of the file below shows, this supermatrix is 13 taxa deep by about 2.58 Mbp long.
## cut -c 1-80 prints only the first 80 characters of the file
cut -c 1-80 analysis-ipyrad/pedicularis_outfiles/pedicularis.phy
13 2577585
29154_superba              AATGATGGTGGTACACATATTAATTACAATTTGGACAACGGCGGCTTTGTTCA
30556_thamno               ACAGATGGTGGTACACATGTCAATTACAATTTGGATAACGGCGGNNNNNNNNN
30686_cyathophylla         AATGATGGTGGTACACATATTAATTACAATTTGGACAACGGCGGCTTTGTTCA
32082_przewalskii          NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN
33413_thamno               AGTGATGGTGGTACACATGTCNANTACAATTTGGACAACGGCGGCTTTGTTCN
33588_przewalskii          NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN
35236_rex                  NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN
35855_rex                  NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN
38362_rex                  NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN
39618_rex                  NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN
40578_rex                  AATGATGGTGGTACACATATYAATTACAAYTTGGAYAACGGCGGCTTTGTTCA
41478_cyathophylloides     NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN
41954_cyathophylloides     NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN
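The runs of N in the phylip file mark loci for which a sample has no data. As a rough illustration of how you might summarize that, the sketch below reads a phylip-style block and reports the fraction of missing characters (N or -) per sample. It assumes one `name sequence` row per taxon after the header. The function `missing_fraction` and the three-taxon toy input are hypothetical and are not taken from the real file.

```python
def missing_fraction(phylip_text):
    """Return (ntaxa, nsites, {sample: fraction of N or - characters})."""
    lines = phylip_text.strip().splitlines()
    ntaxa, nsites = map(int, lines[0].split())
    fractions = {}
    for line in lines[1 : 1 + ntaxa]:
        name, seq = line.split()
        # Divide by the observed row length so truncated excerpts also work.
        fractions[name] = sum(seq.count(c) for c in "N-") / len(seq)
    return ntaxa, nsites, fractions


# Toy input for illustration only.
toy = """3 10
sampleA AATGATGGTG
sampleB ACAGNNNNNN
sampleC NNNNNNNNNN
"""

ntaxa, nsites, fractions = missing_fraction(toy)
for name, frac in fractions.items():
    print(f"{name}: {frac:.0%} missing")
```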
This is similar to the phylip file format, but only variable site columns are included. All SNPs are in the file, in contrast to the .u.snps.phy file, which randomly selects only a single SNP per locus.
## cut -c 1-80 prints only the first 80 characters of the file
cut -c 1-80 analysis-ipyrad/pedicularis_outfiles/pedicularis.snps.phy
13 194653
29154_superba              ATATTCAAACTATTCAAAGTAACTGATGAAAYCTAGGGGAKCAGTTCGCGTGC
30556_thamno               CAGCTTAAATTATNNGGCGCAACCGGAGAAANNNNNNNNNGAAGGTTACATNN
30686_cyathophylla         ATATTCAAACTATAANNNNNAACTGATGAAACTTGTCGGNGCAGGTTACATGC
32082_przewalskii          NNNNNNNNNNNNNACNNNNNCANNNNNNNNNNNNNNNNNCNNNNGTCGCGTNN
33413_thamno               GTGCTCAAATTAANNNNNNNAGCCGAAGAAACCCGGCATNGCAKGTTANANNN
33588_przewalskii          NNNNNNNNNNNNNACNNNNNAANNNNNNNNNNNNNNNNNCNNNNGTCGCGYNN
35236_rex                  NNNNNNAAATTGTNNGGCGTAACCAAAGAAANNNNNNNNNGCAGGTTAAATNN
35855_rex                  NNNNNNACATTATNNNNNNNAATCGAAGAAANNNNNNNNNNNNNGTTACATGC
38362_rex                  NNNNNNGAATTATNNNNNNNAACCAAAGAAACCCG-CCGNNNNNGTTACATNN
39618_rex                  NNNNNNGAATTATNNNNNNNAACCAAAGAAACCCG-CCGNNNNNGKTACATNN
40578_rex                  ATAYYYRAATYATNNGGCKTAACCGAAGAGGNNNNNNNNNNNNNGTTACATRC
41478_cyathophylloides     NNNNNNATTTTATACNNNNNAACTGATTGAACCTAGGGGAGCGGGTTACATGT
41954_cyathophylloides     NNNNNNATTTTATACNNNNNAANNNNNNNNNCCTAGGGGAGCGGGTTACATGT
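Letters other than A, C, G, T, and N in these rows (for example Y, R, K) are standard IUPAC ambiguity codes, which here represent heterozygous base calls. The sketch below lists the two-base ambiguity codes in one sample's row and the pair of bases each stands for. The helper name `het_sites` is hypothetical, and the sequence is the 40578_rex row from the excerpt above.

```python
# Standard IUPAC codes for two-base ambiguities.
IUPAC = {"R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC"}


def het_sites(seq):
    """Return (position, code, bases) for every two-base ambiguity code."""
    return [(i, c, IUPAC[c]) for i, c in enumerate(seq) if c in IUPAC]


# The 40578_rex row from the .snps.phy excerpt above.
seq = "ATAYYYRAATYATNNGGCKTAACCGAAGAGGNNNNNNNNNNNNNGTTACATRC"

hets = het_sites(seq)
for pos, code, bases in hets:
    print(f"column {pos}: {code} = {bases[0]}/{bases[1]}")
print(f"{len(hets)} heterozygous calls in the first {len(seq)} SNP columns")
```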
The VCF output for ipyrad contains the full sequence information for all samples as well as the sequencing depth information for all base calls that were made. This file should be easily parsable if users want to extract information or modify it so that this file can be used in other software such as GATK. We are working on developing our own population-aware genotype caller that will correct low-depth base calls at this stage. Stay tuned.
## head -n 50 reads the first 50 lines of the file and passes them to the pipe (|)
## cut -c 1-80 then shows only the first 80 characters of each line
## for easier viewing.
head -n 50 analysis-ipyrad/pedicularis_outfiles/pedicularis.vcf | cut -c 1-80
##fileformat=VCFv4.0
##fileDate=2017/02/14
##source=ipyrad_v.0.7.28
##reference=pseudo-reference (most common base at site)
##phasing=unphased
##INFO=<ID=NS,Number=1,Type=Integer,Description="Number of Samples With Dat
##INFO=<ID=DP,Number=1,Type=Integer,Description="Total Depth">
##FORMAT=<ID=GT,Number=1,Type=String,Description="Genotype">
##FORMAT=<ID=DP,Number=1,Type=Integer,Description="Read Depth">
##FORMAT=<ID=CATG,Number=1,Type=String,Description="Base Counts (CATG)">
#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT 29154_superba 30556_thamno 30
locus_1 2 . A C,G 13 PASS NS=5;DP=49 GT:DP:CATG 0/0:9:0,9,0,0 1/1:7:7,0,0,0
locus_1 3 . T A 13 PASS NS=5;DP=49 GT:DP:CATG 0/0:9:0,0,9,0 1/1:7:0,7,0,0 0
locus_1 19 . A G 13 PASS NS=5;DP=49 GT:DP:CATG 0/0:9:0,9,0,0 1/1:7:0,0,0,7
locus_1 21 . C T 13 PASS NS=5;DP=49 GT:DP:CATG 1/1:9:0,0,9,0 0/0:7:7,0,0,0
locus_1 30 . T C 13 PASS NS=5;DP=49 GT:DP:CATG 0/0:9:0,0,9,0 0/0:7:0,0,7,0
locus_1 36 . C T 13 PASS NS=5;DP=49 GT:DP:CATG 0/0:9:9,0,0,0 1/1:7:0,0,7,0
locus_2 15 . A G 13 PASS NS=11;DP=210 GT:DP:CATG 0/0:12:0,12,0,0 0/0:24:0,2
locus_2 16 . A T,C 13 PASS NS=11;DP=210 GT:DP:CATG 0/0:12:0,12,0,0 0/0:24:0
locus_2 18 . A T 13 PASS NS=11;DP=210 GT:DP:CATG 0/0:12:0,12,0,0 0/0:24:0,2
locus_2 20 . T C 13 PASS NS=11;DP=210 GT:DP:CATG 1/1:12:12,0,0,0 0/0:24:0,0
locus_2 29 . T C 13 PASS NS=11;DP=209 GT:DP:CATG 0/0:12:0,0,12,0 0/0:23:0,0
locus_2 30 . A G 13 PASS NS=11;DP=209 GT:DP:CATG 0/0:12:0,12,0,0 0/0:23:0,2
locus_2 47 . T A 13 PASS NS=11;DP=210 GT:DP:CATG 0/0:12:0,0,12,0 0/0:24:0,0
locus_3 46 . A T 13 PASS NS=6;DP=69 GT:DP:CATG 1/1:10:0,0,10,0 ./.:0:0,0,0,
locus_3 62 . C A 13 PASS NS=6;DP=68 GT:DP:CATG 0/0:10:10,0,0,0 ./.:0:0,0,0,
locus_6 11 . G A 13 PASS NS=4;DP=67 GT:DP:CATG 1/1:11:0,11,0,0 0/0:7:0,0,0,
locus_6 29 . G A 13 PASS NS=4;DP=67 GT:DP:CATG 1/1:11:0,11,0,0 0/0:7:0,0,0,
locus_6 34 . C A 13 PASS NS=4;DP=67 GT:DP:CATG 1/1:11:0,11,0,0 0/0:7:7,0,0,
locus_6 35 . G T 13 PASS NS=4;DP=67 GT:DP:CATG 0/0:11:0,0,0,11 0/0:7:0,0,0,
locus_6 40 . T C 13 PASS NS=4;DP=67 GT:DP:CATG 0/0:11:0,0,11,0 1/1:7:7,0,0,
locus_9 19 . A C 13 PASS NS=13;DP=224 GT:DP:CATG 0/0:12:0,12,0,0 0/0:22:0,2
locus_9 25 . A G 13 PASS NS=13;DP=224 GT:DP:CATG 0/0:12:0,12,0,0 0/0:22:0,2
locus_11 4 . C T 13 PASS NS=10;DP=137 GT:DP:CATG 0/0:11:11,0,0,0 0/0:17:17,
locus_11 13 . C T 13 PASS NS=10;DP=137 GT:DP:CATG 1/1:11:0,1,10,0 0/0:17:17
locus_11 21 . G A 13 PASS NS=10;DP=137 GT:DP:CATG 0/0:11:0,0,0,11 0/0:17:0,
locus_11 23 . A G 13 PASS NS=10;DP=137 GT:DP:CATG 0/0:11:0,11,0,0 1/1:17:0,
locus_11 24 . A T 13 PASS NS=10;DP=137 GT:DP:CATG 1/1:11:0,0,11,0 0/0:17:0,
locus_11 38 . G T 13 PASS NS=10;DP=137 GT:DP:CATG 0/0:11:0,0,0,11 0/0:17:0,
locus_11 42 . A G 13 PASS NS=10;DP=137 GT:DP:CATG 0/0:11:0,11,0,0 0/0:17:0,
locus_11 54 . A G 13 PASS NS=10;DP=137 GT:DP:CATG 0/0:11:0,11,0,0 0/0:17:0,
locus_11 55 . A G 13 PASS NS=10;DP=137 GT:DP:CATG 0/0:11:0,11,0,0 0/0:17:0,
locus_12 6 . C T 13 PASS NS=7;DP=94 GT:DP:CATG 1/0:7:4,0,3,0 ./.:0:0,0,0,0
locus_12 20 . C T 13 PASS NS=7;DP=94 GT:DP:CATG 0/0:7:7,0,0,0 ./.:0:0,0,0,0
locus_12 31 . T C 13 PASS NS=7;DP=94 GT:DP:CATG 0/0:7:0,0,7,0 ./.:0:0,0,0,0
locus_12 33 . G A 13 PASS NS=7;DP=94 GT:DP:CATG 1/1:7:0,7,0,0 ./.:0:0,0,0,0
locus_12 37 . G T 13 PASS NS=5;DP=52 GT:DP:CATG 0/0:7:0,0,0,7 ./.:0:0,0,0,0
locus_12 43 . C G 13 PASS NS=7;DP=94 GT:DP:CATG 1/1:7:0,0,0,7 ./.:0:0,0,0,0
locus_12 45 . G C,A 13 PASS NS=7;DP=94 GT:DP:CATG 0/0:7:0,0,0,7 ./.:0:0,0,0
locus_12 47 . G T 13 PASS NS=7;DP=94 GT:DP:CATG 0/0:7:0,0,0,7 ./.:0:0,0,0,0
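To show how parsable the per-sample fields are, here is a small Python sketch that splits one record into its genotype (GT), depth (DP), and base counts. It relies on the FORMAT header above, which declares the counts in C, A, T, G order. The function names are hypothetical, and the record is rebuilt with tab separators from the first data line of the excerpt, limited to the two sample columns that are fully visible there.

```python
BASES = "CATG"  # count order declared by the CATG FORMAT header


def parse_call(call):
    """Split a 'GT:DP:CATG' sample field into (genotype, depth, base counts)."""
    gt, dp, counts = call.split(":")
    return gt, int(dp), dict(zip(BASES, map(int, counts.split(","))))


def parse_record(line, samples):
    """Parse one tab-separated VCF data line for the given sample names."""
    fields = line.rstrip("\n").split("\t")
    chrom, pos, _id, ref, alt = fields[:5]
    # Sample columns start after the nine fixed VCF columns.
    calls = {s: parse_call(c) for s, c in zip(samples, fields[9:])}
    return chrom, int(pos), ref, alt.split(","), calls


# First record of the excerpt above, rebuilt with tab separators.
line = "\t".join([
    "locus_1", "2", ".", "A", "C,G", "13", "PASS", "NS=5;DP=49",
    "GT:DP:CATG", "0/0:9:0,9,0,0", "1/1:7:7,0,0,0",
])
samples = ["29154_superba", "30556_thamno"]

chrom, pos, ref, alts, calls = parse_record(line, samples)
for name, (gt, dp, counts) in calls.items():
    print(f"{chrom}:{pos} {name} GT={gt} DP={dp} counts={counts}")
```

In this record the first sample is homozygous for the reference base A (9 reads) and the second is homozygous for the first alternate base C (7 reads), which matches their GT values of 0/0 and 1/1.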