Home
Welcome to the FSV-Fragsifier wiki!
Fragsifier is an STR sequence extraction tool that uses sequence models to identify STR sequences.
The Fragsifier algorithm performs STR extraction on individual lines of the input file, so any file type that stores one sequence/read per row is a valid input. When given a FASTQ file, Fragsifier skips the header and quality lines.
Fragsifier produces two output files: one containing the sequence extracted from each line of the input file, and one containing the cumulative read counts for each unique sequence.
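To make the FASTQ handling concrete, here is a minimal sketch of keeping only the sequence lines. It is not Fragsifier's own code: the function name `fastq_sequences` is hypothetical, and the sketch assumes the common four-line FASTQ record layout (header, sequence, separator, quality) with no line-wrapped sequences.

```python
from typing import Iterable, Iterator


def fastq_sequences(lines: Iterable[str]) -> Iterator[str]:
    """Yield only the sequence line of each four-line FASTQ record.

    Assumption: every record is exactly four lines long.
    """
    for index, line in enumerate(lines):
        # Position within a record: 0 = header, 1 = sequence,
        # 2 = separator ("+"), 3 = quality string.
        if index % 4 == 1:
            yield line.rstrip("\n")
```

Passing an open file handle as `lines` would stream the reads one at a time without loading the whole file.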
Extractions file
Each line in the extractions file reports the STR sequence extracted from the corresponding line/read of the input; the line is empty if no STR was found. A non-empty line holds four colon-separated fields: the STR locus, the orientation of the sequence (forward or reverse), the extracted sequence, and the flanking sequence alignment score.
DYS481:F:CTTCTTCTTCTTCTTCTTCTTCTTCTTCTTCTTCTTCTTCTTCTTCTTCTTCTTCTTCTTCTTCTT:25.0
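A line in this format could be read back with a few lines of Python. This is only a sketch: the names `Extraction` and `parse_extraction` are hypothetical, and it assumes that none of the four fields contains a colon and that the orientation is a single letter, as the `F` in the example suggests.

```python
from typing import NamedTuple, Optional


class Extraction(NamedTuple):
    locus: str
    orientation: str   # "F" in the example; other codes are an assumption
    sequence: str
    flank_score: float


def parse_extraction(line: str) -> Optional[Extraction]:
    """Parse one extractions-file line; return None for an empty line."""
    line = line.strip()
    if not line:
        # An empty line means no STR was found in that read.
        return None
    locus, orientation, sequence, score = line.split(":")
    return Extraction(locus, orientation, sequence, float(score))
```

Returning `None` for empty lines keeps the output aligned one-to-one with the input reads.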
Sequences file
Each line in the sequences file reports the read counts for one unique STR sequence. A line holds six comma-separated fields: the STR locus, the extracted sequence, the number of reads in forward orientation, the number of reads in reverse orientation, the total number of reads, and the allele name/number calculated from the sequence.
DYS389I,TCTGTCTGTCTGTCTATCTATCTATCTATCTATCTATCTATCTATCTATCTA,5780,0,5780,13
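The example line above can be parsed along these lines. Again this is a sketch rather than part of Fragsifier: `SequenceCount` and `parse_sequence_count` are hypothetical names, and the allele is kept as a string on the assumption that it is not always a plain integer.

```python
from typing import NamedTuple


class SequenceCount(NamedTuple):
    locus: str
    sequence: str
    forward_reads: int
    reverse_reads: int
    total_reads: int
    allele: str   # kept as text; assumed not always a whole number


def parse_sequence_count(line: str) -> SequenceCount:
    """Parse one comma-separated line of the sequences file."""
    locus, sequence, forward, reverse, total, allele = line.strip().split(",")
    return SequenceCount(
        locus, sequence, int(forward), int(reverse), int(total), allele
    )
```

For the example line, the forward and reverse counts (5780 and 0) add up to the reported total of 5780.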
Because Fragsifier uses repeat stretches to identify STRs, non-repeating markers in the input data (such as Amelogenin and SNPs) will not be detected.