mfiers/leapfrog
LeapFrog
========

A set of tools for the genomic localization of (flanking regions of) repetitive elements, based on read-pair information.

The analysis steps & data sets are:

* (1) input fastq

  The paired-end fastq files generated from the organism that you are interested in. This MUST be paired end!

* (2) reference fasta

  Contains the genome reference sequence.

* (3) element_database

  The database is a multi-fasta file of the elements that are to be located. The software expects the sequence headers to have the following format: `>NAME#FAMILY`

* (4) bowtie2db for the reference fasta

  A bowtie2 database based on (2).

* (5) bowtie2db for the element database

  A bowtie2 database based on (3).

* (6) get danglers

  The leapfrog script `lf_danglers` runs bowtie2 in the background and outputs a properly renamed fastq file containing the "danglers". A "dangler" is a read that does not map to the element database (3), but its paired-end mate does!

                          ____
                         /    \
                    =====      =====  <- dangler
      +========================+
      | A sequence from the    |
      | element database (3/5) |
      +========================+

  Check how to run the script using the `-h` parameter. This script takes as input the element bowtie2 database (5) and the input fastq (1).

* (7) map the danglers to the reference genome

  Run a regular bowtie2 job mapping the dangler sequences (6) against the reference genome (2/4).

* (8) extract PFRs

  Extract the PFRs from the BAM alignment from (7) using the script `lf_regionify`. This script needs to be executed for each genome/sample separately. The output is a GFF file identifying each PFR separately. The script splits PFRs based on family and orientation, and tries to separate peaks that are close together. A score is assigned to each PFR.

* (9) compare PFRs between genomes

  This script (`lf_findiff`) is still very experimental. It takes a number of input GFF PFR files, determines which PFRs overlap, and then reports presence/absence information.
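The `>NAME#FAMILY` header convention from (3) and the dangler definition from (6) can be illustrated with a small Python sketch. This is not the code used by `lf_danglers`; it only restates the two rules on toy data. The function names, the header `>Copia1#LTR`, and the read-pair ids are made up for illustration, and the sketch assumes you already know, for each read pair, whether each mate maps to the element database.

```python
from typing import Dict, List, NamedTuple, Tuple


class ElementHeader(NamedTuple):
    name: str
    family: str


def parse_element_header(header: str) -> ElementHeader:
    """Split a '>NAME#FAMILY' fasta header into name and family."""
    if not header.startswith(">"):
        raise ValueError(f"not a fasta header: {header!r}")
    # Everything before the first '#' is the name, the rest is the family.
    name, sep, family = header[1:].strip().partition("#")
    if not sep or not name or not family:
        raise ValueError(f"expected '>NAME#FAMILY', got: {header!r}")
    return ElementHeader(name, family)


def find_danglers(pair_hits: Dict[str, Tuple[bool, bool]]) -> List[Tuple[str, int]]:
    """Return (pair_id, mate_number) for every dangler.

    pair_hits maps a read-pair id to two booleans: whether mate 1 and
    mate 2 map to the element database. A dangler is the mate that does
    NOT map while its partner does.
    """
    danglers = []
    for pair_id, (mate1_maps, mate2_maps) in sorted(pair_hits.items()):
        if mate1_maps and not mate2_maps:
            danglers.append((pair_id, 2))
        elif mate2_maps and not mate1_maps:
            danglers.append((pair_id, 1))
        # Pairs where both or neither mate maps yield no dangler.
    return danglers


if __name__ == "__main__":
    print(parse_element_header(">Copia1#LTR"))
    toy_hits = {
        "pair1": (True, False),   # mate 2 is a dangler
        "pair2": (True, True),    # both mates map to an element
        "pair3": (False, False),  # neither mate maps to an element
        "pair4": (False, True),   # mate 1 is a dangler
    }
    print(find_danglers(toy_hits))
```

Pairs in which both mates map to an element, or neither does, give no dangler, which is why only `pair1` and `pair4` contribute reads in the toy data.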