This workflow uses RepeatModeler and RepeatMasker to identify and annotate repetitive elements in a genome.

RepeatModeler is a software package for de novo identification and modeling of transposable element (TE) families. At its core are three de novo repeat-finding programs (RECON, RepeatScout, and LTRharvest/LTR_retriever), which use complementary computational methods to identify repeat element boundaries and family relationships from sequence data.

RepeatMasker is a program that screens DNA sequences for interspersed repeats and low-complexity DNA sequences. Its output is a detailed annotation of the repeats present in the query sequence, as well as a modified version of the query sequence in which all annotated repeats have been masked.
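To illustrate what a masked version of the query sequence looks like, here is a minimal Python sketch. It is not RepeatMasker's implementation: the function name `mask_repeats` and the 0-based, half-open interval convention are assumptions made for this example. It shows the two common masking styles: replacing repeat bases with `N` (hard masking) or lowercasing them (soft masking).

```python
def mask_repeats(sequence, repeats, soft=False):
    """Return `sequence` with every annotated repeat interval masked.

    `repeats` is a list of (start, end) pairs, 0-based and half-open
    (an assumption of this sketch, not RepeatMasker's coordinate system).
    Hard masking writes 'N'; soft masking lowercases the bases.
    """
    chars = list(sequence)
    for start, end in repeats:
        # Clamp the interval so it never runs outside the sequence.
        for i in range(max(start, 0), min(end, len(chars))):
            chars[i] = chars[i].lower() if soft else "N"
    return "".join(chars)


print(mask_repeats("ACGTACGTACGT", [(2, 6)]))             # ACNNNNGTACGT
print(mask_repeats("ACGTACGTACGT", [(2, 6)], soft=True))  # ACgtacGTACGT
```

Soft masking keeps the original bases recoverable, which is why it is often preferred when the masked genome is passed on to gene prediction tools.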
- RepeatModeler requires a single input file: a genome in FASTA format.
  - Two output files are generated:
    - a summary file (.tbl)
    - a FASTA file containing alignments in order of appearance in the query sequence
- RepeatMasker requires the FASTA file generated by RepeatModeler.
  - Five output files are generated:
    - a FASTA file
    - a .gff3 file
    - a table summarizing the repeat content of the analyzed sequence
    - a file with statistics on the repeat content of the analyzed sequence
    - a summary of the mutation sites found and the order of grouping
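The two steps above can be sketched as a chain of command-line calls. This is only an illustration, assuming the tools are run directly from a shell rather than through a workflow platform. The helper names (`build_commands`, `run_pipeline`), the database name, and the thread count are hypothetical. The flags shown (`BuildDatabase -name`, `RepeatModeler -database`, `-threads`, `-LTRStruct`, `RepeatMasker -lib`, `-gff`, `-pa`) exist in recent releases, but older RepeatModeler versions use `-pa` instead of `-threads`, so check them against the installed version.

```python
import subprocess


def build_commands(genome, db_name="genome_db", threads=4):
    """Return the argv lists for the RepeatModeler -> RepeatMasker chain.

    `db_name` and `threads` are illustrative defaults, not workflow settings.
    """
    # RepeatModeler writes its consensus library to <db_name>-families.fa.
    families = f"{db_name}-families.fa"
    return [
        # 1. Build the sequence database that RepeatModeler works on.
        ["BuildDatabase", "-name", db_name, genome],
        # 2. Model repeat families de novo (-LTRStruct enables the LTR pipeline).
        ["RepeatModeler", "-database", db_name,
         "-threads", str(threads), "-LTRStruct"],
        # 3. Annotate and mask the genome using the library from step 2.
        ["RepeatMasker", "-lib", families, "-gff",
         "-pa", str(threads), genome],
    ]


def run_pipeline(genome, **kwargs):
    """Run each step in order, stopping at the first failure."""
    for cmd in build_commands(genome, **kwargs):
        subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Dry run: print the commands instead of executing them.
    for cmd in build_commands("genome.fa"):
        print(" ".join(cmd))
```

Keeping command construction separate from execution makes it possible to inspect or log the exact calls before starting a run, which is useful because RepeatModeler can take many hours on a large genome.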