RagTag is a collection of command-line utilities for improving modern genome assemblies. Tasks include:
- Homology-based sequence correction
- Homology-based sequence scaffolding
- Homology-based continuous scaffolding and gap-filling (patching)
- Scaffold merging
RagTag also provides a collection of command-line utilities for working with common genome assembly file formats.
# install with conda
conda install -c bioconda ragtag
# correct contigs
ragtag.py correct ref.fasta query.fasta
# scaffold contigs
ragtag.py scaffold ref.fa ragtag_output/query.corrected.fasta
# scaffold with multiple references
ragtag.py scaffold -o out_1 ref1.fasta query.fasta
ragtag.py scaffold -o out_2 ref2.fasta query.fasta
ragtag.py merge query.fasta out_*/*.agp
# use Hi-C to resolve conflicts
ragtag.py merge -b hic.bam query.fasta out_*/*.agp
# make joins and fill gaps in target.fa using sequences from query.fa
ragtag.py patch target.fa query.fa
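The multiple-reference example above can be generalized to any number of references with a small wrapper. The following is only a sketch: `scaffold_and_merge` and the `RUN` variable are hypothetical names introduced here and are not part of RagTag, and the `out_N` directory naming simply follows the example above. By default the function prints the commands instead of running them.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of RagTag): scaffold one query assembly
# against several references, then merge the resulting AGP files.
# Usage: scaffold_and_merge query.fasta ref1.fasta ref2.fasta ...
# By default the commands are only printed; set RUN="" to execute them.
scaffold_and_merge() {
    local query="$1"; shift
    local run="${RUN-echo}"   # dry-run unless RUN is set to an empty string
    local i=0 ref
    for ref in "$@"; do
        i=$((i + 1))
        # one output directory per reference: out_1, out_2, ...
        $run ragtag.py scaffold -o "out_${i}" "$ref" "$query"
    done
    # merge every AGP file produced by the scaffolding runs
    $run ragtag.py merge "$query" out_*/*.agp
}
```

Calling `scaffold_and_merge query.fasta ref1.fasta ref2.fasta` prints the two scaffold commands followed by the merge command. Running it as `RUN="" scaffold_and_merge query.fasta ref1.fasta ref2.fasta` executes them, assuming `ragtag.py` is on the `PATH`.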
Please see the Wiki for detailed documentation.
- Minimap2, Unimap, or Nucmer
- Python 3 (with the following auto-installed packages)
    - numpy
    - intervaltree
    - pysam
    - networkx
Alonge, Michael, et al. "RaGOO: fast and accurate reference-guided scaffolding of draft genomes." Genome Biology 20.1 (2019): 1-17.
https://doi.org/10.1186/s13059-019-1829-6
Many of the major algorithmic improvements relative to RaGOO's first release were provided by Aleksey Zimin, lead developer of the MaSuRCA assembler. Luca Venturini suggested and initially implemented many feature enhancements, such as pysam integration. RagTag "merge" was inspired by CAMSA. The developer of CAMSA, Sergey Aganezov, helped review relevant RagTag code. RagTag "patch" was inspired by Grafter, a scaffolding tool written by Melanie Kirsche. Melanie provided guidance for the RagTag implementation. Michael Schatz has provided guidance for the whole project.