You can see the usage options of GraphBin2 by typing ./graphbin2 -h
on the command line. For example,
Usage: graphbin2 [OPTIONS]
GraphBin2: Refined and Overlapped Binning of Metagenomic Contigs Using
Assembly Graphs GraphBin2 is a tool which refines the binning results
obtained from existing tools and, is able to assign contigs to multiple
bins. GraphBin2 uses the connectivity and coverage information from
assembly graphs to adjust existing binning results on contigs and to infer
contigs shared by multiple species.
Options:
--assembler [spades|sga|flye] name of the assembler used. (Supports SPAdes,
SGA and Flye) [required]
--graph PATH path to the assembly graph file [required]
--contigs PATH path to the contigs file [required]
--paths PATH path to the contigs.paths (metaSPAdes) or
assembly.info (metaFlye) file
--abundance PATH path to the abundance file [required]
--binned PATH path to the .csv file with the initial
binning output from an existing toole
[required]
--output PATH path to the output folder [required]
--prefix TEXT prefix for the output file
--depth INTEGER maximum depth for the breadth-first-search.
[default: 5]
--threshold FLOAT threshold for determining inconsistent
vertices. [default: 1.5]
--delimiter [,|;|$'\t'|" "] delimiter for output results. Supports a
comma (,), a semicolon (;), a tab ($'\t'), a
space (" ") and a pipe (|) . [default: ,]
--nthreads INTEGER number of threads to use. [default: 8]
-v, --version Show the version and exit.
--help Show this message and exit.
The SPAdes version of graphbin2
takes in 4 files as inputs (required).
- Contigs file (in
.fasta
format) - Assembly graph file (in
.gfa
format) - Paths of contigs (in
.paths
format) - Binning output from an existing tool (in
.csv
format)
The SGA version of graphbin2
takes in 4 files as inputs (required).
- Contigs file (in
.fasta
format) - Abundance file (tab separated file with contig ID and coverage in each line)
- Assembly graph file (in
.asqg
format) - Binning output from an existing tool (in
.csv
format)
The Flye version of graphbin2
takes in 4 files as inputs (required).
- Contigs file (in
.fasta
format) - Abundance file (tab separated file with contig ID and coverage in each line)
- Assembly graph file (in
.gfa
format) - Binning output from an existing tool (in
.csv
format)
Note: The abundance file (e.g., abundance.abund
) is a tab separated file with contig ID and the coverage for each contig in the assembly. metaSPAdes provides the coverage of each contig in the contig identifier of the final assembly. We can directly extract these values to create the abundance.abund file. However, no such information is provided for contigs produced by SGA. Hence, reads should be mapped back to the assembled contigs in order to determine the coverage of SGA contigs.
Note: Make sure that the initial binning result consists of contigs belonging to only one bin. GraphBin2 is designed to handle initial contigs which belong to only one bin.
Note: You can specify the delimiter for the initial binning result file and the final output file using the delimiter
paramter. Enter the following values for different delimiters; ,
for a comma, ;
for a semicolon, $'\t'
for a tab, " "
for a space and |
for a pipe.
graphbin2 --assembler spades --contigs /path/to/contigs.fasta --graph /path/to/graph_file.gfa --paths /path/to/paths_file.paths --binned /path/to/binning_result.csv --output /path/to/output_folder
graphbin2 --assembler sga --contigs /path/to/contigs.fa --abundance /path/to/abundance.tsv --graph /path/to/graph_file.asqg --binned /path/to/binning_result.csv --output /path/to/output_folder
graphbin2 --assembler flye --contigs /path/to/edges.fasta --abundance /path/to/abundance.tsv --graph /path/to/graph_file.gfa --binned /path/to/binning_result.csv --output /path/to/output_folder