Commit

updated documentations
(prepare for release)
lh3 committed Dec 21, 2014
1 parent c05a721 commit ed95769
Showing 3 changed files with 58 additions and 14 deletions.
20 changes: 11 additions & 9 deletions NEWS.md
@@ -11,25 +11,27 @@ For general uses, the single BWA binary still works like the old way.

Another major addition to BWA-MEM is HLA typing, which is made possible with the
new ALT mapping strategy. Necessary data and programs are included in the
-binary release. The wrapper script also performs HLA typing when HLA genes are
-included in the reference genome as additional ALT contigs.
+binary release. The wrapper script also optionally performs HLA typing when HLA
+genes are included in the reference genome as additional ALT contigs.

Other notable changes to BWA-MEM:

* Added option `-b` to `bwa index`. This option tunes the batch size used in
the construction of BWT. It is advised to use large `-b` for huge reference
-sequences such as the *nt* database.
+sequences such as the BLAST *nt* database.

-* Optimized for PacBio data. This includes a change to the scoring based on a
-mini-study done by Aaron Quinlan and a heuristic speedup. Further speedup is
+* Optimized for PacBio data. This includes a change to scoring based on a
+study done by Aaron Quinlan and a heuristic speedup. Further speedup is
possible, but needs more careful investigation.

-* Dropped PacBio read-to-read alignment for now. BWA-MEM is only good at
-finding the best hit, not all hits. Option `-x pbread` is still available,
-but hidden on the command line.
+* Dropped PacBio read-to-read alignment for now. BWA-MEM is good for finding
+the best hit, but is not very sensitive to suboptimal hits. Option `-x pbread`
+is still available, but hidden on the command line. This may be removed in
+future releases.

* Added a new pre-setting for Oxford Nanopore 2D reads. LAST is still a little
-more sensitive on bacterial data, but bwa-mem is times faster on human data.
+more sensitive on older bacterial data, but bwa-mem is as good on more
+recent data and is times faster for mapping against mammalian genomes.

* Added LAST-like seeding. This improves the accuracy for longer reads.

50 changes: 46 additions & 4 deletions bwa.1
@@ -1,4 +1,4 @@
-.TH bwa 1 "18 November 2014" "bwa-0.7.11-r999" "Bioinformatics tools"
+.TH bwa 1 "21 December 2014" "bwa-0.7.11-r1032" "Bioinformatics tools"
.SH NAME
.PP
bwa - Burrows-Wheeler Alignment Tool
@@ -75,7 +75,7 @@ appropriate algorithm will be chosen automatically.
.TP
.B mem
.B bwa mem
-.RB [ -aCHMpP ]
+.RB [ -aCHjMpP ]
.RB [ -t
.IR nThreads ]
.RB [ -k
@@ -88,6 +88,12 @@ appropriate algorithm will be chosen automatically.
.IR seedSplitRatio ]
.RB [ -c
.IR maxOcc ]
+.RB [ -D
+.IR chainShadow ]
+.RB [ -m
+.IR maxMateSW ]
+.RB [ -W
+.IR minSeedMatch ]
.RB [ -A
.IR matchScore ]
.RB [ -B
@@ -102,6 +108,8 @@ appropriate algorithm will be chosen automatically.
.IR unpairPen ]
.RB [ -R
.IR RGline ]
+.RB [ -H
+.IR HDlines ]
.RB [ -v
.IR verboseLevel ]
.I db.prefix
@@ -193,9 +201,28 @@ Discard a MEM if it has more than
.I INT
occurrences in the genome. This is an insensitive parameter. [500]
.TP
+.BI -D \ FLOAT
+Drop chains shorter than
+.I FLOAT
+fraction of the longest overlapping chain [0.5]
+.TP
+.BI -m \ INT
+Perform at most
+.I INT
+rounds of mate-SW [50]
+.TP
+.BI -W \ INT
+Drop a chain if the number of bases in seeds is smaller than
+.IR INT .
+This option is primarily used for longer contigs/reads. When positive, it also
+affects seed filtering. [0]
.TP
.B -P
In the paired-end mode, perform SW to rescue missing hits only but do not try to find
hits that fit a proper pair.
+
+.TP
+.B SCORING OPTIONS:
+.TP
.BI -A \ INT
Matching score. [1]
@@ -244,15 +271,30 @@ and will be converted to a TAB in the output SAM. The read group ID will be
attached to every read in the output. An example is '@RG\\tID:foo\\tSM:bar'.
[null]
.TP
+.BI -H \ ARG
+If ARG starts with @, it is interpreted as a string and gets inserted into the
+output SAM header; otherwise, ARG is interpreted as a file with all lines
+starting with @ in the file inserted into the SAM header. [null]
+.TP
.BI -T \ INT
Don't output alignment with score lower than
.IR INT .
This option affects output and occasionally SAM flag 2. [30]
.TP
-.BI -h \ INT
+.BI -j
+Treat ALT contigs as part of the primary assembly (i.e. ignore the
+.I db.prefix.alt
+file).
+.TP
+.BI -h \ INT[,INT2]
If a query has not more than
.I INT
-hits with score higher than 80% of the best hit, output them all in the XA tag [5]
+hits with score higher than 80% of the best hit, output them all in the XA tag.
+If
+.I INT2
+is specified, BWA-MEM outputs up to
+.I INT2
+hits if the list contains a hit to an ALT contig. [5,200]
.TP
.B -a
Output all found alignments for single-end or unpaired paired-end reads. These
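
The `INT[,INT2]` form documented for `-h` in this hunk can be parsed in a few lines of C. The following is a minimal sketch, not the code from the bwa sources: the function name `parse_hit_limits` and its error convention are assumptions made for illustration. It leaves the second limit untouched when `INT2` is omitted, which may differ from what bwa itself does.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical parser for an "INT[,INT2]" option value (illustration only,
 * not taken from bwa). Returns 0 on success and -1 on malformed input.
 * *max_hits is written as soon as the first number parses; *alt_hits is
 * written only when a valid ",INT2" suffix is present. */
int parse_hit_limits(const char *arg, int *max_hits, int *alt_hits)
{
    char *end;
    long v = strtol(arg, &end, 10);
    if (end == arg || v < 0) return -1;      /* no digits, or negative */
    *max_hits = (int)v;
    if (*end == '\0') return 0;              /* plain "INT" */
    if (*end != ',') return -1;              /* unexpected separator */

    const char *second = end + 1;
    v = strtol(second, &end, 10);
    if (end == second || *end != '\0' || v < 0) return -1;
    *alt_hits = (int)v;
    return 0;
}
```

A caller would start from the documented defaults (5 and 200) and pass the option argument, so that `"10"` changes only the first limit while `"3,50"` changes both.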
2 changes: 1 addition & 1 deletion fastmap.c
@@ -268,7 +268,7 @@ int main_mem(int argc, char *argv[])
fprintf(stderr, " -p smart pairing (ignoring in2.fq)\n");
fprintf(stderr, " -R STR read group header line such as '@RG\\tID:foo\\tSM:bar' [null]\n");
fprintf(stderr, " -H STR/FILE insert STR to header if it starts with @; or insert lines in FILE [null]\n");
-fprintf(stderr, " -j ignore ALT contigs\n");
+fprintf(stderr, " -j treat ALT contigs as part of the primary assembly (i.e. ignore <idxbase>.alt file)\n");
fprintf(stderr, "\n");
fprintf(stderr, " -v INT verbose level: 1=error, 2=warning, 3=message, 4+=debugging [%d]\n", bwa_verbose);
fprintf(stderr, " -T INT minimum score to output [%d]\n", opt->T);
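
The `-H STR/FILE` behaviour described in the usage text and in the man page comes down to one test on the first character, plus a filter on file lines. The sketch below shows that logic under stated assumptions: both helper names are hypothetical, nothing here is copied from `fastmap.c`, and the file contents are passed in as a string so the example stays self-contained.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not from bwa): returns 1 when a -H argument should
 * be treated as a literal header string, 0 when it should be treated as
 * the name of a file. */
int header_arg_is_literal(const char *arg)
{
    return arg != NULL && arg[0] == '@';
}

/* Hypothetical helper (not from bwa): copy only the lines of `text` that
 * start with '@' into a newly allocated string, each terminated by '\n'.
 * Returns NULL if allocation fails; the caller frees the result. */
char *keep_header_lines(const char *text)
{
    size_t len = strlen(text);
    char *out = malloc(len + 2);   /* room for one added '\n' and the NUL */
    size_t n = 0;
    const char *p = text;

    if (out == NULL) return NULL;
    while (*p) {
        const char *eol = strchr(p, '\n');
        size_t l = eol ? (size_t)(eol - p) : strlen(p);
        if (l > 0 && p[0] == '@') {          /* keep header lines only */
            memcpy(out + n, p, l);
            n += l;
            out[n++] = '\n';
        }
        p += l;
        if (*p == '\n') ++p;                 /* step over the line break */
    }
    out[n] = '\0';
    return out;
}
```

In this sketch a literal argument such as `@CO\tnote` would be inserted into the header as given, while any other argument would be opened as a file and its contents passed through `keep_header_lines`.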