Skip to content

Commit

Permalink
Force full-length ivar consensus builds (galaxyproject#5115)
Browse files Browse the repository at this point in the history
* Force full-length ivar consensus builds

This requires the -a option of samtools mpileup or zero-coverage regions
at the ref sequence ends will silently be dropped and a truncated
consensus be produced.

* More fixes to the consensus tool

- Protect params against integer overflow
- Make "-n -" option work and simplify choice of min depth threshold action
  • Loading branch information
wm75 authored Feb 10, 2023
1 parent 381e8b8 commit fc9887d
Showing 1 changed file with 24 additions and 20 deletions.
44 changes: 24 additions & 20 deletions tools/ivar/ivar_consensus.xml
Original file line number Diff line number Diff line change
@@ -1,4 +1,4 @@
<tool id="ivar_consensus" name="ivar consensus" version="@TOOL_VERSION@+galaxy1" profile="@PROFILE@">
<tool id="ivar_consensus" name="ivar consensus" version="@TOOL_VERSION@+galaxy2" profile="@PROFILE@">
<description>Call consensus from aligned BAM file</description>
<macros>
<import>macros.xml</import>
Expand All @@ -9,21 +9,20 @@
#import re
#set $clean_name = re.sub('[^\w\-]', '_', str($input_bam.element_identifier))
ln -s '$input_bam' sorted.bam &&
samtools mpileup -A -d 0 -Q 0 sorted.bam | ivar consensus
samtools mpileup -A -a -d 0 -Q 0 sorted.bam | ivar consensus
-p consensus
-q $min_qual
-t $min_freq
-m $min_depth
$filter_depth
#if $gap
-n N
#end if
$depth_action
&&
sed -i "s|consensus|$clean_name|" consensus.fa
]]> </command>
]]></command>
<inputs>
<param name="input_bam" type="data" format="bam" label="Bam file" help="Aligned reads, to trim primers and quality"/>
<param name="min_qual" argument="-q" type="integer" min="0" value="20" label="Minimum quality score threshold to count base"/>
<!-- Warning: integer params in the following define a max of 255 intentionally
because the underlying C++ code of ivar defines them as uint8 and does not check for overflow! -->
<param name="min_qual" argument="-q" type="integer" min="0" max="255" value="20" label="Minimum quality score threshold to count base"/>
<param name="min_freq" argument="-t" type="float" min="0" max="1" value="0.0" label="Minimum frequency threshold">
<help>
<![CDATA[
Expand All @@ -35,17 +34,19 @@
]]>
</help>
</param>
<param name="min_depth" argument="-m" type="integer" min="1" value="10" label="Minimum depth to call consensus"/>
<param name="filter_depth" argument="-k" type="boolean" truevalue="-k" falsevalue="" checked="false" label="Exclude regions with smaller depth than the minimum threshold"/>
<param name="gap" argument="-n" type="boolean" truevalue="true" falsevalue="false" checked="false" label="Use N instead of - for regions with less than minimum coverage"/>
<param name="min_depth" argument="-m" type="integer" min="1" max="255" value="10" label="Minimum depth to call consensus"/>
<param name="depth_action" type="select" label="How to represent positions with coverage less than the minimum depth threshold">
<option value="-k">Drop from output (-k)</option>
<option value="-n N" selected="true">Represent as N (-n N)</option>
<option value="-n -">Represent as - (-n -)</option>
</param>
</inputs>
<outputs>
<data name="consensus" format="fasta" label="${tool.name} on ${on_string} Consensus" from_work_dir="consensus.fa"/>
</outputs>
<tests>
<test>
<param name="input_bam" value="covid19/PC00101P_sub.trimmed.sorted.bam" />
<param name="gap" value="true" />
<output name="consensus" file="covid19/PC00101P_sub.fa" ftype="fasta" compare="contains" lines_diff="1"/>
</test>
</tests>
Expand All @@ -55,20 +56,23 @@
To generate a consensus sequence iVar uses the output of samtools mpileup
command. The mpileup output must be piped into ivar consensus
The command for this wrapper is from https://github.com/andersen-lab/ivar/blob/master/pipeline_consensus/Snakefile :
The command formed by this wrapper is :
samtools mpileup -A -d 0 -Q 0 sorted.bam | ivar consensus [options]
samtools mpileup -A -a -d 0 -Q 0 sorted.bam | ivar consensus [options]
There are five parameters that can be set:
There are four parameters that can be set:
- Minimum quality (Default: 20): the minimum quality of a base to be considered in calculations of variant frequencies at a given position
- **Minimum quality**: the minimum quality of a base to be considered in calculations of variant frequencies at a given position
- Minimum frequency threshold (Default: 0): the minimum frequency that a base must match to be called as the consensus base at a position.
- **Minimum frequency threshold**: the minimum frequency that the most likely base must surpass to be called as the consensus base at a position.
- Minimum depth to call a consensus (Default: 1): the minimum required depth to call a consensus
- **Minimum depth to call consensus**: the minimum required depth to call a consensus base
- **How to represent positions with coverage less than the minimum depth threshold**: for positions for which the above minimum depth to call a consensus base is not reached, you can choose one of three different actions:
- Filter depth is a flag to exclude nucleotides from regions with depth less than the minimum depth and a character to call in regions with coverage lower than the speicifed minimum depth(Default: '-'). If this flag is set then these regions are not included in the consensus sequence. If it is not set then by default, a '-' is called in these regions.
- You can also specfy which character you want to add to the consensus to cover regions with depth less than the minimum depth. This can be done using gap option. It takes of two values: '-' or 'N'.
- Drop the position from the output entirely (-> the consensus sequence can become shorter than the reference used to produce the input BAM!)
- Use an ``N``, or
- Use a ``-`` to represent the position
]]> </help>
<expand macro="citations" />
</tool>

0 comments on commit fc9887d

Please sign in to comment.