incorrect alignment of long contigs / problem with bandwidth tracking? #89

Open
pmarks opened this issue Sep 22, 2016 · 2 comments
pmarks commented Sep 22, 2016

I'm aligning long assembled contigs (from Supernova) to hg19. Occasionally bwa mem will generate long runs of bad alignments with many long indels, for sequence that actually matches well.
From tracing through some examples, it appears that alignments merged by mem_patch_reg will sometimes not be subsequently processed by bwa_gen_cigar2 with a large enough bandwidth to generate a correct alignment. I can work around the issue by raising PATCH_MIN_SC_RATIO to 0.95 (https://github.com/lh3/bwa/blob/master/bwamem.c#L404), but I imagine there's a better solution.
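
To make the suspected failure mode concrete, here is a toy sketch of banded global alignment. It is not bwa's code: `banded_edit_distance` and `make_example` are hypothetical helpers, unit edit costs stand in for bwa's scoring, and the sequences are random. The query differs from the reference by one 6-base insertion and one 6-base deletion, so the lengths match but the true path strays 6 diagonals from the main one. A band narrower than that excursion forces the shared middle segment to be aligned at the wrong offset.

```python
import random


def banded_edit_distance(ref, query, w):
    """Edit distance restricted to DP cells with |i - j| <= w.

    Returns None when the end cell (len(ref), len(query)) lies outside the band.
    """
    n, m = len(ref), len(query)
    if abs(n - m) > w:
        return None
    inf = float("inf")
    # Row 0: an empty ref prefix against query[:j] costs j insertions.
    prev = {j: j for j in range(min(m, w) + 1)}
    for i in range(1, n + 1):
        cur = {}
        for j in range(max(0, i - w), min(m, i + w) + 1):
            if j == 0:
                cur[j] = i  # ref[:i] against an empty query prefix
                continue
            sub = prev.get(j - 1, inf) + (ref[i - 1] != query[j - 1])
            dele = prev.get(j, inf) + 1    # ref base with no query partner
            ins = cur.get(j - 1, inf) + 1  # query base with no ref partner
            cur[j] = min(sub, dele, ins)
        prev = cur
    return prev[m]


def make_example(seed=0):
    """Build a ref/query pair that differ by one insertion and one deletion."""
    rng = random.Random(seed)

    def rand(k):
        return "".join(rng.choice("ACGT") for _ in range(k))

    left, mid, right = rand(50), rand(200), rand(50)
    ref = left + mid + rand(6) + right    # 6 extra bases after the shared middle
    query = left + rand(6) + mid + right  # 6 extra bases before the shared middle
    return ref, query


ref, query = make_example()
print("w=2:", banded_edit_distance(ref, query, 2))
print("w=6:", banded_edit_distance(ref, query, 6))
```

With `w=6` the band contains the true path, so the distance is at most 12 (the two indels). With `w=2` the distance should come out far larger, which is the toy analogue of a run of spurious indels over sequence that actually matches well.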

I attached an example sequence which triggers the problem when aligned to hg19, on the latest bwa mem code. You can see the issue at chr2:122,075,076-122,087,607.

Does not show the problem:
bwa mem -v 5 -x intractg -w 500 -d 200 /mnt/opt/refdata_new/hg19-2.0.0/fasta/genome.fa 4535_slice.fasta

Shows the problem:
bwa mem -v 5 -x intractg -w 600 -d 200 /mnt/opt/refdata_new/hg19-2.0.0/fasta/genome.fa 4535_slice.fasta
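
One way to compare the two runs without opening IGV is to count long indels per SAM record. The sketch below is only an illustration: `long_indels` and `summarize_sam` are hypothetical helper names, and the 50 bp threshold is an arbitrary assumption. It relies only on the standard SAM column order (QNAME, FLAG, RNAME, POS, MAPQ, CIGAR).

```python
import re

# One CIGAR element: a length followed by an operation code.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def long_indels(cigar, min_len=50):
    """Return (op, length) for every insertion or deletion of at least min_len bases."""
    return [
        (op, int(length))
        for length, op in CIGAR_OP.findall(cigar)
        if op in "ID" and int(length) >= min_len
    ]


def summarize_sam(lines, min_len=50):
    """Return (qname, rname, pos, n_long_indels) for each aligned SAM record."""
    rows = []
    for line in lines:
        if line.startswith("@"):
            continue  # header line
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 6 or fields[5] == "*":
            continue  # malformed or unaligned record
        rows.append(
            (fields[0], fields[2], int(fields[3]), len(long_indels(fields[5], min_len)))
        )
    return rows
```

Saving the output of each command to a file and passing the open file handle to `summarize_sam` would give per-record counts that can be compared between the `-w 500` and `-w 600` runs.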

4535_slice.fasta.txt

IGV tracks of the alignments resulting from the above commands are shown here:
image

Thanks!
Pat Marks

pmarks added a commit to pmarks/bwa that referenced this issue Mar 3, 2017
…ong block of incorrect alignments.

(see lh3#89 for details).  Mitigate the issue by expanding the bandwidth more
aggressively when merging nearby chains.
@mdkeehan

Aligning contigs from multiple Supernova de novo assemblies to a reference assembly is a use case I have as well.
Contigs can be megabases in size, and I would like more global alignment behaviour so that we can discover regions of high divergence between the Supernova assemblies and the reference assembly.
We have large-memory machines, so could we crank up -w and -d to, say, 10,000?

Some discussion on appropriate settings for bwa mem would be highly appreciated.

@lh3
Owner

lh3 commented Aug 14, 2017

Use minimap2.
