Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Multimapping and reference sequence junctions #379

Open
s-a-nersisyan opened this issue Nov 1, 2022 · 0 comments
Open

Multimapping and reference sequence junctions #379

s-a-nersisyan opened this issue Nov 1, 2022 · 0 comments

Comments

@s-a-nersisyan
Copy link

Hello,

I've experienced an issue which could be IMHO treated as a bug. As documented, bwa concatenates all sequences from the reference fasta file, and some reads could map to the junction of two reference sequences – in this case the read is marked as unmapped (SAM flag = 4), though it still reports the position.

The problem arises with multiple mappers. As documented, bwa randomly selects one of the multiple alignments as the “primary” one, leaving the others in XA tag. The trick is that sometimes the “primary” random chosen position is exactly the undesired junction, and at the same time some good alignments are marked as secondary and stored in XA. In this case we think that the read is unmapped (flag = 4), though it actually could be mapped (sometimes even without errors). Moreover, if we have the same molecule present multiple times in fastq file, some of the reads could be mapped, while other (exactly same) reads will be unmapped – this is a random process. The only solution I invented to tackle this problem is to add poly-N sequences to the ends of each sequence in fasta file.

I understand that this bug is very-very unlikely in the most of real-world setups. However, I massively experienced this while trying to align short reads to Y RNA sequences: there are lot of repeats, and a lot of reads mapped to the Y RNA molecule junctions.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant