Having too many short contigs #1133

Open
1 task done
chtsai0105 opened this issue May 5, 2023 · 1 comment

chtsai0105 commented May 5, 2023

Description of bug

I launched a job two weeks ago and it finally completed today. The input is a soil metagenome sample that was preprocessed with fastp/bfc (I used bfc instead of the built-in BayesHammer because I got stuck at "subclustering Hamming graph", as in #703) and still has 100M paired reads after trimming.

This time SPAdes was stuck at "Running Bulge remover" for a very long time. After it finished today, I counted the contigs with `grep \> contigs.fasta | wc -l` and there are 11,955,485 contigs! I haven't checked the length distribution yet, but I noticed there are tons of contigs that are only 55 bp long. My guess is that writing those short contigs creates an I/O bottleneck that slows down the entire process, but there may be another reason for the slowdown.
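For reference, the length distribution can be tallied in one streaming pass without loading the whole file into memory. This is only a minimal sketch, assuming `contigs.fasta` is a plain (uncompressed) FASTA file; the helper names `iter_contig_lengths` and `length_counts` are made up for illustration and are not part of SPAdes.

```python
import io
from collections import Counter
from typing import Iterator, TextIO


def iter_contig_lengths(handle: TextIO) -> Iterator[int]:
    """Yield the sequence length of each FASTA record, reading line by line."""
    length = None  # stays None until the first '>' header is seen
    for line in handle:
        line = line.strip()
        if line.startswith(">"):
            if length is not None:
                yield length  # finish the previous record
            length = 0
        elif length is not None:
            length += len(line)  # sequences may span several lines
    if length is not None:
        yield length  # last record in the file


def length_counts(handle: TextIO) -> Counter:
    """Map each contig length to the number of contigs with that length."""
    return Counter(iter_contig_lengths(handle))


# Tiny in-memory demo; replace with open("contigs.fasta") for a real run.
demo = io.StringIO(">c1\nACGT\nAC\n>c2\nACG\n>c3\nACG\n")
counts = length_counts(demo)
print("contigs:", sum(counts.values()))
for length, n in sorted(counts.items()):
    print(f"{length} bp: {n}")
```

On the real output this would be called as `with open("contigs.fasta") as fh: counts = length_counts(fh)`, after which `counts[55]` gives the number of 55 bp contigs.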

I'm also wondering whether it is possible to integrate a minimum-length filter to prevent all these small contigs from being reported. If it is truly an I/O-bound issue, that might improve the speed.
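As a stopgap, short contigs can be dropped after the assembly has finished. The sketch below is a post-processing filter only: it is not a SPAdes option and would not change the assembly runtime. The function names and the `min_len` cutoff are hypothetical, and sequences are written on a single line rather than wrapped.

```python
import io
from typing import Iterator, TextIO, Tuple


def read_fasta(handle: TextIO) -> Iterator[Tuple[str, str]]:
    """Yield (header_line, sequence) pairs from a FASTA stream."""
    header, chunks = None, []
    for line in handle:
        line = line.strip()
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line, []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def filter_min_length(src: TextIO, dst: TextIO, min_len: int) -> int:
    """Copy records with at least min_len bases to dst; return how many were kept."""
    kept = 0
    for header, seq in read_fasta(src):
        if len(seq) >= min_len:
            dst.write(f"{header}\n{seq}\n")  # unwrapped, one line per sequence
            kept += 1
    return kept


# Tiny in-memory demo; use real file handles for contigs.fasta.
src = io.StringIO(">short\nACGT\n>long\nACGTACGT\nACGT\n")
dst = io.StringIO()
print("kept:", filter_min_length(src, dst, min_len=10))
```

With real files this would look like `filter_min_length(open("contigs.fasta"), open("contigs.min1000.fasta", "w"), 1000)`, where the output file name and the 1000 bp cutoff are just example values.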

spades.log

params.txt

SPAdes version

3.15.5

Operating System

Linux-4.18.0-348.12.2.el8_5.x86_64-x86_64-with-glibc2.28

Python Version

3.11.0

Method of SPAdes installation

conda

No errors reported in spades.log

  • Yes

chtsai0105 (Author) commented

Maybe I was not totally right. There are tons of contigs shorter than 1000 bp, especially around 125 bp.
image
