Having too many short contigs #1133

chtsai0105 · 2023-05-05T00:03:22Z

Description of bug

I launched a job two weeks ago and it finally completed today. The input is a soil metagenome sample that was preprocessed with fastp/bfc (I used bfc instead of the built-in BayesHammer because I got stuck at "subclustering Hamming graph", as in #703) and still has 100M paired reads after trimming.

This time SPAdes was stuck at "Running Bulge remover" for a very long time. After it finished today I counted the contigs with grep \> contigs.fasta | wc -l and there are 11,955,485 contigs! I haven't checked the length distribution yet, but I noticed there are tons of contigs only 55 bp in length. My guess is that writing those short contigs creates an I/O bottleneck that slows down the entire process, but there may be another reason for the slowdown.

I'm also wondering whether it is possible to integrate a minimum-length filter so that these small contigs are not reported. If this is truly an I/O-bound issue, that might improve the speed.
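To make the request concrete, here is a minimal sketch of the kind of filter I mean, applied as a post-processing step on contigs.fasta. It is not a SPAdes option and would not speed up the assembly itself; the function names, the 1000 bp default, and the output file name are only assumptions for illustration.

```python
import sys


def read_fasta(handle):
    """Yield (header, sequence) pairs from an open FASTA file handle."""
    header, chunks = None, []
    for line in handle:
        line = line.strip()
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None and line:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def filter_min_length(in_handle, out_handle, min_len=1000, width=60):
    """Write only contigs of at least min_len bp; return how many were kept.

    min_len=1000 is an arbitrary illustrative cutoff, not a SPAdes default.
    """
    kept = 0
    for header, seq in read_fasta(in_handle):
        if len(seq) >= min_len:
            out_handle.write(f">{header}\n")
            # Wrap the sequence at a fixed line width.
            for i in range(0, len(seq), width):
                out_handle.write(seq[i:i + width] + "\n")
            kept += 1
    return kept


if __name__ == "__main__" and len(sys.argv) == 3:
    # Hypothetical usage: python filter_contigs.py contigs.fasta contigs.min1000.fasta
    with open(sys.argv[1]) as src, open(sys.argv[2], "w") as dst:
        print(filter_min_length(src, dst), "contigs kept")
```

Streaming record by record keeps memory use flat even with millions of contigs.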

spades.log

params.txt

SPAdes version

3.15.5

Operating System

Linux-4.18.0-348.12.2.el8_5.x86_64-x86_64-with-glibc2.28

Python Version

3.11.0

Method of SPAdes installation

conda

No errors reported in spades.log

Yes

chtsai0105 · 2023-05-05T00:25:05Z

My earlier description was not quite right. There are tons of contigs shorter than 1000 bp, with a concentration around 125 bp.
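For anyone who wants to check the same thing on their own output, a length distribution like this can be tallied with a short script. This is only a sketch: the function names and the 100 bp bin size are my own choices, and lengths are computed from the sequence lines rather than parsed from the contig headers.

```python
import sys
from collections import Counter


def contig_lengths(handle):
    """Yield the length of each record in an open FASTA file handle."""
    length = None
    for line in handle:
        if line.startswith(">"):
            if length is not None:
                yield length
            length = 0
        elif length is not None:
            length += len(line.strip())
    if length is not None:
        yield length


def length_histogram(lengths, bin_size=100):
    """Count contigs per length bin; keys are the lower edge of each bin."""
    counts = Counter((n // bin_size) * bin_size for n in lengths)
    return dict(sorted(counts.items()))


if __name__ == "__main__" and len(sys.argv) == 2:
    # Hypothetical usage: python contig_hist.py contigs.fasta
    with open(sys.argv[1]) as fasta:
        for lower, count in length_histogram(contig_lengths(fasta)).items():
            print(f"{lower}-{lower + 99} bp\t{count}")
```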
