DIA-NN runs fast or slow, depending on the run #1282

Open
L3ft2di3 opened this issue Nov 27, 2024 · 9 comments
Comments

@L3ft2di3

Dear DIA-NN Team,
Once again, a tremendous amount of work has gone into building and improving DIA-NN.
We recently started using a new Windows machine for DIA-NN processing. It is relatively powerful, with an i9-14900K, 128 GB of DDR5 RAM and NVMe SSDs. Overall we are very pleased with the analysis speed, which in most cases exceeds that of our old system by a factor of 5. We have recently upgraded to version 1.9.2. The data is acquired on a timsTOF-HT.
However, we have observed VERY long processing times for some runs and can't find a reason. A library-free analysis of 64 files with a previously predicted library is still running. During the first 40 minutes, 3/64 runs were processed in the first pass, but run 4 took 2080 minutes to process. Its number of protein IDs is comparable to the other runs. Run 5 took 532 min, run 6 took 12 min, run 7 took 153 min, and run 8 took 13 min.
All the necessary files are stored locally on the machine, so nothing needs to be pulled over the network. I have monitored the CPU temperature and clock speeds, and under load they seem to be relatively stable.
We have processed this type of data on the machine before and have never observed this variance in processing time (at least when the numbers of protein IDs were comparable). Do you have any idea what might be causing this problem?

I have attached the current state of the log text.
DIANN_log_20241107.txt

Thanks in advance for any response!
Cheers,
Thorben
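
To put the variance described above in numbers, here is a minimal Python sketch that compares each reported first-pass time against a baseline. The durations are the ones quoted in the post for runs 4 to 8. The baseline of 40/3 minutes per run is an assumption derived from "3/64 runs in the first 40 minutes", and the function name `flag_slow_runs` and the 3x threshold are purely illustrative. None of this is part of DIA-NN.

```python
# First-pass processing times in minutes, as quoted in the post for runs 4-8.
run_minutes = {4: 2080, 5: 532, 6: 12, 7: 153, 8: 13}

# Assumed baseline: the first three runs finished within 40 minutes in total.
baseline_minutes = 40 / 3


def flag_slow_runs(times, baseline, factor=3.0):
    """Return {run: slowdown ratio} for runs slower than factor x baseline."""
    return {
        run: minutes / baseline
        for run, minutes in times.items()
        if minutes > factor * baseline
    }


slow = flag_slow_runs(run_minutes, baseline_minutes)
for run, ratio in sorted(slow.items()):
    print(f"run {run}: about {ratio:.0f}x the baseline")
```

Under these assumptions, runs 4, 5 and 7 are flagged, while runs 6 and 8 sit at the baseline.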

@vdemichev
Owner

vdemichev commented Nov 27, 2024

Hi Thorben,

This is almost certainly the OS swapping RAM to disk, although I am surprised to see this kind of slowdown given it's an SSD. It looks like a severe lack of memory, although this is not expected given 128 GB. Could it be that something else was running in parallel? If you restart the analysis and monitor RAM usage with Task Manager or https://learn.microsoft.com/en-us/sysinternals/downloads/process-explorer, does it slow down on those files again? If this is consistent, would you please be able to upload the .speclib and the .d folder that causes the problem somewhere, so that I can take a look?

Best,
Vadim

@L3ft2di3
Author

L3ft2di3 commented Nov 27, 2024

Thanks for the swift response!
Only a temperature monitoring tool (NZXT CAM) was running in parallel, and it barely used any resources.
I will restart the analysis and monitor the RAM usage closely. If the inconsistency persists, I will happily share the files with you.

Best,
Thorben

@L3ft2di3
Author

Hi Vadim,

Oddly, the processing of run 4 now took about 14 min, and run 5 took 12 min. RAM usage did not exceed 29 GB. I'll let DIA-NN do its thing for the rest of the runs and will report back on how it went.

Is this relatively low RAM usage common? I would've expected a much higher utilization.

Best,
Thorben

@vdemichev
Owner

These are expected values for this kind of analysis, so all seems fine.

@L3ft2di3
Author

L3ft2di3 commented Dec 9, 2024

Dear Vadim,
The analysis has finished by now. Again, we observed very long analysis times for some files, while others went very quickly.
Oddly, the files that took long last time went quickly this time. The log file is attached to this mail. I tried to keep an eye on RAM usage, but never observed usage above 35 GB.
The CPU was quite hot (around 90 °C) at times, but clock rates were relatively stable.
Are you interested in checking out some raw files, even though the issue does not seem to be file-dependent?

All the best,
Thorben
20241122_SingleCell_Clara_report.log.txt

@vdemichev
Owner

Hi Thorben,

Yes, this is pretty strange. If it happens for a given file in one analysis but not in another, then, since DIA-NN does exactly the same thing each time, something must be slowing it down. I wonder whether CPU usage for DIA-NN (as shown in Process Explorer or Task Manager) stays at full load when it is in the middle of such a 'slow file' analysis.

Best,
Vadim

@L3ft2di3
Author

L3ft2di3 commented Dec 9, 2024

Hey Vadim,

Every time I checked, the CPU utilization was pretty high, with total usage between 70% and 100%. I think I could push the threads setting higher: I had it at 24, but I think the CPU has 32 threads available. But I guess this still would not explain the issue, right?

Best, Thorben

@vdemichev
Owner

Maybe it is worth checking with 1.9.2 (i.e. using it for all new analyses). If it still has this problem, would you please then share one of the runs and the predicted library, so that I can see whether this is reproducible on my PC?

@L3ft2di3
Author

L3ft2di3 commented Dec 9, 2024

Sure, I will try 1.9.2 as soon as the PC is free again. In the meantime we are processing other files to see if the issue persists with them too, although still on 1.9.1. I'll let you know if we still encounter this, and then I will also share some files.
All the best,
Thorben
