Custom PTM in silico library generation #1207

Open
Shinya-Watanabe opened this issue Oct 10, 2024 · 14 comments

@Shinya-Watanabe

Hi Vadim,

I have been using DIA-NN through FragPipe to analyze custom PTMs (one is in UniMod, the other is not). I managed to run it, but PTMProphet has to be disabled because it has an issue with custom PTMs on the protein C-terminus (Nesvilab/FragPipe#1815). I assume that disabling PTMProphet is why the report files have no PTM localization probability columns. Also, DIA-NN in FragPipe may not be able to run a library-free search, so I tried to use the DIA-NN GUI to generate an in silico library with custom PTMs. It does not generate the library unless deep-learning-based prediction is on, but I read somewhere in the issues that it does not support custom PTMs.

  1. Can you tell me how to generate an in silico spectral library?
  2. When I analyze .d files using that library, which check boxes do I need to turn off?

10_10_2024 09_47_01.txt
Screenshot 2024-10-10 103959

@vdemichev
Owner

vdemichev commented Oct 10, 2024

Hi,

It does not generate the library unless deep-learning-based prediction is on

Yes, it must always be on when generating an in silico lib.

but I read somewhere in the issues that it does not support custom PTMs

The performance is suboptimal with PTMs it has not been trained on. It still works though.

When I analyze .d files using that library, what check boxes do I need to turn off?

MBR & 'Generate spectral library' should be selected, everything else should be default.

Btw, please note that --var-mod is used with incorrect syntax in the screenshot: DIA-NN will not understand the meaning of '*c'.

Best,
Vadim

@Shinya-Watanabe
Author

Thank you, Vadim.

I will try that. Can you give me the correct syntax for a protein C-terminal modification? I modified the example from the documentation, "--var-mod UniMod:1,42.010565,*n". Also, I cannot find the syntax for the command-line options: there is a list of the options, but I do not know what arguments each one accepts.

Best,
Shinya

@vdemichev
Owner

vdemichev commented Oct 10, 2024

for protein c terminal modification?

Not supported in DIA-NN at the moment unfortunately. Thank you for pointing this out, I've added this to the todo list.

@Shinya-Watanabe
Author

Thank you! Looking forward to it.

If that is the case, my data look strange. When I used FragPipe for this, I got peptides modified at the protein C-terminus (please see the attached file; I filtered "Modified.Sequence" for entries ending with "734)". The modification was searched on the protein C-terminus, D, or E, so any peptide carrying UniMod:734 at its C-terminus on a residue other than D or E must be modified at the protein C-terminus). Is this because I generated the spectral library using FragPipe?

report.pr_matrix.xlsx
report.log.txt

@vdemichev
Owner

I can't really comment on FragPipe algorithms (better to ask the FragPipe team, they are usually very helpful on GitHub). As for the spectral library generated by FragPipe, you can just examine it in R or Python: it's a simple text table (library.tsv), so you can see what is there. DIA-NN just searches the peptides in the library.
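As a minimal sketch of that kind of inspection, the Python below lists the unique modified sequences in a tab-separated library that end with a modification. The column name `ModifiedPeptideSequence` and the helper name `cterm_modified_peptides` are assumptions made for illustration; check the header of your own library.tsv before relying on them.

```python
import csv
import io

def cterm_modified_peptides(tsv_text, column="ModifiedPeptideSequence"):
    """Return unique modified sequences that end with a modification.

    The column name is an assumption; adjust it to match your library.tsv.
    """
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    found = []
    for row in reader:
        seq = row[column].strip()
        # A trailing ')' or ']' means the last residue or the C-terminus is modified.
        if seq.endswith((")", "]")) and seq not in found:
            found.append(seq)
    return found

# Tiny stand-in for library.tsv; a real file has more columns and many more rows.
SAMPLE = (
    "ModifiedPeptideSequence\tPrecursorCharge\n"
    "E[216.07462]VAGAKPHITAAEGK\t2\n"
    "AENLGGPGAGAGTLAGKDA.[87.026746]\t2\n"
    "AENLGGPGAGAGTLAGKDA.[87.026746]\t2\n"
    "E(UniMod:734)VD[202.05896]ATSPAPSTSSTVK\t3\n"
)

print(cterm_modified_peptides(SAMPLE))
# ['AENLGGPGAGAGTLAGKDA.[87.026746]']
```

For a real library, pass the file contents instead of `SAMPLE`, for example `open("library.tsv").read()`. A peptide that ends in a modified residue is also reported, so the result still needs to be checked against the protein sequences.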

@Shinya-Watanabe
Author

I just took a quick look at the library.tsv generated by FragPipe, and it contains peptides with modifications on the protein C-terminus. I will check whether FragPipe can make an in silico spectral library with protein C-terminal modifications, so that I can use it to run DIA-NN.

Thank you for your help, Vadim!
Best,
Shinya

@vdemichev
Owner

I just took a quick look at the library.tsv generated by FragPipe, and it contains peptides with modifications on the protein C-terminus

So does searching with it using DIA-NN produce an expected result?

@Shinya-Watanabe
Author

Shinya-Watanabe commented Oct 11, 2024

Yes, I got a somewhat expected result (some peptides modified at the protein C-terminus), but not the best one (the PTM probability info is missing), probably because PTMProphet was disabled. I attached the FragPipe log. PTMProphet itself has an issue with searching modifications on the protein C-terminus; I raised this with the FragPipe team (Nesvilab/FragPipe#1815), and they are working on it. I assume the DIA-NN output (e.g., report_pr_matrix.tsv) does not contain PTM probabilities because PTMProphet was disabled.

Let me know if you want me to test something or provide you my dataset.

log_2024-10-06_07-22-52.txt

@vdemichev
Owner

does not contain PTM probability because of disabled PTMProphet.

You need to use --peptidoforms; DIA-NN will then produce peptidoform q-values. For localisation, you need to declare the modifications with --var-mod.

Best,
Vadim

@Shinya-Watanabe
Author

  1. I tried --peptidoforms in "cmd line opts" for DIA-NN in FragPipe, and it returned "WARNING: unrecognised option [--peptidoforms]".
  2. "--var-mod" for DIA-NN in FragPipe did not produce localization probability columns in the output.
  3. I tried the DIA-NN GUI with the "library.tsv" created by FragPipe. An unknown modification error terminated the process: "ERROR: D:\diann\src\diann.cpp: 4064: unknown modification: 216.07462". I did not add a modification mass of 216.07462 when creating the spectral library, but it is in library.tsv. It may have been produced by FragPipe for some reason.
DIA-NN_bBJ7Kr3Pe8

@vdemichev
Owner

This happens because FragPipe packages an old DIA-NN version.
Solutions:

  • Use the library generated by FragPipe in DIA-NN 1.9.1.
  • Link FragPipe to 1.9.1 (please see FragPipe docs on how to do that).
  • Use --monitor-mod UniMod:734 --monitor-mod PS instead of --peptidoforms (--var-mod also needs to be specified).

I did not add a modification mass of 216.07462 when creating the spectral library, but it is in library.tsv. It may have been produced by FragPipe for some reason.

You can just declare it with --mod.
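The options mentioned in this thread can be assembled into a single string for the "cmd line opts" box. The sketch below is only an illustration under stated assumptions: the helper name `build_mod_options` and its `legacy` switch are hypothetical, the `name,mass,sites` layout is copied from the documentation example quoted above (`--var-mod UniMod:1,42.010565,*n`), and the example values are the ones reported later in this thread rather than verified settings.

```python
def build_mod_options(var_mods, legacy=False):
    """Build an options string from (name, mass, sites) tuples.

    legacy=False adds --peptidoforms (for DIA-NN versions that accept it).
    legacy=True adds one --monitor-mod per modification name instead,
    as suggested in this thread for older DIA-NN versions.
    """
    parts = []
    for name, mass, sites in var_mods:
        # Masses are passed as strings so they are written exactly as given.
        parts += ["--var-mod", f"{name},{mass},{sites}"]
    if legacy:
        for name, _mass, _sites in var_mods:
            parts += ["--monitor-mod", name]
    else:
        parts.append("--peptidoforms")
    return " ".join(parts)

mods = [("87.03203", "87.03203", "DE")]
print(build_mod_options(mods))
# --var-mod 87.03203,87.03203,DE --peptidoforms
print(build_mod_options(mods, legacy=True))
# --var-mod 87.03203,87.03203,DE --monitor-mod 87.03203
```

Keeping the modification list in one place makes it harder to forget the --var-mod declaration, which both variants need.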

@Shinya-Watanabe
Author

Shinya-Watanabe commented Oct 16, 2024

Use the library generated by FragPipe in DIA-NN 1.9.1.

This worked. I downloaded DIA-NN 1.9.1 and used it to replace the old DIA-NN in FragPipe.

Use --monitor-mod UniMod:734 --monitor-mod PS instead of --peptidoforms (--var-mod also needs to be specified).

Thank you. I will try this too. I figured out that a modification in the spectral library created by FragPipe is named either UniMod:XXX (if the mass matches an entry in UniMod) or by the mass itself (e.g., 87.03203). Thus, I needed to declare --mod 87.03203 or --var-mod 87.03203,87.03203,DE. Also, if the modification mass is not in UniMod, FragPipe writes library.tsv with amino acid residue mass + modification mass as the modification name (e.g., for E, 216.07462; for D, 202.05896; for the C-terminus, 87.026746).
Here are some examples:

E[216.07462]VAGAKPHITAAEGK
AENLGGPGAGAGTLAGKDA.[87.026746]
E(UniMod:734)VD[202.05896]ATSPAPSTSSTVK

Best,
Shinya

@vdemichev
Owner

Hi Shinya,

Yes, it's fine if the name is different. DIA-NN accepts arbitrary strings in brackets ([ ] or ( )) as modification names; it is just important to let DIA-NN know about them using one of --mod, --var-mod or --fixed-mod.

Best,
Vadim
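One way to find out which names still need declaring is to collect every bracketed string from the library's modified sequences and compare it with the names already declared. This is a minimal sketch; the helper name `undeclared_mods` and the `declared` set are hypothetical, and the sequences are the examples quoted above.

```python
import re

# Captures the text inside [...] or (...) in a modified peptide sequence.
MOD_NAME = re.compile(r"[\[(]([^\])]+)[\])]")

def undeclared_mods(sequences, declared):
    """Return modification names found in the sequences but not in `declared`."""
    names = set()
    for seq in sequences:
        names.update(MOD_NAME.findall(seq))
    return sorted(names - set(declared))

sequences = [
    "E[216.07462]VAGAKPHITAAEGK",
    "AENLGGPGAGAGTLAGKDA.[87.026746]",
    "E(UniMod:734)VD[202.05896]ATSPAPSTSSTVK",
]

# Assume only UniMod:734 has been declared so far.
print(undeclared_mods(sequences, declared={"UniMod:734"}))
# ['202.05896', '216.07462', '87.026746']
```

Each name reported this way is a candidate for a --mod, --var-mod or --fixed-mod declaration. The names are sorted as strings, not as numbers.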

@Shinya-Watanabe
Author

Thank you, Vadim! I am looking forward to the function to recognize protein c-terminal modifications.

Best,
Shinya
