IP experiment with DIA quntification #1136

Open
wangrui85 opened this issue Aug 18, 2024 · 4 comments

Comments

wangrui85 commented Aug 18, 2024

Dear Vadim,

Thank you for the great help with our DIA workflow analysis! I have a few questions about details of my AP-MS analysis:

  1. If I need non-normalised quantification, could I replace "Precursor.Normalised" with "Precursor.Quantity":
    protein.groups_Nonormalised <- diann_maxlfq(df[df$Q.Value <= 0.01 & df$PG.Q.Value <= 0.01,],
    group.header="Protein.Group",
    id.header = "Precursor.Id",
    quantity.header = "Precursor.Quantity")

    In issue #1056 ("Question about the filter steps between the main report and the matrix in DIANN 1.9") I read that "diann_maxlfq implements a simple MaxLFQ algorithm different from what DIA-NN uses internally", and I did indeed get a different "protein.groups" result compared with "report.pg_matrix". But how could I use this for further pg quantification without normalisation?
  2. In a previous issue you suggested using MBR and normalisation when searching. Are they compatible with an enrichment experiment, or do I need to disable one of them for an enriched proteome?
  3. The note about imputation says "when a protein is completely absent in some of the biological conditions), we prefer to perform it on the protein level". So how should I process the NA values? Calculate the average or median among valid values? Or, as with LFQ, imputation followed by filtering on valid values?
    Sincerely,
    Zoe
wangrui85 changed the title from "AP experiment with DIA quntification" to "IP experiment with DIA quntification" on Aug 19, 2024
@vdemichev
Owner

Hi Zoe,

could I replace "Precursor.Normalised" with "Precursor.Quantity"

Yes.

But how could I use this for further pg quantification without normalisation?

As you've indicated above or by disabling normalisation in the DIA-NN GUI. For AP-MS disabling normalisation makes sense.

MBR and normalisation when searching

Not sure what you are referring to. For AP-MS, it definitely makes sense to (i) use MBR, (ii) disable normalisation in DIA-NN GUI.

So how should I process the NA values?

In AP-MS you probably don't want to impute at all. But if you need to, because you'd like to use some downstream processing that requires complete profiles, minimal-value imputation on the protein level makes sense.
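As a rough illustration of what minimal-value imputation on the protein level could look like, here is a base R sketch on a toy protein-group matrix. The matrix values, the run names and the helper `impute_min` are made up for the example, and using the global matrix minimum (rather than, say, a per-run minimum) is an assumption, not something stated in this thread.

```r
# Toy protein-level matrix (rows: protein groups, columns: runs).
# All values and names are hypothetical.
pg <- matrix(c(1200,   NA, 950,
                 NA,   NA, 400,
               3000, 2800,  NA),
             nrow = 3, byrow = TRUE,
             dimnames = list(c("P1", "P2", "P3"),
                             c("bait_1", "bait_2", "ctrl_1")))

# Hypothetical helper: replace every missing value with the smallest
# observed intensity in the whole matrix (assumption: global minimum).
impute_min <- function(m) {
  m[is.na(m)] <- min(m, na.rm = TRUE)
  m
}

pg_imputed <- impute_min(pg)
print(pg_imputed)
```

Observed values are left untouched; only the NA cells are filled, so the result can be passed to downstream tools that require complete profiles.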

Best,
Vadim

@wangrui85
Author

wangrui85 commented Aug 19, 2024 via email

@vdemichev
Owner

Hi Zoe,

Does it mean that MaxLFQ from "pg_matrix.tsv" could be used directly when normalisation is disabled?

Yes, although you might want to add --matrix-spec-q 0.01 in this case to Additional options, if you want to use the pg_matrix.

To reproduce pg_matrix from the main report you need to apply filtering as described in https://github.com/vdemichev/DiaNN?tab=readme-ov-file#output and then transform the dataframe from long to wide format (e.g. using diann_matrix).
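The filter-then-pivot idea can be sketched in base R on a toy long-format table. This is only an illustration: the data are made up, the `PG.MaxLFQ` column name and the two 0.01 thresholds are assumptions for the example (the README output section linked above is the reference for the exact filters), and `tapply` stands in here for the long-to-wide step that `diann_matrix` would perform.

```r
# Toy long-format report; values are hypothetical.
report <- data.frame(
  Run           = c("r1", "r1", "r2", "r2", "r2"),
  Protein.Group = c("P1", "P2", "P1", "P2", "P2"),
  Precursor.Id  = c("a",  "b",  "a",  "b",  "c"),
  Q.Value       = c(0.001, 0.02, 0.005, 0.003, 0.004),
  PG.Q.Value    = c(0.001, 0.001, 0.002, 0.001, 0.001),
  PG.MaxLFQ     = c(100, 50, 120, 60, 60),  # assumed column name
  stringsAsFactors = FALSE
)

# Step 1: filter precursors (thresholds are illustrative assumptions).
keep <- report[report$Q.Value <= 0.01 & report$PG.Q.Value <= 0.01, ]

# Step 2: one protein-group quantity per (protein group, run) pair.
pg_long <- unique(keep[, c("Protein.Group", "Run", "PG.MaxLFQ")])

# Step 3: long to wide; combinations with no passing precursor become NA.
pg_wide <- tapply(pg_long$PG.MaxLFQ,
                  list(pg_long$Protein.Group, pg_long$Run),
                  max)
print(pg_wide)
```

In this toy example P2 has no precursor passing the filter in run r1, so that cell of the wide matrix is NA.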

Best,
Vadim

zoe1985 commented Aug 28, 2024

Hi Vadim,

Reading "report.parquet" in R works (TAD can also open it), but process_long_format always fails on it:

df <- read_parquet("report.parquet")  # OK
process_long_format("report.parquet", output_filename = "report-pg-global.tsv",
                    sample_id = "Run",
                    primary_id = "Protein.Group",
                    secondary_id = "Precursor.Id",
                    intensity_col = "Precursor.Quantity",
                    annotation_col = c("Protein.Names", "Genes"),
                    filter_double_less = c("Q.Value" = "0.01",
                                           "Lib.PG.Q.Value" = "0.01"))  # failed

Sincerely,

Rui
