Creates a custom PGN in silico MS2 spectra database

jerickwan/PGN_MS2

PGN_MS2: In silico MS/MS prediction for peptidoglycan profiling

Introduction

PGN_MS2 is a computational tool that generates a customizable peptidoglycan (PGN) database from user-defined parameters. It can also simulate an MS/MS spectrum for each PGN and compile the predicted spectra into a spectral library in the NIST format (.msp). This library is compatible with open-access and vendor software, e.g. MS-DIAL, for matching and scoring experimental MS/MS peaks, which enables automated PGN identification. Read the open access paper here.

Summary

Workflow

Installation

PGN_MS2 is written in Python 3.9 and uses RDKit to manipulate molecules. A graphical user interface (built with easygui) is available. The following Python packages are required:

rdkit
pandas
numpy
pyyaml (imported as yaml)
joblib
easygui

The Python environment can be created with Conda using the following command:

conda env create -f /PATH/OF/PGN_MS2/environment.yaml
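
After creating and activating the environment, a quick check that the required packages are importable can save a failed run later. The snippet below is a minimal sketch, not part of PGN_MS2: `missing_modules` is a hypothetical helper, and the module names are the import names of the packages listed above.

```python
from importlib.util import find_spec

# Import names of the packages listed in the Installation section.
REQUIRED_MODULES = ["rdkit", "pandas", "numpy", "yaml", "joblib", "easygui"]


def missing_modules(names):
    """Return the module names that cannot be found in the current environment."""
    return [name for name in names if find_spec(name) is None]


missing = missing_modules(REQUIRED_MODULES)
if missing:
    print("Missing modules:", ", ".join(missing))
else:
    print("All required modules are importable.")
```

Run it with the interpreter of the Conda environment created above; any module it reports as missing should be installed before launching PGN_MS2.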

Running PGN_MS2

GUI

The PGN_MS2 GUI can be launched from the command line with the following command:

python /PATH/OF/PGN_MS2/UserInterface.py

A more detailed user guide for the GUI can be found here.

Running from IDE

Alternatively, PGN_MS2 can be run from an IDE.* Sample code is provided in ManualRun.py.

*Spyder 3.9 is not compatible with RDKit and must be run from a separate environment. See Spyder's FAQ for more information.

Output

Output is stored in /output. Each file is named with a prefix comprising the starting datetime and a user-given name (e.g. 20240605_Ecoli). The outputs are divided among three subfolders as follows:

| Subfolder | Filename | Description |
| --- | --- | --- |
| compounds | [prefix].xlsx | MS1 database in spreadsheet format. Monomers, dimers and trimers are shown on separate sheets. |
| compounds | [prefix].pickle | MS1 database in pickle format. |
| compounds | [prefix].yaml | User-defined settings saved in YAML format. |
| compounds | [prefix]_graphical_summary.svg | Graphical summary of the settings used to generate the PGN library. |
| msp | [prefix].msp | MS2 database. Different adduct forms are given as separate entries. |
| msp | [prefix]_[number].pickle | MS2 database in pickle format. Saved in batches of 5,000 compounds; [number] indicates the batch. |
| peaklists | [prefix]_spectradata.xlsx | MS2 database in spreadsheet format. Each batch has its own sheet. Each compound is presented as its own table containing the 200 most intense ions. |
| peaklists | [prefix]_iondata.xlsx | All ions and their respective structures, in tabular form. |
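
To inspect the [prefix].msp file outside MS-DIAL, a small parser is usually enough. The sketch below assumes the common MSP layout: each entry starts with a `NAME:` line, metadata follows as `KEY: value` lines, and a `Num Peaks: N` line precedes N lines of whitespace-separated m/z and intensity. The field names actually written by PGN_MS2 may differ, `parse_msp` is a hypothetical helper, and the sample entries use made-up values.

```python
def parse_msp(text):
    """Parse MSP-style text into a list of {'fields': dict, 'peaks': list} entries."""
    entries = []
    current = None
    peaks_left = 0
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if peaks_left > 0 and current is not None:
            # Peak line: the first two whitespace-separated tokens are m/z and intensity.
            mz, intensity = line.split()[:2]
            current["peaks"].append((float(mz), float(intensity)))
            peaks_left -= 1
            continue
        if ":" not in line:
            continue
        key, value = line.split(":", 1)
        key, value = key.strip().upper(), value.strip()
        if key == "NAME":
            # A NAME line opens a new entry.
            current = {"fields": {}, "peaks": []}
            entries.append(current)
        if current is None:
            continue
        if key == "NUM PEAKS":
            peaks_left = int(value)
        else:
            current["fields"][key] = value
    return entries


# Made-up example entries, for illustration only.
SAMPLE = """NAME: hypothetical PGN monomer
PRECURSORMZ: 500.25
PRECURSORTYPE: [M+H]+
Num Peaks: 2
204.0867 100
138.0550 45

NAME: hypothetical PGN dimer
PRECURSORMZ: 900.5
PRECURSORTYPE: [M+2H]2+
Num Peaks: 1
204.0867 80
"""

entries = parse_msp(SAMPLE)
for entry in entries:
    print(entry["fields"]["NAME"], entry["fields"]["PRECURSORTYPE"], len(entry["peaks"]))
```

For a real library, read the file with `open(path, encoding="utf-8").read()` and pass the text to `parse_msp`. Because different adduct forms are stored as separate entries, the same compound can appear more than once with different precursor types.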

Supported PGN Chemotypes

PGN_MS2 imports chemical information from an internal library located at:

data/PGN.xlsx

PGN_MS2 was designed to accommodate most PGN chemotypes. It can generate PGNs with:

  • modified glycans: acetylation (increase/decrease), glycolylation (anMurNGlyc) and dehydration (anMurNAc).
  • stem peptide sequences up to eight amino acids long. Supported amino acids include the canonical amino acids as well as non-canonical amino acids commonly found in PGN (mDAP, Orn, γ-isoGln).
  • bridge peptides (i.e. branch peptides, side chains) that are attached to either diamino or dicarboxy amino acids in the stem peptide.
  • a wide variety of modifications, such as lactamization and endopeptidase digestion.
  • two different polymerisation modes: either through glycosidic bonds or peptide bonds.
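
To give a feel for how a combinatorial PGN library grows with the options above, the following sketch enumerates stem peptides from per-position residue choices. It is an illustration only and not PGN_MS2's algorithm or settings format: `POSITION_CHOICES` and `enumerate_stems` are hypothetical, and the residue choices are an example set.

```python
from itertools import product

# Hypothetical per-position residue choices (position 1 first); example values only.
POSITION_CHOICES = [
    ["Ala"],
    ["γ-isoGlu", "γ-isoGln"],
    ["mDAP", "Lys", "Orn"],
    ["Ala"],
    ["Ala"],
]


def enumerate_stems(choices, min_len=2, max_len=8):
    """Yield every stem peptide, as a tuple of residue names, for each allowed length."""
    upper = min(max_len, len(choices))
    for length in range(min_len, upper + 1):
        # Combine one residue per position for the first `length` positions.
        for combo in product(*choices[:length]):
            yield combo


stems = list(enumerate_stems(POSITION_CHOICES))
print(len(stems), "stem peptides")
```

Each extra residue choice at a position multiplies the number of stems of every length that includes it, and glycan modifications, bridge peptides and polymerisation multiply the total further. This is one reason to keep the user-defined parameters close to the chemotype expected in the sample.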

Chemotypes

Misc / Other Links

This tool was built by members of Qiao Lab. MS/MS spectra for all PGNs identified in the aforementioned paper are also available for download on MoNA. PGN_MS2 was used in combination with MS-DIAL, an open-source MS analysis software, available here.

The following can be found in the Supplementary Information of our paper:

  • Nomenclature (Table S1)
  • Overview of GUI (Table S2)

Read the open access paper here.
