ProtoEnumerator

This repository contains the implementation of the Microstate Enumerator described in the paper Uni-pKa: An Accurate and Physically Consistent pKa Prediction through Protonation Ensemble Modeling.

Introduction

It uses an iterated template-matching algorithm to enumerate all microstates in adjacent macrostates of a molecule's protonation ensemble, starting from at least one microstate stored as SMILES.
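The iteration can be pictured with a toy model. The sketch below is not the code in main.py: it replaces SMARTS template matching with plain set operations, treats a microstate as the set of site indices that carry a proton, and assumes every site can be protonated or deprotonated independently. The names `deprotonate`, `protonate`, and `enumerate_pools` are hypothetical. It only illustrates how two adjacent pools can be grown alternately until neither gains a new member.

```python
from typing import FrozenSet, Set, Tuple

# Toy microstate: the indices of the sites that currently carry a proton.
Microstate = FrozenSet[int]


def deprotonate(state: Microstate) -> Set[Microstate]:
    """All states reachable by removing one proton (stand-in for acid templates)."""
    return {state - {site} for site in state}


def protonate(state: Microstate, n_sites: int) -> Set[Microstate]:
    """All states reachable by adding one proton (stand-in for base templates)."""
    return {state | {site} for site in range(n_sites) if site not in state}


def enumerate_pools(
    seed_a: Set[Microstate], n_sites: int
) -> Tuple[Set[Microstate], Set[Microstate]]:
    """Grow the A (acid) and B (base) pools until a full pass adds nothing."""
    pool_a: Set[Microstate] = set(seed_a)
    pool_b: Set[Microstate] = set()
    while True:
        # A -> B: remove one proton from every known acid microstate.
        new_b = {b for a in pool_a for b in deprotonate(a)} - pool_b
        pool_b |= new_b
        # B -> A: add one proton to every known base microstate.
        new_a = {a for b in pool_b for a in protonate(b, n_sites)} - pool_a
        pool_a |= new_a
        if not new_a and not new_b:
            return pool_a, pool_b


# Three ionizable sites; start from one microstate with protons on sites 0 and 1.
pool_a, pool_b = enumerate_pools({frozenset({0, 1})}, n_sites=3)
```

Starting from a single seed, the loop recovers the other two doubly protonated microstates in the A pool and all three singly protonated microstates in the B pool. In the real tool, the templates in smarts_pattern.tsv decide which sites can change.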

The protonation template file smarts_pattern.tsv modifies and augments the template from the paper MolGpka: A Web Server for Small Molecule pKa Prediction Using a Graph-Convolutional Neural Network and from its open-source implementation (MIT license) in the GitHub repository MolGpKa.

Usage

main.py reconstructs a plain pKa dataset into the Uni-pKa standard format with fully enumerated microstates.

python main.py enum -i <input> -o <output> -m <mode>

The recommended environment is

python = 3.8.13
rdkit = 2021.09.5
numpy = 1.20.3
pandas = 1.5.2

The <input> dataset is assumed to be a CSV-like file with a column storing SMILES. Two cases are allowed for each entry in the dataset.

- The entry contains only one SMILES. The Enumerator builds the protonated/deprotonated macrostate and completes the original macrostate.
  - When <mode> is "A", the SMILES is treated as an acid (placed in the A pool).
  - When <mode> is "B", the SMILES is treated as a base (placed in the B pool).
- The entry contains a string like "A1,...,Am>>B1,...,Bn", where A1,...,Am are comma-separated SMILES of microstates in the acid macrostate (all placed in the A pool), and B1,...,Bn are comma-separated SMILES of microstates in the base macrostate (all placed in the B pool). The Enumerator completes both macrostates.

The <mode>, "A" (default) or "B", determines which pool (A or B) provides the reference structures and the starting point of the enumeration.
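As an illustration of the two entry formats, the following sketch splits one entry into seed SMILES for the A and B pools. It is an assumption about how such an entry could be parsed, not the parser in main.py; the function name `split_entry` is hypothetical.

```python
from typing import List, Tuple


def split_entry(entry: str, mode: str = "A") -> Tuple[List[str], List[str]]:
    """Return (a_pool, b_pool) seed SMILES for one dataset entry."""
    if mode not in ("A", "B"):
        raise ValueError(f"mode must be 'A' or 'B', got {mode!r}")
    if ">>" in entry:
        # Case 2: "A1,...,Am>>B1,...,Bn" already names both macrostates.
        acid_part, base_part = entry.split(">>", 1)
        a_pool = [s.strip() for s in acid_part.split(",") if s.strip()]
        b_pool = [s.strip() for s in base_part.split(",") if s.strip()]
        return a_pool, b_pool
    # Case 1: a single SMILES goes into the pool selected by the mode.
    smiles = entry.strip()
    return ([smiles], []) if mode == "A" else ([], [smiles])


# Acetic acid given as a single SMILES and treated as an acid.
single = split_entry("CC(=O)O", mode="A")

# Glycine: the cation as the acid macrostate, the zwitterion and the
# neutral form as the base macrostate.
paired = split_entry("[NH3+]CC(=O)O>>[NH3+]CC(=O)[O-],NCC(=O)O")
```

Here `single` holds one seed in the A pool and an empty B pool, while `paired` holds one seed in the A pool and two in the B pool. Enumeration would then fill in whatever microstates are still missing from each pool.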

The <output> dataset is then constructed after the enumeration.
