
MUBD-DecoyMaker 3.0: Making Maximal Unbiased Benchmarking Data Sets with Deep Reinforcement Learning

Introduction

MUBD-DecoyMaker 3.0 is a new computational tool for making Maximal Unbiased Benchmarking Data Sets (MUBD) for in silico screening. Compared with the two earlier versions, i.e. MUBD-DECOYMAKER (the Pipeline Pilot-based version, or MUBD-DecoyMaker 1.0) and MUBD-DecoyMaker 2.0, MUBD-DecoyMaker 3.0 has two noteworthy features:

Virtual molecules generated by a recurrent neural network (RNN)-based molecular generator with reinforcement learning (RL), instead of chemical library molecules, constitute the unbiased decoy set (UDS) component of MUBD.
The criteria (or rules) for an ideal decoy defined in the earlier versions are integrated into a new scoring function that RL uses to fine-tune the generator.

Below is how to implement and run MUBD-DecoyMaker 3.0.

Requirements

REINVENT is used to generate the virtual decoys of MUBD 3.0, so users must first install it as instructed in its own documentation. The corresponding conda environment, reinvent.v3.2, is used for virtual decoy generation. Please note that we have modified the PyPI packages reinvent_chemistry and reinvent_scoring to include our MUBD-specific scoring functions; the modified copies are provided in this repository. A second conda environment, MUBD3.0, is used for preprocessing and postprocessing.

Install REINVENT.
Clone this repository and navigate to it:

$ git clone https://github.com/Sooooooap/MUBD3.0.git
$ cd MUBD3.0

Replace the packages reinvent_chemistry and reinvent_scoring with our modified ones:

$ conda activate reinvent.v3.2 
$ pip show reinvent_chemistry # Location: ~/anaconda3/envs/reinvent.v3.2/lib/python3.7/site-packages
$ cp -r reinvent_chemistry/ reinvent_scoring/ ~/anaconda3/envs/reinvent.v3.2/lib/python3.7/site-packages
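The site-packages path above depends on where conda is installed. As a hedged alternative, the following minimal Python sketch copies the two modified packages into the site-packages directory of whichever interpreter runs it, so it should be run with the Python of the activated reinvent.v3.2 environment. The function name install_modified_packages and its parameters are illustrative and not part of this repository; it merges files into any existing package directory, similar to cp -r.

```python
import shutil
import sysconfig
from pathlib import Path

# The two packages shipped in this repository that replace the PyPI versions.
PACKAGES = ("reinvent_chemistry", "reinvent_scoring")


def _merge_copy(src, dst):
    """Copy every file under src into dst, keeping the directory layout."""
    dst.mkdir(parents=True, exist_ok=True)
    for path in src.rglob("*"):
        out = dst / path.relative_to(src)
        if path.is_dir():
            out.mkdir(parents=True, exist_ok=True)
        else:
            out.parent.mkdir(parents=True, exist_ok=True)
            shutil.copy2(path, out)


def install_modified_packages(repo_dir, site_packages=None, dry_run=True):
    """Return the (source, destination) pairs and copy them unless dry_run."""
    # Default: site-packages of the interpreter that runs this function.
    target = Path(site_packages or sysconfig.get_paths()["purelib"])
    plan = []
    for name in PACKAGES:
        src = Path(repo_dir) / name
        if not src.is_dir():
            raise FileNotFoundError(f"{src} not found; pass the MUBD3.0 clone")
        plan.append((src, target / name))
    if not dry_run:
        for src, dst in plan:
            _merge_copy(src, dst)
    return plan
```

Calling install_modified_packages(".") from the cloned MUBD3.0 directory only reports what would be copied; pass dry_run=False to perform the copy.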

Create the conda environment called MUBD3.0:

$ conda env create -f MUBD3.0.yml

Usage

The ACM Agonists ligand set is used as a test case to demonstrate how to build MUBD with MUBD-DecoyMaker 3.0. All the test files are in the resources directory.

Build unbiased ligand set (ULS)

Run build_uls.py to process the raw ligand set. This script takes the raw ligands in SMILES representation as input (raw_actives.smi) and outputs the unbiased ligand set (Diverse_ligands.csv). Four files describing ligand properties are also generated: Diverse_ligands_PS.csv, Diverse_ligands_PS_maxmin.csv, Diverse_ligands_sims_maxmin.txt and Diverse_ligands_len.txt.

IMPORTANT: Ligand curation, including molecule standardization, salt removal and protonation (charge assignment) at a specific pH range (implemented by Dimorphite-DL), is required if the ligands are not already curated. For ligand curation, we provide the --cure option in build_uls.py. Please note that the raw ligands in this test case are already curated. Use the --help option to see all available options.

$ conda activate MUBD3.0
(MUBD3.0) $ python build_uls.py
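After the script finishes, it can be useful to confirm that all five output files named above were written. The check below is a minimal sketch; the helper name missing_uls_outputs is hypothetical, and the assumption that the files land in the current working directory should be adjusted to wherever build_uls.py writes them on your system.

```python
from pathlib import Path

# Output file names as listed in this README.
EXPECTED_ULS_FILES = (
    "Diverse_ligands.csv",
    "Diverse_ligands_PS.csv",
    "Diverse_ligands_PS_maxmin.csv",
    "Diverse_ligands_sims_maxmin.txt",
    "Diverse_ligands_len.txt",
)


def missing_uls_outputs(directory="."):
    """Return the expected ULS files that are absent or empty in directory."""
    base = Path(directory)
    missing = []
    for name in EXPECTED_ULS_FILES:
        path = base / name
        if not path.is_file() or path.stat().st_size == 0:
            missing.append(name)
    return missing
```

An empty list means every expected file exists and is non-empty; it says nothing about whether the contents are correct.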

Generate potential decoy set

mk_config.py writes out the configuration for MUBD 3.0 virtual decoy generation. To set up the configuration for each ligand automatically and then proceed to the next ligand, we provide gen_decoys.sh. Before running it, replace the placeholders </path/to/REINVENT> and </path/to/MUBD3.0> in the scripts with the corresponding directories on your system.

$ mkdir output
$ chmod +x ./gen_decoys.sh
$ conda activate reinvent.v3.2
(reinvent.v3.2) $ ./gen_decoys.sh
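To make the per-ligand loop concrete, here is a dry-run sketch in Python that only builds the list of planned runs and executes nothing. It is not a transcription of gen_decoys.sh. The function plan_runs, the config.json file name, the start index, and the assumption that REINVENT is launched through an input.py script in its directory are all illustrative; check gen_decoys.sh and the REINVENT documentation for the actual invocation. Only the output/ligand_$idx layout is taken from this README.

```python
from pathlib import Path


def plan_runs(n_ligands, reinvent_dir, mubd_dir, start=0):
    """Build one planned run per ligand without executing anything."""
    runs = []
    for idx in range(start, start + n_ligands):
        # Per-ligand output directory, matching output/ligand_$idx.
        out_dir = Path(mubd_dir) / "output" / f"ligand_{idx}"
        # Hypothetical config location; mk_config.py decides the real one.
        config = out_dir / "config.json"
        # Assumed entry point for REINVENT; verify against your installation.
        command = ["python", str(Path(reinvent_dir) / "input.py"), str(config)]
        runs.append({"idx": idx, "out_dir": out_dir,
                     "config": config, "command": command})
    return runs
```

Printing the command of each planned run is a cheap way to check that both directories were substituted correctly before starting a long generation job.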

Build unbiased decoy set (UDS)

The file output/ligand_$idx/results/scaffold_memory.csv contains the potential decoy set for ligand_$idx. To obtain the unbiased decoy set Final_decoys.csv, the potential decoys are refined by SMILES curation and structural clustering, and then all decoys are pooled and annotated with their property profiles. We provide build_uds.sh, which runs curing_clustering.py and pool_decoys.py automatically to carry out these steps.

$ chmod +x ./build_uds.sh
$ conda activate MUBD3.0
(MUBD3.0) $ ./build_uds.sh
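If build_uds.sh fails or yields fewer decoys than expected, a quick first check is how many potential decoys each ligand produced. The sketch below counts the data rows in every output/ligand_$idx/results/scaffold_memory.csv. The helper name count_potential_decoys is hypothetical, and it assumes each file starts with a single header row; it does not rely on any column names.

```python
import csv
from pathlib import Path


def count_potential_decoys(output_dir="output"):
    """Map each ligand directory name to its number of potential decoys."""
    counts = {}
    pattern = "ligand_*/results/scaffold_memory.csv"
    for path in sorted(Path(output_dir).glob(pattern)):
        with path.open(newline="") as handle:
            reader = csv.reader(handle)
            next(reader, None)  # skip the header row (assumed present)
            # parts[-3] is the ligand_$idx directory name.
            counts[path.parts[-3]] = sum(1 for row in reader if row)
    return counts
```

Ligands that are missing from the result have no scaffold_memory.csv at all, which usually means the generation step did not finish for them.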

Validation

The MUBD is validated with four metrics. Please go through the notebook basic_validation.ipynb for details.

$ conda activate MUBD3.0
(MUBD3.0) $ jupyter notebook
