Conditional Molecule Generator

This repository contains the source code and datasets for the graph-based molecule generator discussed in the article "Multi-Objective De Novo Drug Design with Conditional Graph Generative Model" (https://arxiv.org/abs/1801.07299).

In brief, the generative model is built on conditional graph convolution, so the properties of the generated molecules can be controlled through the conditional code.

Note

We have made several updates compared to the previous version:

The model is now implemented using MXNet. We are also working on TensorFlow and PyTorch versions of the model; please check out the branches torch and tf if you are interested.
A new graph generative model with molecule-level recurrence has been added to the repo. See the article for further details.
The pre-trained models are now available in ckpt.tar.gz (download here), along with the predictive models for GSK-3b and JNK3.
Samples generated by the unconditional model are now available in samples.tar.gz (download here).
All datasets are now packed in datasets.tar.gz (download here).
Large files (ckpt.tar.gz, samples.tar.gz and datasets.tar.gz) are now attached as assets to release 1.0.
We have provided a tutorial (examples.ipynb) to demonstrate the usage of our model.
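Once the release archives have been downloaded, they can be unpacked from Python as well as from the shell. The following is a minimal sketch, not part of the repository: the helper name unpack_archive is hypothetical, and it assumes the archives sit in the current directory under the file names listed above.

```python
import os
import tarfile

# Archive names as listed in the notes above.
ARCHIVES = ("ckpt.tar.gz", "samples.tar.gz", "datasets.tar.gz")


def unpack_archive(archive_path, dest_dir="."):
    """Extract a .tar.gz archive into dest_dir and return its member names.

    Hypothetical helper: refuses archives whose members would land
    outside dest_dir.
    """
    dest_root = os.path.realpath(dest_dir)
    with tarfile.open(archive_path, "r:gz") as archive:
        names = archive.getnames()
        for name in names:
            target = os.path.realpath(os.path.join(dest_root, name))
            inside = target == dest_root or target.startswith(dest_root + os.sep)
            if not inside:
                raise ValueError("unsafe path in archive: %s" % name)
        archive.extractall(dest_root)
    return names


if __name__ == "__main__":
    # Only unpack the archives that have actually been downloaded.
    for archive_name in ARCHIVES:
        if os.path.exists(archive_name):
            print("%s: %d entries" % (archive_name, len(unpack_archive(archive_name))))
```

The path check is a precaution for archives obtained from a download link; it is not something the repository itself prescribes.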

Requirements

This repo is built using Python 2.7, and utilizes the following packages:

MXNet == 1.1.0
RDKit == 2017.03.3
Numpy == 1.13.3
Scikit-learn == 0.19.1 (for the predictive model)

To ease the installation process, a Docker environment will be added to the repo in a future release.
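Until that environment is available, a quick way to see whether a local setup matches the pinned versions is to compare each package's __version__ attribute against the list above. The sketch below is an illustration rather than part of the repo: the function names are hypothetical, and it assumes each package exposes __version__ under the import names mxnet, rdkit, numpy and sklearn.

```python
import importlib

# Versions pinned in the requirements list, keyed by import name.
PINNED = {
    "mxnet": "1.1.0",
    "rdkit": "2017.03.3",
    "numpy": "1.13.3",
    "sklearn": "0.19.1",
}


def installed_version(module_name):
    """Return the module's __version__, or None if it cannot be loaded."""
    try:
        module = importlib.import_module(module_name)
    except Exception:
        # Missing package or a broken native dependency: treat as absent.
        return None
    return getattr(module, "__version__", None)


def find_mismatches(pinned, lookup=installed_version):
    """Return {module: (expected, found)} for each module whose version differs."""
    mismatches = {}
    for name, expected in sorted(pinned.items()):
        found = lookup(name)
        if found != expected:
            mismatches[name] = (expected, found)
    return mismatches


if __name__ == "__main__":
    for name, (expected, found) in sorted(find_mismatches(PINNED).items()):
        print("%s: expected %s, found %s" % (name, expected, found))
```

A package that is not installed is reported with a found value of None, so the same output covers both missing and mismatched packages.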

Quick start

Project structure

train.py: main training script.
mx_mg: package for the molecule generative model:
- data: packages for data processing workflows:
  - conditionals.py: callables used to generate the conditional codes for molecules
  - data_struct.py: defines atom types and bond types
  - dataloaders.py, datasets.py and samplers.py: data loading logic
  - utils.py: utility functions
- models: library for graph generative models
  - modules.py: defines modules (or blocks) such as graph convolution
  - networks.py: defines networks (MolMP, MolRNN and CMolRNN)
  - functions.py: autograd.Function objects and operations
- builders.py: utilities for building molecules using generative models
rdkit_contrib: functions used to calculate QED and SAscore (for older versions of RDKit)
examples.ipynb: tutorial
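To make the idea of a "callable that generates a conditional code" more concrete, here is a minimal, purely illustrative sketch. It does not reproduce the classes in conditionals.py: the class name PropertyCode, the property keys and the numeric values are all hypothetical, and the properties are assumed to have been computed beforehand (for example with the QED and SAscore functions mentioned above).

```python
class PropertyCode(object):
    """Hypothetical callable mapping precomputed properties to a conditional code."""

    def __init__(self, names):
        # Fixed ordering, so every molecule yields a vector with the same layout.
        self.names = list(names)

    def __call__(self, properties):
        # properties: dict from property name to numeric value for one molecule.
        return [float(properties[name]) for name in self.names]


if __name__ == "__main__":
    to_code = PropertyCode(["qed", "sa_score"])
    # Illustrative values only, not taken from any dataset.
    print(to_code({"qed": 0.8, "sa_score": 2.5}))
```

Wrapping the mapping in a callable object keeps the choice of properties in one place, so the same object can be applied to every molecule in a dataset.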

Usage

To train the model, first unpack datasets.tar.gz (download here) to the current directory, then run:

./train.py {molmp|molrnn|scaffold|prop|kinase} path/to/output

Here {molmp|molrnn|scaffold|prop|kinase} selects the model type, and path/to/output is the directory where the model's checkpoint and log files are saved. The following call:

./train.py {molmp|molrnn|scaffold|prop|kinase} -h

gives help for each model type.
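When launching several training runs from a script, the two calls above can be assembled programmatically. The sketch below is an assumption-laden illustration: the helper names are hypothetical, it uses only the positional arguments and the -h flag shown above, and it assumes train.py is executable in the current directory.

```python
import subprocess

# Model types accepted by train.py, as listed above.
MODEL_TYPES = ("molmp", "molrnn", "scaffold", "prop", "kinase")


def build_train_command(model_type, output_dir=None, show_help=False):
    """Return the argument list for one train.py invocation."""
    if model_type not in MODEL_TYPES:
        raise ValueError("unknown model type: %r" % (model_type,))
    command = ["./train.py", model_type]
    if show_help:
        command.append("-h")
    elif output_dir is None:
        raise ValueError("output_dir is required unless show_help is set")
    else:
        command.append(output_dir)
    return command


def run_training(model_type, output_dir):
    """Run train.py and raise CalledProcessError if it exits with an error."""
    subprocess.check_call(build_train_command(model_type, output_dir))


if __name__ == "__main__":
    # Print the command instead of running it, so this is safe to try anywhere.
    print(" ".join(build_train_command("molrnn", "path/to/output")))
```

Any further options should be taken from the -h output of the chosen model type rather than guessed.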

Todo list

Docker environment

For any questions | problems | criticisms | ...

Please contact me. Email: [email protected] or [email protected]
