[Model Zoo] GAT on Tox21 (dmlc#793)
* GAT

* Fix mistake

* Fix

* hotfix

* Fix

* Fix

* Fix

* Fix

* Fix

* Fix

* Fix

* Update

* Update

* Update

* Fix style

* Hotfix

* Hotfix

* Hotfix

* Fix

* Fix

* Update

* CI trial

* Update

* Update

* Update
mufeili authored Aug 27, 2019
1 parent ad15947 commit e590fee
Showing 18 changed files with 1,038 additions and 536 deletions.
2 changes: 1 addition & 1 deletion CONTRIBUTORS.md
@@ -17,10 +17,10 @@ Contributors
* [Tianyi Zhang](https://github.com/Tiiiger): SGC in Pytorch
* [Jun Chen](https://github.com/kitaev-chen): GIN in Pytorch
* [Aymen Waheb](https://github.com/aymenwah): APPNP in Pytorch
* [Chengqiang Lu](https://github.com/geekinglcq): MGCN, SchNet and MPNN in PyTorch

Other improvement
* [Brett Koonce](https://github.com/brettkoonce)
* [@giuseppefutia](https://github.com/giuseppefutia)
* [@mori97](https://github.com/mori97)
* Hao Jin

59 changes: 39 additions & 20 deletions examples/pytorch/model_zoo/chem/README.md
@@ -1,8 +1,8 @@
# DGL for Chemistry

With atoms being nodes and bonds being edges, molecular graphs are among the core objects for study in drug discovery.
As drug discovery is known to be costly and time consuming, deep learning on graphs can be potentially beneficial for
improving the efficiency of drug discovery [1], [2], [9].
With atoms being nodes and bonds being edges, molecular graphs are among the core objects for study in Chemistry.
Deep learning on graphs can be beneficial for various applications in Chemistry such as drug and material discovery
[1], [2], [12].

To make it easy for domain scientists, the DGL team releases a model zoo for Chemistry, focusing on two particular cases
-- property prediction and target generation/optimization.
@@ -14,7 +14,7 @@ the chemistry community and the deep learning community to further their researc

Before you proceed, make sure you have installed the dependencies below:
- PyTorch 1.2
- Check the [official website](https://pytorch.org/) for installation guide
- Check the [official website](https://pytorch.org/) for installation guide.
- RDKit 2018.09.3
- We recommend installation with `conda install -c conda-forge rdkit==2018.09.3`. For other installation recipes,
see the [official documentation](https://www.rdkit.org/docs/Install.html).
@@ -39,17 +39,22 @@ mostly developed based on molecule fingerprints.
Graph neural networks make it possible to learn a data-driven representation of molecules from the atoms, bonds and
molecular graph topology, which may be viewed as a learned fingerprint [3].

### Models

- **Graph Convolutional Network**: Graph Convolutional Networks (GCN) have been one of the most popular graph neural
networks and they can be easily extended for graph level prediction.
- **SchNet**: SchNet is a novel deep learning architecture modeling quantum interactions in molecules which utilizes
continuous-filter convolutional layers [4].
- **Multilevel Graph Convolutional neural Network**: Multilevel Graph Convolutional neural Network (MGCN) is a
well-designed hierarchical graph neural network that directly extracts features from the conformation and spatial
information, followed by multilevel interactions [5].
- **Message Passing Neural Network**: Message Passing Neural Network (MPNN) is a well-designed network with an edge
network (enn) as the front end and Set2Set to output predictions [6].
### Models
- **Graph Convolutional Networks** [3], [9]: Graph Convolutional Networks (GCN) have been one of the most popular graph
neural networks and they can be easily extended for graph level prediction.
- **Graph Attention Networks** [10]: Graph Attention Networks (GATs) incorporate multi-head attention into GCNs,
explicitly modeling the interactions between adjacent atoms.
- **SchNet** [4]: SchNet is a novel deep learning architecture modeling quantum interactions in molecules which utilizes
continuous-filter convolutional layers.
- **Multilevel Graph Convolutional neural Network** [5]: Multilevel Graph Convolutional neural Network (MGCN) is a well-designed
hierarchical graph neural network that directly extracts features from the conformation and spatial information, followed
by multilevel interactions.
- **Message Passing Neural Network** [6]: Message Passing Neural Network (MPNN) is a well-designed network with an edge network (enn)
as the front end and Set2Set for output prediction.

### Example Usage of Pre-trained Models

![](https://s3.us-east-2.amazonaws.com/dgl.ai/model_zoo/drug_discovery/gcn_model_zoo_example.png)
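
The screenshot above illustrates loading a pre-trained GCN and running it on a Tox21 molecule. Below is a minimal
sketch of the same workflow, assuming the 0.3/0.4-era DGL chem API (`dgl.data.chem.Tox21`,
`model_zoo.chem.load_pretrained`, the `'GCN_Tox21'` checkpoint name, and atom features stored under `g.ndata['h']`);
exact names and signatures may differ in other DGL versions.

```python
import torch
from dgl import model_zoo
from dgl.data.chem import Tox21

# Load the Tox21 dataset; each item is assumed to be (SMILES, DGLGraph, label, mask).
dataset = Tox21()

# Download a GCN pre-trained on Tox21 from the model zoo (checkpoint name is an assumption).
model = model_zoo.chem.load_pretrained('GCN_Tox21')
model.eval()

smiles, g, label, mask = dataset[0]
feats = g.ndata.pop('h')  # atom features, assumed to live under the 'h' field

with torch.no_grad():
    logits = model(g, feats)       # one logit per Tox21 task
    probs = torch.sigmoid(logits)  # per-task toxicity probabilities

print(smiles)
print(probs)
```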

## Generative Models

@@ -66,8 +71,14 @@ Generative models are known to be difficult for evaluation. [GuacaMol](https://g
are also two accompanying review papers that are well written [7], [8].

### Models
- **Deep Generative Models of Graphs (DGMG)**: A very general framework for graph distribution learning by progressively
adding atoms and bonds.
- **Deep Generative Models of Graphs (DGMG)** [11]: A very general framework for graph distribution learning by
progressively adding atoms and bonds.

### Example Usage of Pre-trained Models

![](https://s3.us-east-2.amazonaws.com/dgl.ai/model_zoo/drug_discovery/dgmg_model_zoo_example1.png)

![](https://s3.us-east-2.amazonaws.com/dgl.ai/model_zoo/drug_discovery/dgmg_model_zoo_example2.png)
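
As with the screenshots above, generation with a pre-trained DGMG could look roughly like the sketch below. The
checkpoint name `'DGMG_ZINC_canonical'` and the `rdkit_mol=True` argument are assumptions based on that era's model
zoo; verify them against your DGL version.

```python
import torch
from dgl import model_zoo

# Load a DGMG model pre-trained on ZINC with canonical atom ordering
# (the checkpoint name is an assumption).
model = model_zoo.chem.load_pretrained('DGMG_ZINC_canonical')
model.eval()

with torch.no_grad():
    # Sample a molecule by progressively adding atoms and bonds; rdkit_mol=True
    # is assumed to make the model return an RDKit-compatible SMILES string.
    smiles = model(rdkit_mol=True)

print(smiles)
```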

## References

@@ -85,12 +96,20 @@ information processing systems (NeurIPS)*, 2224-2232.
[5] Lu et al. (2019) Molecular Property Prediction: A Multilevel Quantum Interactions Modeling Perspective.
*The 33rd AAAI Conference on Artificial Intelligence*.

[6] Gilmer et al. (2017) Neural Message Passing for Quantum Chemistry. *Proceedings of the 34th International Conference
on Machine Learning* JMLR. 1263-1272.
[6] Gilmer et al. (2017) Neural Message Passing for Quantum Chemistry. *Proceedings of the 34th International Conference on
Machine Learning* JMLR. 1263-1272.

[7] Brown et al. (2019) GuacaMol: Benchmarking Models for de Novo Molecular Design. *J. Chem. Inf. Model.* 59(3),
1096-1108.

[8] Polykovskiy et al. (2019) Molecular Sets (MOSES): A Benchmarking Platform for Molecular Generation Models. *arXiv*.

[9] Goh et al. (2017) Deep learning for computational chemistry. *Journal of Computational Chemistry* 16, 1291-1307.
[9] Kipf et al. (2017) Semi-Supervised Classification with Graph Convolutional Networks.
*The International Conference on Learning Representations (ICLR)*.

[10] Veličković et al. (2018) Graph Attention Networks.
*The International Conference on Learning Representations (ICLR)*.

[11] Li et al. (2018) Learning Deep Generative Models of Graphs. *arXiv preprint arXiv:1803.03324*.

[12] Goh et al. (2017) Deep learning for computational chemistry. *Journal of Computational Chemistry* 16, 1291-1307.
62 changes: 42 additions & 20 deletions examples/pytorch/model_zoo/chem/property_prediction/README.md
@@ -12,14 +12,21 @@ stress response pathways. Each target yields a binary prediction problem. Molecu
into training, validation and test sets with an 80/10/10 ratio. By default we follow their split method.

### Models
- **Graph Convolutional Network** [2]. Graph Convolutional Networks (GCN) have been one of the most popular graph neural
- **Graph Convolutional Network** [2], [3]. Graph Convolutional Networks (GCN) have been one of the most popular graph neural
networks and they can be easily extended for graph level prediction. MoleculeNet [1] reports baseline results of graph
convolutions over multiple datasets.
- **Graph Attention Networks** [7]: Graph Attention Networks (GATs) incorporate multi-head attention into GCNs,
explicitly modeling the interactions between adjacent atoms.

### Usage

To train a model from scratch, simply call `python classification.py`. To skip training and use the pre-trained model,
call `python classification.py -p`.
Use `classification.py` with the following arguments:
```
-m {GCN, GAT}, MODEL, model to use
-d {Tox21}, DATASET, dataset to use
```

If you want to use the pre-trained model, simply add `-p`.
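
If you would rather skip the command-line script and load the pre-trained weights directly in Python, here is a hedged
sketch, assuming the same `model_zoo.chem.load_pretrained` entry point as the top-level README and a `'GAT_Tox21'`
checkpoint name mirroring `'GCN_Tox21'`:

```python
import torch
from dgl import model_zoo
from dgl.data.chem import Tox21

# 'GAT_Tox21' is an assumed checkpoint name mirroring 'GCN_Tox21'.
model = model_zoo.chem.load_pretrained('GAT_Tox21')
model.eval()

dataset = Tox21()
smiles, g, label, mask = dataset[0]  # item layout is an assumption
feats = g.ndata.pop('h')             # assumed atom-feature field

with torch.no_grad():
    probs = torch.sigmoid(model(g, feats))  # predictions for the 12 Tox21 tasks

print(smiles)
print(probs)
```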

We use GPU whenever it is available.

@@ -31,10 +38,16 @@ We use GPU whenever it is available.
| Source | Averaged ROC-AUC Score |
| ---------------- | ---------------------- |
| MoleculeNet [1] | 0.829 |
| [DeepChem example](https://github.com/deepchem/deepchem/blob/master/examples/tox21/tox21_tensorgraph_graph_conv.py) | 0.813 |
| Pretrained model | 0.827 |
| Pretrained model | 0.826 |

Note that the dataset is randomly split so these numbers are only for reference and they do not necessarily suggest
a real difference.

#### GAT on Tox21

Note that due to some possible randomness you may get different numbers for the DeepChem example and our model. To
exactly reproduce the results for this model, please use the pre-trained model as described in the usage section.

| Source | Averaged ROC-AUC Score |
| ---------------- | ---------------------- |
| Pretrained model | 0.827 |

## Dataset Customization

@@ -47,16 +60,20 @@ Regression tasks require assigning continuous labels to a molecule, e.g. molecul

### Dataset

- **Alchemy**. The [Alchemy Dataset](https://alchemy.tencent.com/) is introduced by Tencent Quantum Lab to facilitate the development of new machine learning models useful for chemistry and materials science.
The dataset lists 12 quantum mechanical properties of 130,000+ organic molecules comprising up to 12 heavy atoms (C, N, O, S, F and Cl), sampled from the [GDBMedChem](http://gdb.unibe.ch/downloads/) database.
These properties have been calculated using the open-source computational chemistry program Python-based Simulation of Chemistry Framework ([PySCF](https://github.com/pyscf/pyscf)).
The Alchemy dataset expands on the volume and diversity of existing molecular datasets such as QM9.
- **Alchemy**. The [Alchemy Dataset](https://alchemy.tencent.com/) is introduced by Tencent Quantum Lab to facilitate the development of new
machine learning models useful for chemistry and materials science. The dataset lists 12 quantum mechanical properties of 130,000+ organic
molecules comprising up to 12 heavy atoms (C, N, O, S, F and Cl), sampled from the [GDBMedChem](http://gdb.unibe.ch/downloads/) database.
These properties have been calculated using the open-source computational chemistry program Python-based Simulation of Chemistry Framework
([PySCF](https://github.com/pyscf/pyscf)). The Alchemy dataset expands on the volume and diversity of existing molecular datasets such as QM9.

### Models

- **SchNet**: SchNet is a novel deep learning architecture modeling quantum interactions in molecules which utilizes continuous-filter convolutional layers [3].
- **Multilevel Graph Convolutional neural Network**: Multilevel Graph Convolutional neural Network (MGCN) is a well-designed hierarchical graph neural network that directly extracts features from the conformation and spatial information, followed by multilevel interactions [4].
- **Message Passing Neural Network**: Message Passing Neural Network (MPNN) is a well-designed network with an edge network (enn) as the front end and uses Set2Set to output predictions [5].
- **SchNet**: SchNet is a novel deep learning architecture modeling quantum interactions in molecules which utilizes continuous-filter
convolutional layers [4].
- **Multilevel Graph Convolutional neural Network**: Multilevel Graph Convolutional neural Network (MGCN) is a hierarchical
graph neural network that directly extracts features from the conformation and spatial information, followed by multilevel interactions [5].
- **Message Passing Neural Network**: Message Passing Neural Network (MPNN) is a network with an edge network (enn) as the front end
and Set2Set for output prediction [6].

### Usage

@@ -71,22 +88,27 @@ The model option must be one of 'sch', 'mgcn' or 'mpnn'.

|Model |Mean Absolute Error (MAE)|
|-------------|-------------------------|
|SchNet[3] |0.065|
|MGCN[4] |0.050|
|MPNN[5] |0.056|
|SchNet[4] |0.065|
|MGCN[5] |0.050|
|MPNN[6] |0.056|

## References
[1] Wu et al. (2017) MoleculeNet: a benchmark for molecular machine learning. *Chemical Science* 9, 513-530.

[2] Kipf et al. (2017) Semi-Supervised Classification with Graph Convolutional Networks.
[2] Duvenaud et al. (2015) Convolutional networks on graphs for learning molecular fingerprints. *Advances in neural
information processing systems (NeurIPS)*, 2224-2232.

[3] Kipf et al. (2017) Semi-Supervised Classification with Graph Convolutional Networks.
*The International Conference on Learning Representations (ICLR)*.

[3] Schütt et al. (2017) SchNet: A continuous-filter convolutional neural network for modeling quantum interactions.
[4] Schütt et al. (2017) SchNet: A continuous-filter convolutional neural network for modeling quantum interactions.
*Advances in Neural Information Processing Systems (NeurIPS)*, 992-1002.

[4] Lu et al. (2019) Molecular Property Prediction: A Multilevel Quantum Interactions Modeling Perspective.
[5] Lu et al. (2019) Molecular Property Prediction: A Multilevel Quantum Interactions Modeling Perspective.
*The 33rd AAAI Conference on Artificial Intelligence*.

[5] Gilmer et al. (2017) Neural Message Passing for Quantum Chemistry. *Proceedings of the 34th International Conference on
[6] Gilmer et al. (2017) Neural Message Passing for Quantum Chemistry. *Proceedings of the 34th International Conference on
Machine Learning*, JMLR. 1263-1272.

[7] Veličković et al. (2018) Graph Attention Networks.
*The International Conference on Learning Representations (ICLR)*.