[Paper] [Project Page]
Code release for the paper "Multi-modal conditional diffusion model using signed distance functions for metal-organic frameworks generation".
MOFFUSION is a multi-modal conditional diffusion model for MOF generation. MOFFUSION showed exceptional generation performance compared to baseline models in terms of structure validity and property statistics. Diverse modalities of data, including numeric, categorical, text, and their combinations, were successfully handled for the conditional generation of MOFs. Notably, signed distance functions (SDFs) were used as the input representation of MOFs, marking their first use in the generation of porous materials (below). Please visit the Project Page for more details.
We recommend building a conda environment. You might need a different version of cudatoolkit depending on your GPU driver.
conda create -n moffusion python=3.9.18 -y && conda activate moffusion
conda install pytorch==1.11.0 torchvision==0.12.0 torchaudio==0.11.0 cudatoolkit=11.3 -c pytorch
conda install -y -c conda-forge cudatoolkit-dev # this might take some time
conda install -c fvcore -c iopath -c conda-forge fvcore iopath
conda install pytorch3d -c pytorch3d
pip install h5py joblib termcolor scipy einops tqdm matplotlib opencv-python PyMCubes imageio trimesh omegaconf tensorboard notebook Pillow==9.5.0 py3Dmol ipywidgets transformers pormake seaborn
pip install -U scikit-learn
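After installation, a quick check that every package is visible to the interpreter can save a failed notebook run later. The sketch below is not part of the repository; it assumes the usual import names for the packages installed above (several differ from their pip names, e.g. opencv-python is imported as `cv2` and PyMCubes as `mcubes`), and the helper name `missing_modules` is hypothetical.

```python
import importlib.util

# Import names for the packages installed above (assumed; several differ
# from the pip/conda package names).
REQUIRED_MODULES = [
    "torch", "torchvision", "torchaudio", "pytorch3d", "fvcore", "iopath",
    "h5py", "joblib", "termcolor", "scipy", "einops", "tqdm", "matplotlib",
    "cv2", "mcubes", "imageio", "trimesh", "omegaconf", "tensorboard",
    "notebook", "PIL", "py3Dmol", "ipywidgets", "transformers", "pormake",
    "seaborn", "sklearn",
]


def missing_modules(names):
    """Return the module names that cannot be located in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED_MODULES)
    print("Missing modules:", ", ".join(missing) if missing else "none")
```

Once `torch` is importable, `torch.cuda.is_available()` reports whether the installed cudatoolkit matches your GPU driver.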
First, create a folder ./saved_ckpt to hold the pre-trained weights. Then download the pre-trained weights from the provided links and put them in the ./saved_ckpt folder.
mkdir -p saved_ckpt # -p makes this a no-op if the folder already exists
# VQVAE's checkpoint
wget https://figshare.com/ndownloader/files/46925977 -O saved_ckpt/vqvae.pth
# MOF Constructor's checkpoint
wget https://figshare.com/ndownloader/files/46925971 -O saved_ckpt/mof_constructor_topo.pth
wget https://figshare.com/ndownloader/files/46925974 -O saved_ckpt/mof_constructor_BB.pth
# MOFFUSION's checkpoint
## Unconditional model (uncond)
wget https://figshare.com/ndownloader/files/46931689 -O saved_ckpt/moffusion_uncond.pth
## Conditional models (topo, H2, text)
wget https://figshare.com/ndownloader/files/46926004 -O saved_ckpt/moffusion_topo.pth
wget https://figshare.com/ndownloader/files/46931701 -O saved_ckpt/moffusion_H2.pth
wget https://figshare.com/ndownloader/files/46925995 -O saved_ckpt/moffusion_text.pth
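A failed or interrupted download can leave a checkpoint missing or empty, which typically only surfaces later as a loading error. The following sketch, which is not part of the repository, checks the seven file names used in the commands above; the helper name `missing_checkpoints` is hypothetical.

```python
from pathlib import Path

# File names taken from the wget commands above.
EXPECTED_CHECKPOINTS = [
    "vqvae.pth",
    "mof_constructor_topo.pth",
    "mof_constructor_BB.pth",
    "moffusion_uncond.pth",
    "moffusion_topo.pth",
    "moffusion_H2.pth",
    "moffusion_text.pth",
]


def missing_checkpoints(ckpt_dir="./saved_ckpt", expected=EXPECTED_CHECKPOINTS):
    """Return the expected checkpoint names that are absent or 0 bytes long."""
    root = Path(ckpt_dir)
    missing = []
    for name in expected:
        path = root / name
        if not path.is_file() or path.stat().st_size == 0:
            missing.append(name)
    return missing


if __name__ == "__main__":
    missing = missing_checkpoints()
    print("Missing or empty:", ", ".join(missing) if missing else "none")
```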
Please check the provided Jupyter notebooks for how to use the code. First, start the Jupyter notebook server.
jupyter notebook
Then, open one of the following notebooks for the task you want to perform.
- Unconditional generation:
demo_uncond.ipynb
- Conditional generation on topology:
demo_topo.ipynb
- Conditional generation on text:
demo_text.ipynb
- Conditional generation on hydrogen working capacity:
demo_H2.ipynb
- Pore crafting:
demo_pore_crafting.ipynb
Note that the notebooks will automatically save the generated shapes in the ./samples folder.
For example, if you run demo_topo.ipynb, the generated outputs will be saved in ./samples/Demo_topo.
To use the generated structures for other purposes (e.g., molecular simulations), please perform an additional structure optimization step first.
The uploaded version of MOFFUSION utilizes a classical conditional diffusion model for simplicity. However, it can be easily modified to use a classifier-free guidance approach.
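For readers unfamiliar with classifier-free guidance, the sketch below shows its two generic ingredients: randomly dropping the condition during training, and blending conditional and unconditional noise predictions at sampling time. It is a minimal illustration of the general technique, not code from this repository; the function names, `p_uncond`, `null_cond`, and `guidance_scale` are all hypothetical.

```python
import random


def drop_condition(cond, p_uncond, null_cond=None, rng=random):
    """Training side: with probability p_uncond, replace the condition by a
    null token so the same network also learns the unconditional model."""
    return null_cond if rng.random() < p_uncond else cond


def guided_noise(eps_cond, eps_uncond, guidance_scale):
    """Sampling side: blend the two noise predictions.

    guidance_scale = 0 gives the unconditional prediction, 1 gives the
    conditional one, and values above 1 extrapolate toward the condition.
    Works on floats, NumPy arrays, or torch tensors alike, since it only
    uses elementwise arithmetic.
    """
    return eps_uncond + guidance_scale * (eps_cond - eps_uncond)
```

In a sampling loop this would mean two forward passes of the denoiser per step (one with the condition, one with the null token), combined by `guided_noise` before the usual update.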
(Optional) We found that the pormake software sometimes prints an error message even though the structures are generated successfully. If you want to silence the error message, please perform serialization as follows. You only need to do this once, not for each demo.
serialize()
Example of generation for topology conditioning, with a target topology of 'pcb'
Please download the dataset from the following link and place it under ./data/250k/. (Caution: the file is large.)
After extraction, the SDF files in '.npy' format should be located in ./data/250k/resolution_32/.
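Before launching a training run, it can help to confirm that the SDF files ended up in the expected folder. The sketch below is not part of the repository; it only assumes the directory named above, makes no assumption about the array shape, and uses hypothetical helper names.

```python
from pathlib import Path


def find_sdf_files(sdf_dir="./data/250k/resolution_32"):
    """Return the sorted list of .npy files directly inside sdf_dir."""
    return sorted(Path(sdf_dir).glob("*.npy"))


def describe_sdf(path):
    """Load one SDF file and report its shape and dtype."""
    import numpy as np  # imported lazily so that listing works without NumPy

    grid = np.load(path)
    return grid.shape, str(grid.dtype)


if __name__ == "__main__":
    files = find_sdf_files()
    print(f"Found {len(files)} SDF files")
    if files:
        print(files[0].name, describe_sdf(files[0]))
```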
- Train VQVAE
./launchers/train_vqvae.sh
# After training, copy the trained VQVAE checkpoint to the `./saved_ckpt` folder (or any other folder), and specify its path in the launcher file.
- Train MOF-Constructor (Optional)
We encourage users to use the saved MOF-Constructor checkpoint files rather than re-training them.
However, if you want to re-train them, you can do so, as all models are available in the repository.
- Train MOFFUSION (unconditional)
./launchers/train_moffusion_uncond.sh
- Train MOFFUSION conditioned on hydrogen working capacity
./launchers/train_moffusion_H2.sh
- Train MOFFUSION conditioned on topology
./launchers/train_moffusion_topo.sh
- Train MOFFUSION conditioned on text
./launchers/train_moffusion_text.sh
- Train MOFFUSION for multi-conditioning
upcoming!
If you find this code helpful, please consider citing:
- Journal version
@article{,
author={Park, Junkil and Lee, Youhan and Kim, Jihan},
title={Multi-modal conditional diffusion model using signed distance functions for metal-organic frameworks generation},
Journal={Nature Communications},
year={2024},
}
- Preprint version (ChemRxiv)
@article{,
author={Park, Junkil and Lee, Youhan and Kim, Jihan},
title={Multi-modal conditioning for metal-organic frameworks generation using 3D modeling techniques},
Journal={chemrxiv},
year={2024},
}
Coming soon!
This code borrows heavily from SDFUSION. The following packages are required to compute the SDF: pymol, mesh-to-sdf.
This project was funded by the National Research Foundation of Korea under grant No. RS-2024-00337004.
This project is licensed under the MIT License. Please check the LICENSE file for more information.