GGPM - GraphNN Generation of Photovoltaic Molecules
For a quick setup, run:
conda env create -n _ENV_NAME_ -f chem_env.yml
python3 get_vocab.py --data path/to/training_set.csv --output path/to/save/vocabs.txt
# Training Set
python3 preprocess.py --train path/to/training_set.csv --vocab path/to/save/vocabs.txt --batch-size 20 --ncpu 1
# Validation Set
python3 preprocess.py --train path/to/validation_set.csv --vocab path/to/save/vocabs.txt --batch-size 20 --ncpu 1
python3 vae_train.py --path-to-config path/to/pretrain_configs.json
# Fine-tuning methods: early-stopping & uncertainty-loss-scaling
python3 vae_fine_tune.py --path-to-config path/to/fine_tune_configs.json
# Fine-tuning: individual optimizers for each subnetwork
python3 vae_fine_tune_indv_opt.py --path-to-config path/to/fine_tune_configs.json
python3 reconstruct.py --model model_type --path-to-config path/to/reconstruction_configs.json
python3 optimizer.py --model model_type --path-to-config path/to/configs.json \
    --optimize-type type --output output/file/name.csv \
    --optim-step max_decoding_steps --latent-lr LR_to_update_gradients \
    --delta improvement_threshold --threshold mse_gap_threshold \
    --patience no_improvement_patience
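The vocabulary, preprocessing, and pretraining commands above can be chained from a small driver script. The following is only a sketch: `build_pipeline` is a hypothetical helper that is not part of this repository, and it uses only the script names and flags listed above. By default it prints the commands rather than running them.

```python
import shlex


def build_pipeline(train_csv, valid_csv, vocab_txt, pretrain_config,
                   batch_size=20, ncpu=1):
    """Return the commands from this README, in order, as argument lists.

    Hypothetical helper for illustration; not part of the repository.
    """
    def preprocess(csv_path):
        # The README passes the validation set through --train as well.
        return ["python3", "preprocess.py",
                "--train", csv_path,
                "--vocab", vocab_txt,
                "--batch-size", str(batch_size),
                "--ncpu", str(ncpu)]

    return [
        ["python3", "get_vocab.py", "--data", train_csv, "--output", vocab_txt],
        preprocess(train_csv),
        preprocess(valid_csv),
        ["python3", "vae_train.py", "--path-to-config", pretrain_config],
    ]


if __name__ == "__main__":
    commands = build_pipeline(
        "path/to/training_set.csv",
        "path/to/validation_set.csv",
        "path/to/save/vocabs.txt",
        "path/to/pretrain_configs.json",
    )
    for cmd in commands:
        # Dry run: print each command. To execute instead, use
        # subprocess.run(cmd, check=True) from the standard library.
        print(shlex.join(cmd))
```

Keeping each command as an argument list avoids shell-quoting problems when paths contain spaces.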
NOTICE
- Uses the fragment converter from hg2g.
- Molecules that RDKit cannot read are skipped.
- SMILES containing an asterisk (*) are skipped and removed from the training data.
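The skip rules above can be sketched as a small filter. This is an illustration under assumptions, not the repository's actual code: `keep_smiles` and `filter_smiles` are hypothetical names, and the RDKit check only runs when RDKit is installed.

```python
try:
    from rdkit import Chem  # optional; parse check is skipped if unavailable
except ImportError:
    Chem = None


def keep_smiles(smiles):
    """Return True if a SMILES string passes the skip rules listed above."""
    if not smiles:
        return False
    # Rule: skip any SMILES containing an asterisk.
    if "*" in smiles:
        return False
    # Rule: skip molecules RDKit cannot read (MolFromSmiles returns None).
    if Chem is not None and Chem.MolFromSmiles(smiles) is None:
        return False
    return True


def filter_smiles(smiles_list):
    """Keep only the SMILES strings that pass keep_smiles, whitespace-trimmed."""
    return [s.strip() for s in smiles_list if keep_smiles(s.strip())]
```

For example, `filter_smiles(["CCO", "*CC*", "c1ccccc1"])` drops the middle entry because it contains asterisks.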