## Visualising Molecular Graphs: now showing first and last five molecules from all three datasets
smg3d committed Aug 19, 2024
1 parent 04bf4c1 commit f7cf648
Showing 1 changed file with 47 additions and 6 deletions.
53 changes: 47 additions & 6 deletions geometric_gnn_101.ipynb
@@ -562,9 +562,9 @@
"source": [
"## Visualising Molecular Graphs\n",
"\n",
"To get a better understanding of how the QM9 molecular graphs look like, let's visualise a few samples from the training set along with their corresponding target (their dipole moment).\n",
"To get a better understanding of how the QM9 molecular graphs look like, let's visualise a few samples from the three sets along with their corresponding target (their normalized electric dipole moment).\n",
"\n",
"In the following plot we visualise **sparse graphs** where edges represent physical connections (i.e. bonds). In this practical, however, we will use **fully-connected graphs** and encode the graph structure in the attributes of each. Later in this practical, we will study the advantages and downsides of both approaches.\n",
"In the following plots we visualise **sparse graphs** where edges represent physical connections (i.e. bonds). In this practical, however, we will use **fully-connected graphs** and encode the graph structure in the attributes of each. Later in this practical, we will study the advantages and downsides of both approaches.\n",
"\n",
"**❗️Note:** we have implemented some code for you to convert the PyG graph into a Molecule object that can be used by RDKit, a python package for chemistry and visualing molecules. It is not important for you to understand RDKit beyond visualisation purposes."
]
@@ -577,11 +577,52 @@
},
"outputs": [],
"source": [
"num_viz = 50\n",
"mols = [to_rdkit(train_dataset[i]) for i in range(num_viz)]\n",
"values = [str(round(float(train_dataset[i].y), 3)) for i in range(num_viz)]\n",
"num_viz = 10\n",
"draw_range_train = list(range(0, 5)) + list(range(len(train_dataset)-5, len(train_dataset)))\n",
"draw_range_val = list(range(0, 5)) + list(range(len(val_dataset)-5, len(val_dataset)))\n",
"draw_range_test = list(range(0, 5)) + list(range(len(test_dataset)-5, len(test_dataset)))\n",
"\n",
"Chem.Draw.MolsToGridImage(mols, legends=[f\"y = {value}\" for value in values], molsPerRow=5)"
"print(\"First and last five molecules in the TRAINING dataset\")\n",
"print(\" with normalized electric dipole moment (original, non-normalized, value in parenthesis)\")\n",
"mols = [to_rdkit(train_dataset[i]) for i in draw_range_train]\n",
"values_norm = [str(round(float(train_dataset[i].y), 3)) for i in range(num_viz)]\n",
"values_denorm = [str(round(float(train_dataset[i].y * std + mean), 3)) for i in draw_range_train]\n",
"legs = [str(round(float(train_dataset[i].y), 3)) + \" (\" +\n",
" str(round(float(train_dataset[i].y * std + mean), 3)) + \")\" for i in range(num_viz)]\n",
"Chem.Draw.MolsToGridImage(mols, legends=legs, molsPerRow=5)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"print(\"First and last five molecules in the VALIDATION dataset\")\n",
"print(\" with normalized electric dipole moment (original, non-normalized, value in parenthesis)\")\n",
"mols = [to_rdkit(val_dataset[i]) for i in draw_range_val]\n",
"values_norm = [str(round(float(val_dataset[i].y), 3)) for i in range(num_viz)]\n",
"values_denorm = [str(round(float(val_dataset[i].y * std + mean), 3)) for i in draw_range_val]\n",
"values = [str(round(float(val_dataset[i].y), 3)) for i in range(num_viz)]\n",
"legs = [str(round(float(val_dataset[i].y), 3)) + \" (\" +\n",
" str(round(float(val_dataset[i].y * std + mean), 3)) + \")\" for i in range(num_viz)]\n",
"Chem.Draw.MolsToGridImage(mols, legends=legs, molsPerRow=5)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"print(\"First and last five molecules in the TESTING dataset\")\n",
"print(\" with normalized electric dipole moment (original, non-normalized, value in parenthesis)\")\n",
"mols = [to_rdkit(test_dataset[i]) for i in draw_range_test]\n",
"values_norm = [str(round(float(test_dataset[i].y), 3)) for i in range(num_viz)]\n",
"values_denorm = [str(round(float(test_dataset[i].y * std + mean), 3)) for i in draw_range_test]\n",
"legs = [str(round(float(test_dataset[i].y), 3)) + \" (\" +\n",
" str(round(float(test_dataset[i].y * std + mean), 3)) + \")\" for i in range(num_viz)]\n",
"Chem.Draw.MolsToGridImage(mols, legends=legs, molsPerRow=5)"
]
},
{
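The three plotting cells in this diff repeat the same index selection and legend formatting. A minimal sketch of how that logic could be shared, assuming `mean` and `std` are the scalars used earlier in the notebook to normalise the targets. `first_last_indices` and `make_legends` are hypothetical helper names introduced here for illustration; they are not part of the notebook, PyG, or RDKit, and the toy `targets` list stands in for the `.y` values of a dataset.

```python
def first_last_indices(n_total, k=5):
    """Return the indices of the first k and last k items, without duplicates."""
    if n_total <= 2 * k:
        # Too few items for two disjoint windows: return everything once.
        return list(range(n_total))
    return list(range(k)) + list(range(n_total - k, n_total))


def make_legends(targets, indices, mean, std, ndigits=3):
    """Format 'normalised (original)' labels for the selected indices only."""
    legends = []
    for i in indices:
        norm = float(targets[i])
        denorm = norm * std + mean  # undo the (y - mean) / std normalisation
        legends.append(f"{round(norm, ndigits)} ({round(denorm, ndigits)})")
    return legends


# Toy stand-in for normalised targets (hypothetical values).
targets = [0.5, -1.0, 0.0, 2.0, 1.5, -0.5, 0.25, 1.0, -2.0, 0.75, 3.0, -1.5]
idx = first_last_indices(len(targets), k=2)
legends = make_legends(targets, idx, mean=2.0, std=2.0)
print(idx)
print(legends)
```

In the notebook, `targets[i]` would correspond to something like `train_dataset[i].y`, and the same `idx` list would be used to select the molecules passed to `to_rdkit`, so that molecules and legends always stay aligned.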

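The markdown cell in this diff contrasts sparse graphs (edges are bonds) with fully-connected graphs whose edge attributes carry the bond structure. A rough sketch of that idea in plain Python, under the assumption of a simple binary "is this pair bonded" attribute; the notebook's real edge attributes may encode bond types differently, and `fully_connected_edges` is a hypothetical name used only for this illustration.

```python
from itertools import permutations


def fully_connected_edges(num_atoms, bonds):
    """Return all ordered atom pairs plus a 0/1 bond flag for each pair."""
    bonded = {frozenset(b) for b in bonds}
    # Every ordered pair (i, j) with i != j is an edge in the fully-connected graph.
    edges = list(permutations(range(num_atoms), 2))
    is_bond = [1 if frozenset(e) in bonded else 0 for e in edges]
    return edges, is_bond


# Toy three-atom molecule: atom 0 is bonded to atoms 1 and 2.
edges, is_bond = fully_connected_edges(3, bonds=[(0, 1), (0, 2)])

# The sparse graph is recovered by keeping only the flagged edges.
sparse_edges = [e for e, flag in zip(edges, is_bond) if flag]
print(edges)
print(is_bond)
print(sparse_edges)
```

The fully-connected version has `n * (n - 1)` directed edges regardless of chemistry, while the sparse version keeps only two directed edges per bond; the attribute is what lets a model still distinguish bonded from non-bonded pairs.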