Commit

Created using Colaboratory
udlbook committed Feb 1, 2024
1 parent 21cff37 commit f4fa3e8
Showing 1 changed file with 8 additions and 9 deletions.
17 changes: 8 additions & 9 deletions Notebooks/Chap05/5_2_Binary_Cross_Entropy_Loss.ipynb
@@ -4,7 +4,6 @@
"metadata": {
"colab": {
"provenance": [],
"authorship_tag": "ABX9TyOSb+W2AOFVQm8FZcHAb2Jq",
"include_colab_link": true
},
"kernelspec": {
@@ -199,7 +198,7 @@
{
"cell_type": "markdown",
"source": [
"The left is model output and the right is the model output after the sigmoid has been applied, so it now lies in the range [0,1] and represents the probability, that y=1. The black dots show the training data. We'll compute the the likelihood and the negative log likelihood."
"The left is model output and the right is the model output after the sigmoid has been applied, so it now lies in the range [0,1] and represents the probability, that y=1. The black dots show the training data. We'll compute the likelihood and the negative log likelihood."
],
"metadata": {
"id": "MvVX6tl9AEXF"
@@ -208,7 +207,7 @@
{
"cell_type": "code",
"source": [
"# Return probability under Bernoulli distribution for input x\n",
"# Return probability under Bernoulli distribution for observed class y\n",
"def bernoulli_distribution(y, lambda_param):\n",
" # TODO-- write in the equation for the Bernoulli distribution\n",
" # Equation 5.17 from the notes (you will need np.power)\n",
@@ -269,7 +268,7 @@
"source": [
"# Let's test this\n",
"beta_0, omega_0, beta_1, omega_1 = get_parameters()\n",
"# Use our neural network to predict the mean of the Gaussian\n",
"# Use our neural network to predict the Bernoulli parameter lambda\n",
"model_out = shallow_nn(x_train, beta_0, omega_0, beta_1, omega_1)\n",
"lambda_train = sigmoid(model_out)\n",
"# Compute the likelihood\n",
@@ -336,7 +335,7 @@
{
"cell_type": "markdown",
"source": [
"Now let's investigate finding the maximum likelihood / minimum negative log likelihood solution. For simplicity, we'll assume that all the parameters are fixed except one and look at how the likelihood and log likelihood change as we manipulate the last parameter. We'll start with overall y_offset, beta_1 (formerly phi_0)"
"Now let's investigate finding the maximum likelihood / minimum negative log likelihood solution. For simplicity, we'll assume that all the parameters are fixed except one and look at how the likelihood and negative log likelihood change as we manipulate the last parameter. We'll start with overall y_offset, beta_1 (formerly phi_0)"
],
"metadata": {
"id": "OgcRojvPWh4V"
@@ -359,7 +358,7 @@
" # Run the network with new parameters\n",
" model_out = shallow_nn(x_train, beta_0, omega_0, beta_1, omega_1)\n",
" lambda_train = sigmoid(model_out)\n",
" # Compute and store the three values\n",
" # Compute and store the two values\n",
" likelihoods[count] = compute_likelihood(y_train,lambda_train)\n",
" nlls[count] = compute_negative_log_likelihood(y_train, lambda_train)\n",
" # Draw the model for every 20th parameter setting\n",
@@ -378,7 +377,7 @@
{
"cell_type": "code",
"source": [
"# Now let's plot the likelihood, negative log likelihood, and least squares as a function the value of the offset beta1\n",
"# Now let's plot the likelihood and negative log likelihood as a function of the value of the offset beta1\n",
"fig, ax = plt.subplots()\n",
"fig.tight_layout(pad=5.0)\n",
"likelihood_color = 'tab:red'\n",
@@ -430,12 +429,12 @@
"source": [
"They both give the same answer. But you can see from the likelihood above that the likelihood is very small unless the parameters are almost correct. So in practice, we would work with the negative log likelihood.<br><br>\n",
"\n",
"Again, to fit the full neural model we would vary all of the 10 parameters of the network in the $\\boldsymbol\\beta_{0},\\boldsymbol\\omega_{0},\\boldsymbol\\beta_{1},\\boldsymbol\\omega_{1}$ until we find the combination that have the maximum likelihood / minimum negative log likelihood.<br><br>\n",
"Again, to fit the full neural model we would vary all of the 10 parameters of the network in the $\\boldsymbol\\beta_{0},\\boldsymbol\\Omega_{0},\\boldsymbol\\beta_{1},\\boldsymbol\\Omega_{1}$ until we find the combination that have the maximum likelihood / minimum negative log likelihood.<br><br>\n",
"\n"
],
"metadata": {
"id": "771G8N1Vk5A2"
}
}
]
}
}
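
As the final cell of the diff notes, maximizing the likelihood and minimizing the negative log likelihood pick out the same parameter value; the log is monotonic, so it preserves the location of the maximum. A quick check in the spirit of the notebook (variable names assumed from the sweep sketch above):

    import numpy as np

    # The two criteria should select the same beta_1 setting
    best_beta_ml  = beta_1_vals[np.argmax(likelihoods)]
    best_beta_nll = beta_1_vals[np.argmin(nlls)]
    print(best_beta_ml, best_beta_nll)   # expected to print the same value
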
