diff --git a/Notebooks/Chap07/7_1_Backpropagation_in_Toy_Model.ipynb b/Notebooks/Chap07/7_1_Backpropagation_in_Toy_Model.ipynb
index a9eb38b3..e092140a 100644
--- a/Notebooks/Chap07/7_1_Backpropagation_in_Toy_Model.ipynb
+++ b/Notebooks/Chap07/7_1_Backpropagation_in_Toy_Model.ipynb
@@ -1,33 +1,22 @@
{
- "nbformat": 4,
- "nbformat_minor": 0,
- "metadata": {
- "colab": {
- "provenance": [],
- "authorship_tag": "ABX9TyN7JeDgslwtZcwRCOuGuPFt",
- "include_colab_link": true
- },
- "kernelspec": {
- "name": "python3",
- "display_name": "Python 3"
- },
- "language_info": {
- "name": "python"
- }
- },
"cells": [
{
+ "attachments": {},
"cell_type": "markdown",
"metadata": {
- "id": "view-in-github",
- "colab_type": "text"
+ "colab_type": "text",
+ "id": "view-in-github"
},
"source": [
""
]
},
{
+ "attachments": {},
"cell_type": "markdown",
+ "metadata": {
+ "id": "pOZ6Djz0dhoy"
+ },
"source": [
"# **Notebook 7.1: Backpropagation in Toy Model**\n",
"\n",
@@ -36,64 +25,63 @@
"Work through the cells below, running each cell in turn. In various places you will see the words \"TO DO\". Follow the instructions at these places and make predictions about what is going to happen or write code to complete the functions.\n",
"\n",
"Contact me at udlbookmail@gmail.com if you find any mistakes or have any suggestions."
- ],
- "metadata": {
- "id": "pOZ6Djz0dhoy"
- }
+ ]
},
{
+ "attachments": {},
"cell_type": "markdown",
+ "metadata": {
+ "id": "1DmMo2w63CmT"
+ },
"source": [
"We're going to investigate how to take the derivatives of functions where one operation is composed with another, which is composed with a third and so on. For example, consider the model:\n",
"\n",
"\\begin{equation}\n",
- " \\mbox{f}[x,\\boldsymbol\\phi] = \\beta_3+\\omega_3\\cdot\\cos\\Bigl[\\beta_2+\\omega_2\\cdot\\exp\\bigl[\\beta_1+\\omega_1\\cdot\\sin[\\beta_0+\\omega_0x]\\bigr]\\Bigr],\n",
+ " \\text{f}[x,\\boldsymbol\\phi] = \\beta_3+\\omega_3\\cdot\\cos\\Bigl[\\beta_2+\\omega_2\\cdot\\exp\\bigl[\\beta_1+\\omega_1\\cdot\\sin[\\beta_0+\\omega_0x]\\bigr]\\Bigr],\n",
"\\end{equation}\n",
"\n",
"with parameters $\\boldsymbol\\phi=\\{\\beta_0,\\omega_0,\\beta_1,\\omega_1,\\beta_2,\\omega_2,\\beta_3,\\omega_3\\}$.
\n",
"\n",
"This is a composition of the functions $\\cos[\\bullet],\\exp[\\bullet],\\sin[\\bullet]$. I chose these just because you probably already know the derivatives of these functions:\n",
"\n",
- "\\begin{eqnarray*}\n",
+ "\\begin{align}\n",
" \\frac{\\partial \\cos[z]}{\\partial z} = -\\sin[z] \\quad\\quad \\frac{\\partial \\exp[z]}{\\partial z} = \\exp[z] \\quad\\quad \\frac{\\partial \\sin[z]}{\\partial z} = \\cos[z].\n",
- "\\end{eqnarray*}\n",
+ "\\end{align}\n",
"\n",
"Suppose that we have a least squares loss function:\n",
"\n",
"\\begin{equation*}\n",
- "\\ell_i = (\\mbox{f}[x_i,\\boldsymbol\\phi]-y_i)^2,\n",
+ "\\ell_i = (\\text{f}[x_i,\\boldsymbol\\phi]-y_i)^2,\n",
"\\end{equation*}\n",
"\n",
"Assume that we know the current values of $\\beta_{0},\\beta_{1},\\beta_{2},\\beta_{3},\\omega_{0},\\omega_{1},\\omega_{2},\\omega_{3}$, $x_i$ and $y_i$. We could obviously calculate $\\ell_i$. But we also want to know how $\\ell_i$ changes when we make a small change to $\\beta_{0},\\beta_{1},\\beta_{2},\\beta_{3},\\omega_{0},\\omega_{1},\\omega_{2}$, or $\\omega_{3}$. In other words, we want to compute the eight derivatives:\n",
"\n",
- "\\begin{eqnarray*}\n",
- "\\frac{\\partial \\ell_i}{\\partial \\beta_{0}}, \\quad \\frac{\\partial \\ell_i}{\\partial \\beta_{1}}, \\quad \\frac{\\partial \\ell_i}{\\partial \\beta_{2}}, \\quad \\frac{\\partial \\ell_i }{\\partial \\beta_{3}}, \\quad \\frac{\\partial \\ell_i}{\\partial \\omega_{0}}, \\quad \\frac{\\partial \\ell_i}{\\partial \\omega_{1}}, \\quad \\frac{\\partial \\ell_i}{\\partial \\omega_{2}}, \\quad\\mbox{and} \\quad \\frac{\\partial \\ell_i}{\\partial \\omega_{3}}.\n",
- "\\end{eqnarray*}"
- ],
- "metadata": {
- "id": "1DmMo2w63CmT"
- }
+ "\\begin{align}\n",
+ "\\frac{\\partial \\ell_i}{\\partial \\beta_{0}}, \\quad \\frac{\\partial \\ell_i}{\\partial \\beta_{1}}, \\quad \\frac{\\partial \\ell_i}{\\partial \\beta_{2}}, \\quad \\frac{\\partial \\ell_i }{\\partial \\beta_{3}}, \\quad \\frac{\\partial \\ell_i}{\\partial \\omega_{0}}, \\quad \\frac{\\partial \\ell_i}{\\partial \\omega_{1}}, \\quad \\frac{\\partial \\ell_i}{\\partial \\omega_{2}}, \\quad\\text{and} \\quad \\frac{\\partial \\ell_i}{\\partial \\omega_{3}}.\n",
+ "\\end{align}"
+ ]
},
{
"cell_type": "code",
- "source": [
- "# import library\n",
- "import numpy as np"
- ],
+ "execution_count": null,
"metadata": {
"id": "RIPaoVN834Lj"
},
- "execution_count": null,
- "outputs": []
+ "outputs": [],
+ "source": [
+ "# import library\n",
+ "import numpy as np"
+ ]
},
{
+ "attachments": {},
"cell_type": "markdown",
- "source": [
- "Let's first define the original function for $y$ and the loss term:"
- ],
"metadata": {
"id": "32-ufWhc3v2c"
- }
+ },
+ "source": [
+ "Let's first define the original function for $y$ and the loss term:"
+ ]
},
{
"cell_type": "code",
@@ -112,121 +100,129 @@
]
},
{
+ "attachments": {},
"cell_type": "markdown",
- "source": [
- "Now we'll choose some values for the betas and the omegas and x and compute the output of the function:"
- ],
"metadata": {
"id": "y7tf0ZMt5OXt"
- }
+ },
+ "source": [
+ "Now we'll choose some values for the betas and the omegas and x and compute the output of the function:"
+ ]
},
{
"cell_type": "code",
- "source": [
- "beta0 = 1.0; beta1 = 2.0; beta2 = -3.0; beta3 = 0.4\n",
- "omega0 = 0.1; omega1 = -0.4; omega2 = 2.0; omega3 = 3.0\n",
- "x = 2.3; y =2.0\n",
- "l_i_func = loss(x,y,beta0,beta1,beta2,beta3,omega0,omega1,omega2,omega3)\n",
- "print('l_i=%3.3f'%l_i_func)"
- ],
+ "execution_count": null,
"metadata": {
- "id": "pwvOcCxr41X_",
"colab": {
"base_uri": "https://localhost:8080/"
},
+ "id": "pwvOcCxr41X_",
"outputId": "9541922c-dfc4-4b2e-dfa3-3298812155ce"
},
- "execution_count": null,
"outputs": [
{
- "output_type": "stream",
"name": "stdout",
+ "output_type": "stream",
"text": [
"l_i=0.139\n"
]
}
+ ],
+ "source": [
+ "beta0 = 1.0; beta1 = 2.0; beta2 = -3.0; beta3 = 0.4\n",
+ "omega0 = 0.1; omega1 = -0.4; omega2 = 2.0; omega3 = 3.0\n",
+ "x = 2.3; y =2.0\n",
+ "l_i_func = loss(x,y,beta0,beta1,beta2,beta3,omega0,omega1,omega2,omega3)\n",
+ "print('l_i=%3.3f'%l_i_func)"
]
},
{
+ "attachments": {},
"cell_type": "markdown",
+ "metadata": {
+ "id": "u5w69NeT64yV"
+ },
"source": [
"# Computing derivatives by hand\n",
"\n",
"We could compute expressions for the derivatives by hand and write code to compute them directly but some have very complex expressions, even for this relatively simple original equation. For example:\n",
"\n",
- "\\begin{eqnarray*}\n",
+ "\\begin{align}\n",
"\\frac{\\partial \\ell_i}{\\partial \\omega_{0}} &=& -2 \\left( \\beta_3+\\omega_3\\cdot\\cos\\Bigl[\\beta_2+\\omega_2\\cdot\\exp\\bigl[\\beta_1+\\omega_1\\cdot\\sin[\\beta_0+\\omega_0\\cdot x_i]\\bigr]\\Bigr]-y_i\\right)\\nonumber \\\\\n",
"&&\\hspace{0.5cm}\\cdot \\omega_1\\omega_2\\omega_3\\cdot x_i\\cdot\\cos[\\beta_0+\\omega_0 \\cdot x_i]\\cdot\\exp\\Bigl[\\beta_1 + \\omega_1 \\cdot \\sin[\\beta_0+\\omega_0\\cdot x_i]\\Bigr]\\nonumber\\\\\n",
"&& \\hspace{1cm}\\cdot \\sin\\biggl[\\beta_2+\\omega_2\\cdot \\exp\\Bigl[\\beta_1 + \\omega_1 \\cdot \\sin[\\beta_0+\\omega_0\\cdot x_i]\\Bigr]\\biggr].\n",
- "\\end{eqnarray*}"
- ],
- "metadata": {
- "id": "u5w69NeT64yV"
- }
+ "\\end{align}"
+ ]
},
{
"cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "7t22hALp5zkq"
+ },
+ "outputs": [],
"source": [
"dldbeta3_func = 2 * (beta3 +omega3 * np.cos(beta2 + omega2 * np.exp(beta1+omega1 * np.sin(beta0+omega0 * x)))-y)\n",
"dldomega0_func = -2 *(beta3 +omega3 * np.cos(beta2 + omega2 * np.exp(beta1+omega1 * np.sin(beta0+omega0 * x)))-y) * \\\n",
" omega1 * omega2 * omega3 * x * np.cos(beta0 + omega0 * x) * np.exp(beta1 +omega1 * np.sin(beta0 + omega0 * x)) *\\\n",
" np.sin(beta2 + omega2 * np.exp(beta1+ omega1* np.sin(beta0+omega0 * x)))"
- ],
- "metadata": {
- "id": "7t22hALp5zkq"
- },
- "execution_count": null,
- "outputs": []
+ ]
},
{
+ "attachments": {},
"cell_type": "markdown",
- "source": [
- "Let's make sure this is correct using finite differences:"
- ],
"metadata": {
"id": "iRh4hnu3-H3n"
- }
+ },
+ "source": [
+ "Let's make sure this is correct using finite differences:"
+ ]
},
{
"cell_type": "code",
- "source": [
- "dldomega0_fd = (loss(x,y,beta0,beta1,beta2,beta3,omega0+0.00001,omega1,omega2,omega3)-loss(x,y,beta0,beta1,beta2,beta3,omega0,omega1,omega2,omega3))/0.00001\n",
- "\n",
- "print('dydomega0: Function value = %3.3f, Finite difference value = %3.3f'%(dldomega0_func,dldomega0_fd))"
- ],
+ "execution_count": null,
"metadata": {
- "id": "1O3XmXMx-HlZ",
"colab": {
"base_uri": "https://localhost:8080/"
},
+ "id": "1O3XmXMx-HlZ",
"outputId": "389ed78e-9d8d-4e8b-9e6b-5f20c21407e8"
},
- "execution_count": null,
"outputs": [
{
- "output_type": "stream",
"name": "stdout",
+ "output_type": "stream",
"text": [
"dydomega0: Function value = 5.246, Finite difference value = 5.246\n"
]
}
+ ],
+ "source": [
+ "dldomega0_fd = (loss(x,y,beta0,beta1,beta2,beta3,omega0+0.00001,omega1,omega2,omega3)-loss(x,y,beta0,beta1,beta2,beta3,omega0,omega1,omega2,omega3))/0.00001\n",
+ "\n",
+ "print('dydomega0: Function value = %3.3f, Finite difference value = %3.3f'%(dldomega0_func,dldomega0_fd))"
]
},
{
+ "attachments": {},
"cell_type": "markdown",
- "source": [
- "The code to calculate $\\partial l_i/ \\partial \\omega_0$ is a bit of a nightmare. It's easy to make mistakes, and you can see that some parts of it are repeated (for example, the $\\sin[\\bullet]$ term), which suggests some kind of redundancy in the calculations. The goal of this practical is to compute the derivatives in a much simpler way. There will be three steps:"
- ],
"metadata": {
"id": "wS4IPjZAKWTN"
- }
+ },
+ "source": [
+ "The code to calculate $\\partial l_i/ \\partial \\omega_0$ is a bit of a nightmare. It's easy to make mistakes, and you can see that some parts of it are repeated (for example, the $\\sin[\\bullet]$ term), which suggests some kind of redundancy in the calculations. The goal of this practical is to compute the derivatives in a much simpler way. There will be three steps:"
+ ]
},
{
+ "attachments": {},
"cell_type": "markdown",
+ "metadata": {
+ "id": "8UWhvDeNDudz"
+ },
"source": [
"**Step 1:** Write the original equations as a series of intermediate calculations.\n",
"\n",
- "\\begin{eqnarray}\n",
+ "\\begin{align}\n",
"f_{0} &=& \\beta_{0} + \\omega_{0} x_i\\nonumber\\\\\n",
"h_{1} &=& \\sin[f_{0}]\\nonumber\\\\\n",
"f_{1} &=& \\beta_{1} + \\omega_{1}h_{1}\\nonumber\\\\\n",
@@ -235,16 +231,18 @@
"h_{3} &=& \\cos[f_{2}]\\nonumber\\\\\n",
"f_{3} &=& \\beta_{3} + \\omega_{3}h_{3}\\nonumber\\\\\n",
"l_i &=& (f_3-y_i)^2\n",
- "\\end{eqnarray}\n",
+ "\\end{align}\n",
"\n",
"and compute and store the values of all of these intermediate values. We'll need them to compute the derivatives.
This is called the **forward pass**."
- ],
- "metadata": {
- "id": "8UWhvDeNDudz"
- }
+ ]
},
{
"cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "ZWKAq6HC90qV"
+ },
+ "outputs": [],
"source": [
"# TODO compute all the f_k and h_k terms\n",
"# Replace the code below\n",
@@ -257,38 +255,22 @@
"h3 = 0\n",
"f3 = 0\n",
"l_i = 0\n"
- ],
- "metadata": {
- "id": "ZWKAq6HC90qV"
- },
- "execution_count": null,
- "outputs": []
+ ]
},
{
"cell_type": "code",
- "source": [
- "# Let's check we got that right:\n",
- "print(\"f0: true value = %3.3f, your value = %3.3f\"%(1.230, f0))\n",
- "print(\"h1: true value = %3.3f, your value = %3.3f\"%(0.942, h1))\n",
- "print(\"f1: true value = %3.3f, your value = %3.3f\"%(1.623, f1))\n",
- "print(\"h2: true value = %3.3f, your value = %3.3f\"%(5.068, h2))\n",
- "print(\"f2: true value = %3.3f, your value = %3.3f\"%(7.137, f2))\n",
- "print(\"h3: true value = %3.3f, your value = %3.3f\"%(0.657, h3))\n",
- "print(\"f3: true value = %3.3f, your value = %3.3f\"%(2.372, f3))\n",
- "print(\"like original = %3.3f, like from forward pass = %3.3f\"%(l_i_func, l_i))\n"
- ],
+ "execution_count": null,
"metadata": {
- "id": "ibxXw7TUW4Sx",
"colab": {
"base_uri": "https://localhost:8080/"
},
+ "id": "ibxXw7TUW4Sx",
"outputId": "4575e3eb-2b16-4e0b-c84e-9c22b443c3ce"
},
- "execution_count": null,
"outputs": [
{
- "output_type": "stream",
"name": "stdout",
+ "output_type": "stream",
"text": [
"f0: true value = 1.230, your value = 0.000\n",
"h1: true value = 0.942, your value = 0.000\n",
@@ -300,17 +282,32 @@
"like original = 0.139, like from forward pass = 0.000\n"
]
}
+ ],
+ "source": [
+ "# Let's check we got that right:\n",
+ "print(\"f0: true value = %3.3f, your value = %3.3f\"%(1.230, f0))\n",
+ "print(\"h1: true value = %3.3f, your value = %3.3f\"%(0.942, h1))\n",
+ "print(\"f1: true value = %3.3f, your value = %3.3f\"%(1.623, f1))\n",
+ "print(\"h2: true value = %3.3f, your value = %3.3f\"%(5.068, h2))\n",
+ "print(\"f2: true value = %3.3f, your value = %3.3f\"%(7.137, f2))\n",
+ "print(\"h3: true value = %3.3f, your value = %3.3f\"%(0.657, h3))\n",
+ "print(\"f3: true value = %3.3f, your value = %3.3f\"%(2.372, f3))\n",
+ "print(\"like original = %3.3f, like from forward pass = %3.3f\"%(l_i_func, l_i))\n"
]
},
{
+ "attachments": {},
"cell_type": "markdown",
+ "metadata": {
+ "id": "jay8NYWdFHuZ"
+ },
"source": [
"**Step 2:** Compute the derivatives of $\\ell_i$ with respect to the intermediate quantities that we just calculated, but in reverse order:\n",
"\n",
- "\\begin{eqnarray}\n",
+ "\\begin{align}\n",
"\\quad \\frac{\\partial \\ell_i}{\\partial f_3}, \\quad \\frac{\\partial \\ell_i}{\\partial h_3}, \\quad \\frac{\\partial \\ell_i}{\\partial f_2}, \\quad\n",
- "\\frac{\\partial \\ell_i}{\\partial h_2}, \\quad \\frac{\\partial \\ell_i}{\\partial f_1}, \\quad \\frac{\\partial \\ell_i}{\\partial h_1}, \\quad\\mbox{and} \\quad \\frac{\\partial \\ell_i}{\\partial f_0}.\n",
- "\\end{eqnarray}\n",
+ "\\frac{\\partial \\ell_i}{\\partial h_2}, \\quad \\frac{\\partial \\ell_i}{\\partial f_1}, \\quad \\frac{\\partial \\ell_i}{\\partial h_1}, \\quad\\text{and} \\quad \\frac{\\partial \\ell_i}{\\partial f_0}.\n",
+ "\\end{align}\n",
"\n",
"The first of these derivatives is straightforward:\n",
"\n",
@@ -328,7 +325,7 @@
"\n",
"We can continue in this way, computing the derivatives of the output with respect to these intermediate quantities:\n",
"\n",
- "\\begin{eqnarray}\n",
+ "\\begin{align}\n",
"\\frac{\\partial \\ell_i}{\\partial f_{2}} &=& \\frac{\\partial h_{3}}{\\partial f_{2}}\\left(\n",
"\\frac{\\partial f_{3}}{\\partial h_{3}}\\frac{\\partial \\ell_i}{\\partial f_{3}} \\right)\n",
"\\nonumber \\\\\n",
@@ -336,16 +333,18 @@
"\\frac{\\partial \\ell_i}{\\partial f_{1}} &=& \\frac{\\partial h_{2}}{\\partial f_{1}}\\left( \\frac{\\partial f_{2}}{\\partial h_{2}}\\frac{\\partial h_{3}}{\\partial f_{2}}\\frac{\\partial f_{3}}{\\partial h_{3}}\\frac{\\partial \\ell_i}{\\partial f_{3}} \\right)\\nonumber \\\\\n",
"\\frac{\\partial \\ell_i}{\\partial h_{1}} &=& \\frac{\\partial f_{1}}{\\partial h_{1}}\\left(\\frac{\\partial h_{2}}{\\partial f_{1}} \\frac{\\partial f_{2}}{\\partial h_{2}}\\frac{\\partial h_{3}}{\\partial f_{2}}\\frac{\\partial f_{3}}{\\partial h_{3}}\\frac{\\partial \\ell_i}{\\partial f_{3}} \\right)\\nonumber \\\\\n",
"\\frac{\\partial \\ell_i}{\\partial f_{0}} &=& \\frac{\\partial h_{1}}{\\partial f_{0}}\\left(\\frac{\\partial f_{1}}{\\partial h_{1}}\\frac{\\partial h_{2}}{\\partial f_{1}} \\frac{\\partial f_{2}}{\\partial h_{2}}\\frac{\\partial h_{3}}{\\partial f_{2}}\\frac{\\partial f_{3}}{\\partial h_{3}}\\frac{\\partial \\ell_i}{\\partial f_{3}} \\right).\n",
- "\\end{eqnarray}\n",
+ "\\end{align}\n",
"\n",
"In each case, we have already computed all of the terms except the last one in the previous step, and the last term is simple to evaluate. This is called the **backward pass**."
- ],
- "metadata": {
- "id": "jay8NYWdFHuZ"
- }
+ ]
},
{
"cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "gCQJeI--Egdl"
+ },
+ "outputs": [],
"source": [
"# TODO -- Compute the derivatives of the output with respect\n",
"# to the intermediate computations h_k and f_k (i.e, run the backward pass)\n",
@@ -358,37 +357,22 @@
"dldf1 = 1\n",
"dldh1 = 1\n",
"dldf0 = 1\n"
- ],
- "metadata": {
- "id": "gCQJeI--Egdl"
- },
- "execution_count": null,
- "outputs": []
+ ]
},
{
"cell_type": "code",
- "source": [
- "# Let's check we got that right\n",
- "print(\"dldf3: true value = %3.3f, your value = %3.3f\"%(0.745, dldf3))\n",
- "print(\"dldh3: true value = %3.3f, your value = %3.3f\"%(2.234, dldh3))\n",
- "print(\"dldf2: true value = %3.3f, your value = %3.3f\"%(-1.683, dldf2))\n",
- "print(\"dldh2: true value = %3.3f, your value = %3.3f\"%(-3.366, dldh2))\n",
- "print(\"dldf1: true value = %3.3f, your value = %3.3f\"%(-17.060, dldf1))\n",
- "print(\"dldh1: true value = %3.3f, your value = %3.3f\"%(6.824, dldh1))\n",
- "print(\"dldf0: true value = %3.3f, your value = %3.3f\"%(2.281, dldf0))"
- ],
+ "execution_count": null,
"metadata": {
- "id": "dS1OrLtlaFr7",
"colab": {
"base_uri": "https://localhost:8080/"
},
+ "id": "dS1OrLtlaFr7",
"outputId": "414f0862-ae36-4a0e-b68f-4758835b0e23"
},
- "execution_count": null,
"outputs": [
{
- "output_type": "stream",
"name": "stdout",
+ "output_type": "stream",
"text": [
"dldf3: true value = 0.745, your value = -4.000\n",
"dldh3: true value = 2.234, your value = -12.000\n",
@@ -399,33 +383,25 @@
"dldf0: true value = 2.281, your value = 1.000\n"
]
}
- ]
- },
- {
- "cell_type": "markdown",
- "source": [
- "**Step 3:** Finally, we consider how the loss~$\\ell_{i}$ changes when we change the parameters $\\beta_{\\bullet}$ and $\\omega_{\\bullet}$. Once more, we apply the chain rule:\n",
- "\n",
- "\n",
- "\n",
- "\n",
- "\\begin{eqnarray}\n",
- "\\frac{\\partial \\ell_i}{\\partial \\beta_{k}} &=& \\frac{\\partial f_{k}}{\\partial \\beta_{k}}\\frac{\\partial \\ell_i}{\\partial f_{k}}\\nonumber \\\\\n",
- "\\frac{\\partial \\ell_i}{\\partial \\omega_{k}} &=& \\frac{\\partial f_{k}}{\\partial \\omega_{k}}\\frac{\\partial \\ell_i}{\\partial f_{k}}.\n",
- "\\end{eqnarray}\n",
- "\n",
- "In each case, the second term on the right-hand side was computed in step 2. When $k>0$, we have~$f_{k}=\\beta_{k}+\\omega_k \\cdot h_{k}$, so:\n",
- "\n",
- "\\begin{eqnarray}\n",
- "\\frac{\\partial f_{k}}{\\partial \\beta_{k}} = 1 \\quad\\quad\\mbox{and}\\quad \\quad \\frac{\\partial f_{k}}{\\partial \\omega_{k}} &=& h_{k}.\n",
- "\\end{eqnarray}"
],
- "metadata": {
- "id": "FlzlThQPGpkU"
- }
+ "source": [
+ "# Let's check we got that right\n",
+ "print(\"dldf3: true value = %3.3f, your value = %3.3f\"%(0.745, dldf3))\n",
+ "print(\"dldh3: true value = %3.3f, your value = %3.3f\"%(2.234, dldh3))\n",
+ "print(\"dldf2: true value = %3.3f, your value = %3.3f\"%(-1.683, dldf2))\n",
+ "print(\"dldh2: true value = %3.3f, your value = %3.3f\"%(-3.366, dldh2))\n",
+ "print(\"dldf1: true value = %3.3f, your value = %3.3f\"%(-17.060, dldf1))\n",
+ "print(\"dldh1: true value = %3.3f, your value = %3.3f\"%(6.824, dldh1))\n",
+ "print(\"dldf0: true value = %3.3f, your value = %3.3f\"%(2.281, dldf0))"
+ ]
},
{
"cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "1I2BhqZhGMK6"
+ },
+ "outputs": [],
"source": [
"# TODO -- Calculate the final derivatives with respect to the beta and omega terms\n",
"\n",
@@ -437,38 +413,22 @@
"dldomega1 = 1\n",
"dldbeta0 = 1\n",
"dldomega0 = 1\n"
- ],
- "metadata": {
- "id": "1I2BhqZhGMK6"
- },
- "execution_count": null,
- "outputs": []
+ ]
},
{
"cell_type": "code",
- "source": [
- "# Let's check we got them right\n",
- "print('dldbeta3: Your value = %3.3f, True value = %3.3f'%(dldbeta3, 0.745))\n",
- "print('dldomega3: Your value = %3.3f, True value = %3.3f'%(dldomega3, 0.489))\n",
- "print('dldbeta2: Your value = %3.3f, True value = %3.3f'%(dldbeta2, -1.683))\n",
- "print('dldomega2: Your value = %3.3f, True value = %3.3f'%(dldomega2, -8.530))\n",
- "print('dldbeta1: Your value = %3.3f, True value = %3.3f'%(dldbeta1, -17.060))\n",
- "print('dldomega1: Your value = %3.3f, True value = %3.3f'%(dldomega1, -16.079))\n",
- "print('dldbeta0: Your value = %3.3f, True value = %3.3f'%(dldbeta0, 2.281))\n",
- "print('dldomega0: Your value = %3.3f, Function value = %3.3f, Finite difference value = %3.3f'%(dldomega0, dldomega0_func, dldomega0_fd))"
- ],
+ "execution_count": null,
"metadata": {
- "id": "38eiOn2aHgHI",
"colab": {
"base_uri": "https://localhost:8080/"
},
+ "id": "38eiOn2aHgHI",
"outputId": "1a67a636-e832-471e-e771-54824363158a"
},
- "execution_count": null,
"outputs": [
{
- "output_type": "stream",
"name": "stdout",
+ "output_type": "stream",
"text": [
"dldbeta3: Your value = 1.000, True value = 0.745\n",
"dldomega3: Your value = 1.000, True value = 0.489\n",
@@ -480,16 +440,44 @@
"dldomega0: Your value = 1.000, Function value = 5.246, Finite difference value = 5.246\n"
]
}
+ ],
+ "source": [
+ "# Let's check we got them right\n",
+ "print('dldbeta3: Your value = %3.3f, True value = %3.3f'%(dldbeta3, 0.745))\n",
+ "print('dldomega3: Your value = %3.3f, True value = %3.3f'%(dldomega3, 0.489))\n",
+ "print('dldbeta2: Your value = %3.3f, True value = %3.3f'%(dldbeta2, -1.683))\n",
+ "print('dldomega2: Your value = %3.3f, True value = %3.3f'%(dldomega2, -8.530))\n",
+ "print('dldbeta1: Your value = %3.3f, True value = %3.3f'%(dldbeta1, -17.060))\n",
+ "print('dldomega1: Your value = %3.3f, True value = %3.3f'%(dldomega1, -16.079))\n",
+ "print('dldbeta0: Your value = %3.3f, True value = %3.3f'%(dldbeta0, 2.281))\n",
+ "print('dldomega0: Your value = %3.3f, Function value = %3.3f, Finite difference value = %3.3f'%(dldomega0, dldomega0_func, dldomega0_fd))"
]
},
{
+ "attachments": {},
"cell_type": "markdown",
- "source": [
- "Using this method, we can compute the derivatives quite easily without needing to compute very complicated expressions. In the next practical, we'll apply this same method to a deep neural network."
- ],
"metadata": {
"id": "N2ZhrR-2fNa1"
- }
+ },
+ "source": [
+ "Using this method, we can compute the derivatives quite easily without needing to compute very complicated expressions. In the next practical, we'll apply this same method to a deep neural network."
+ ]
}
- ]
-}
\ No newline at end of file
+ ],
+ "metadata": {
+ "colab": {
+ "authorship_tag": "ABX9TyN7JeDgslwtZcwRCOuGuPFt",
+ "include_colab_link": true,
+ "provenance": []
+ },
+ "kernelspec": {
+ "display_name": "Python 3",
+ "name": "python3"
+ },
+ "language_info": {
+ "name": "python"
+ }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 0
+}