
Commit

lab2
briandalessandro committed Sep 13, 2017
1 parent beeefe5 commit f8fbc13
Showing 1 changed file with 61 additions and 17 deletions.
78 changes: 61 additions & 17 deletions ipython/Labs_Student/Lab2_NumPy_Vectorization_Student.ipynb
@@ -2,7 +2,7 @@
"cells": [
{
"cell_type": "code",
"execution_count": 1,
"execution_count": null,
"metadata": {
"collapsed": false
},
@@ -24,7 +24,7 @@
},
{
"cell_type": "code",
"execution_count": 5,
"execution_count": null,
"metadata": {
"collapsed": false
},
@@ -49,7 +49,7 @@
},
{
"cell_type": "code",
"execution_count": 6,
"execution_count": null,
"metadata": {
"collapsed": false
},
@@ -60,7 +60,7 @@
},
{
"cell_type": "code",
"execution_count": 8,
"execution_count": null,
"metadata": {
"collapsed": false
},
@@ -86,7 +86,7 @@
},
{
"cell_type": "code",
"execution_count": 9,
"execution_count": null,
"metadata": {
"collapsed": false
},
@@ -110,7 +110,7 @@
},
{
"cell_type": "code",
"execution_count": 8,
"execution_count": null,
"metadata": {
"collapsed": false
},
@@ -124,7 +124,7 @@
},
{
"cell_type": "code",
"execution_count": 10,
"execution_count": null,
"metadata": {
"collapsed": false
},
@@ -144,7 +144,7 @@
},
{
"cell_type": "code",
"execution_count": 12,
"execution_count": null,
"metadata": {
"collapsed": false
},
@@ -158,19 +158,62 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"We should expect the distribution to be centered around zero. Is it?"
"We should expect the distribution to be centered around zero. Is it? As a fun technical aside, let's dive a little deeper into what this distribution should look like. The histogram shows the distribution of the average of 5 uniformly distributed random variables, computed over N different samples. Can we compare this to a theoretical distribution?<br>\n",
"\n",
"Yes we can! We sampled each $\\beta_i$ from a uniform distribution over the interval $[-1, 1]$. The variance of a uniformly distributed random variable is $(1/12) * (b - a)^2$, where $a$ and $b$ are the min and max of the support interval. The standard error (the standard deviation of the mean) of a sample of size $K$ with $Var(X) = \\sigma^2$ is $\\sigma / \\sqrt{K}$. <br>\n",
"\n",
"Given the above, we should expect our distribution of averages to be approximately normally distributed with mean = 0 and var = $(12 * 5)^{-1} * (1 - (-1))^2 = 0.06667$. Let's compare this normal distribution to our sample above."
]
},
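The variance arithmetic above can be sanity-checked by simulation. The following is only a sketch, not part of the lab: `K`, `N`, `rng`, and `sample_means` are illustrative names, and the sample count `N` is an assumption, since the notebook's own value is not shown in this diff.

```python
import numpy as np

rng = np.random.default_rng(0)  # seeded so the run is reproducible

K = 5          # number of uniform draws averaged per sample
N = 100_000    # number of samples (assumed; the lab's N may differ)
a, b = -1.0, 1.0

# Draw N samples of K uniform values each and average every sample.
sample_means = rng.uniform(a, b, size=(N, K)).mean(axis=1)

var_single = (b - a) ** 2 / 12.0   # variance of one Uniform(a, b) draw
var_mean = var_single / K          # variance of the mean of K independent draws

print("theoretical variance of the mean:", var_mean)
print("empirical variance of the mean:  ", sample_means.var())
```

The two printed values should agree to a couple of decimal places, which matches the `np.sqrt(4 / 60.0)` standard deviation used in the plotting cell.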
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"#Compute a vector from the normal distribution specified above\n",
"from scipy.stats import norm\n",
"mu = 0\n",
"sig = np.sqrt(4 / 60.0) \n",
"xs = np.linspace(-1, 1, 1000)\n",
"ys = norm.pdf(xs, mu, sig) \n",
"\n",
"plt.hist(means, density=True)\n",
"plt.plot(xs, ys)\n",
"plt.show()\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Now let's write our scoring function. Let's try to use as much of NumPy's internal optimization as possible (hint: this can be done in two lines and without writing any loops). The key is that NumPy functions that would normally take a scalar can also take an array; the function applies the operation element-wise and returns an array. For example:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"ex_array = np.array([-1, 1])\n",
"np.abs(ex_array)"
]
},
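To make the element-wise behavior concrete, here is a small sketch comparing an explicit Python loop with a single NumPy call. The array `values` and the choice of `np.exp` are illustrative assumptions and are not taken from the lab's scoring function.

```python
import math
import numpy as np

values = np.array([-2.0, 0.0, 0.5, 3.0])

# Loop version: apply the scalar function to one element at a time.
looped = [math.exp(v) for v in values]

# Vectorized version: one call, applied element-wise to the whole array.
vectorized = np.exp(values)

# Both approaches should produce the same numbers.
print(np.allclose(looped, vectorized))
```

The vectorized call returns an array of the same shape as its input, so it can be chained with ordinary arithmetic on arrays without writing a loop.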
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Now let's write our scoring function. Let's try to use as much of Numpy's inner optimization as possible (hint, this can be done in two lines and without writing any loops)."
"Let's use this feature to write a fast and clean scoring function."
]
},
{
"cell_type": "code",
"execution_count": 45,
"execution_count": null,
"metadata": {
"collapsed": true
},
@@ -201,7 +244,7 @@
},
{
"cell_type": "code",
"execution_count": 44,
"execution_count": null,
"metadata": {
"collapsed": false
},
@@ -219,14 +262,15 @@
" \n",
" xb = 0\n",
" for i, el in enumerate(row):\n",
" xb += el * beta[i]\n",
" #Student - compute X*Beta in the loop\n",
" \n",
" xbeta.append(xb)\n",
" \n",
" #Now let's apply the link function to each xbeta\n",
" prob_score = []\n",
" for xb in xbeta:\n",
" prob_score.append(1 / (1 + np.exp(-1 * xb)))\n",
" #student - compute p in the loop \n",
" prob_score.append(p)\n",
" \n",
" return prob_score"
]
@@ -240,7 +284,7 @@
},
{
"cell_type": "code",
"execution_count": 14,
"execution_count": null,
"metadata": {
"collapsed": false
},
@@ -258,7 +302,7 @@
},
{
"cell_type": "code",
"execution_count": 15,
"execution_count": null,
"metadata": {
"collapsed": false
},
@@ -269,7 +313,7 @@
},
{
"cell_type": "code",
"execution_count": 16,
"execution_count": null,
"metadata": {
"collapsed": false
},
