Merge pull request GokuMohandas#167 from Calinou/fix-typos-notebooks
Fix various typos in notebooks
GokuMohandas authored Feb 15, 2020
2 parents 9d6b353 + cc411a8 commit 6f244df
Showing 12 changed files with 31 additions and 31 deletions.
2 changes: 1 addition & 1 deletion notebooks/basic_ml/01_Python.ipynb
@@ -941,7 +941,7 @@
}
},
"source": [
"# If statment with a boolean\n",
"# If statement with a boolean\n",
"x = True\n",
"if x:\n",
" print (\"it worked\")"
2 changes: 1 addition & 1 deletion notebooks/basic_ml/02_NumPy.ipynb
@@ -556,7 +556,7 @@
"colab_type": "text"
},
"source": [
"One of the most common NumPy operations we’ll use in machine learning is matrix multiplication using the dot product. We take the rows of our first matrix (2) and the columns of our second matrix (2) to determine the dot product, giving us an output of `[2 X 2]`. The only requirement is that the inside dimensions match, in this case the frist matrix has 3 columns and the second matrix has 3 rows. \n",
"One of the most common NumPy operations we’ll use in machine learning is matrix multiplication using the dot product. We take the rows of our first matrix (2) and the columns of our second matrix (2) to determine the dot product, giving us an output of `[2 X 2]`. The only requirement is that the inside dimensions match, in this case the first matrix has 3 columns and the second matrix has 3 rows. \n",
"\n",
"<div align=\"left\">\n",
"<img src=\"https://raw.githubusercontent.com/practicalAI/images/master/basic_ml/02_Numpy/dot.gif\" width=\"450\">\n",
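
For context, the cell above describes matrix multiplication with the dot product and the inside-dimension rule; a minimal NumPy sketch (values are illustrative, not from the notebook):

import numpy as np

a = np.array([[1, 2, 3], [4, 5, 6]])       # shape (2, 3)
b = np.array([[7, 8], [9, 10], [11, 12]])  # shape (3, 2)
c = a.dot(b)                               # inside dims match: 3 columns, 3 rows
print(c.shape)                             # (2, 2)
print(c)                                   # each entry is a row of a dotted with a column of b
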
6 changes: 3 additions & 3 deletions notebooks/basic_ml/03_Pandas.ipynb
@@ -84,7 +84,7 @@
"colab": {}
},
"source": [
"# Set seed for reproducability\n",
"# Set seed for reproducibility\n",
"np.random.seed(seed=1234)"
],
"execution_count": 0,
@@ -186,7 +186,7 @@
"colab_type": "text"
},
"source": [
"Let's load the data from the CSV file into a Pandas dataframe. The `header=0` signfies that the first row (0th index) is a header row which contains the names of each column in our dataset."
"Let's load the data from the CSV file into a Pandas dataframe. The `header=0` signifies that the first row (0th index) is a header row which contains the names of each column in our dataset."
]
},
{
@@ -355,7 +355,7 @@
"colab_type": "text"
},
"source": [
"These are the diferent features: \n",
"These are the different features: \n",
"* **pclass**: class of travel\n",
"* **name**: full name of the passenger\n",
"* **sex**: gender\n",
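
For context, the cells above cover header=0 in the CSV load and the Titanic columns; a minimal sketch of that loading step, assuming pandas and an illustrative file path:

import pandas as pd

# header=0 marks the first (0th-index) row as the column names
df = pd.read_csv("titanic.csv", header=0)   # path is illustrative
print(df.columns.tolist())                  # e.g. ['pclass', 'name', 'sex', ...]
print(df.head(3))
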
8 changes: 4 additions & 4 deletions notebooks/basic_ml/04_Linear_Regression.ipynb
@@ -115,7 +115,7 @@
"colab_type": "text"
},
"source": [
"1. Randomly initialize the model's weights $W$ (we'll cover more effective initalization strategies in future lessons).\n",
"1. Randomly initialize the model's weights $W$ (we'll cover more effective initialization strategies in future lessons).\n",
"2. Feed inputs $X$ into the model to receive the predictions $\\hat{y}$.\n",
" * $\\hat{y} = XW + b$\n",
"3. Compare the predictions $\\hat{y}$ with the actual target values $y$ using the objective (cost) function to determine the loss $J$. A common objective function for linear regression is mean squarred error (MSE). This function calculates the difference between the predicted and target values and squares it.\n",
@@ -216,7 +216,7 @@
"colab": {}
},
"source": [
"# Set seed for reproducability\n",
"# Set seed for reproducibility\n",
"np.random.seed(SEED)\n",
"tf.random.set_seed(SEED)"
],
@@ -696,7 +696,7 @@
"source": [
"# From scratch\n",
"\n",
"Before we use TensorFlow 2.0 + Keras we will implent linear regression from scratch using NumPy so we can:\n",
"Before we use TensorFlow 2.0 + Keras we will implement linear regression from scratch using NumPy so we can:\n",
"1. Absorb the fundamental concepts by implementing from scratch\n",
"2. Appreciate the level of abstraction TensorFlow provides\n",
"\n",
@@ -1832,7 +1832,7 @@
"colab_type": "text"
},
"source": [
"Linear regression offers the great advantage of being highly interpretable. Each feature has a coefficient which signifies it's importance/impact on the output variable y. We can interpret our coefficient as follows: by increasing X by 1 unit, we increase y by $W$ (~3.65) units. \n",
"Linear regression offers the great advantage of being highly interpretable. Each feature has a coefficient which signifies its importance/impact on the output variable y. We can interpret our coefficient as follows: by increasing X by 1 unit, we increase y by $W$ (~3.65) units. \n",
"\n",
"**Note**: Since we standardized our inputs and outputs for gradient descent, we need to apply an operation to our coefficients and intercept to interpret them. See proof in the `From scratch` section above."
]
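
For context, the cells above outline the linear-regression training steps (random init, y_hat = XW + b, MSE) and note the ~3.65 coefficient; a from-scratch NumPy sketch of that loop (toy data, learning rate, and epoch count are assumptions, not the notebook's values):

import numpy as np

np.random.seed(1234)
X = np.random.rand(50, 1)                     # inputs
y = 3.65 * X + 0.1 * np.random.randn(50, 1)   # targets around a true slope of ~3.65

W = np.random.randn(1, 1)                     # 1. randomly initialize weights
b = np.zeros((1, 1))
lr = 0.1

for _ in range(1000):
    y_hat = X.dot(W) + b                      # 2. predictions: y_hat = XW + b
    loss = np.mean((y_hat - y) ** 2)          # 3. objective: mean squared error
    dW = 2 * X.T.dot(y_hat - y) / len(X)      # 4. gradients of MSE
    db = 2 * np.mean(y_hat - y)
    W -= lr * dW                              # 5. gradient-descent update
    b -= lr * db

print(round(W.item(), 2))                     # close to the true slope of ~3.65
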
6 changes: 3 additions & 3 deletions notebooks/basic_ml/05_Logistic_Regression.ipynb
@@ -116,7 +116,7 @@
"colab_type": "text"
},
"source": [
"1. Randomly initialize the model's weights $W$ (we'll cover more effective initalization strategies in future lessons).\n",
"1. Randomly initialize the model's weights $W$ (we'll cover more effective initialization strategies in future lessons).\n",
"2. Feed inputs $X$ into the model to receive the logits ($z=XW$). Apply the softmax operation on the logits to get the class probabilies $\\hat{y}$ in one-hot encoded form. For example, if there are three classes, the predicted class probabilities could look like [0.3, 0.3, 0.4]. \n",
" * $ \\hat{y} = softmax(z) = softmax(XW) = \\frac{e^{XW_y}}{\\sum_j e^{XW}} $\n",
"3. Compare the predictions $\\hat{y}$ (ex. [0.3, 0.3, 0.4]]) with the actual target values $y$ (ex. class 2 would look like [0, 0, 1]) with the objective (cost) function to determine loss $J$. A common objective function for logistics regression is cross-entropy loss. \n",
@@ -217,7 +217,7 @@
"colab": {}
},
"source": [
"# Set seed for reproducability\n",
"# Set seed for reproducibility\n",
"np.random.seed(SEED)\n",
"tf.random.set_seed(SEED)"
],
@@ -799,7 +799,7 @@
"source": [
"# From scratch\n",
"\n",
"Before we use TensorFlow 2.0 + Keras we will implent logistic regression from scratch using NumPy so we can:\n",
"Before we use TensorFlow 2.0 + Keras we will implement logistic regression from scratch using NumPy so we can:\n",
"1. Absorb the fundamental concepts by implementing from scratch\n",
"2. Appreciate the level of abstraction TensorFlow provides\n",
"\n",
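
For context, the cells above describe the softmax over logits and cross-entropy against a one-hot target; a minimal NumPy sketch (logit values chosen to reproduce the [0.3, 0.3, 0.4] example):

import numpy as np

def softmax(z):
    # subtract the max for numerical stability, then normalize to probabilities
    exp = np.exp(z - np.max(z, axis=-1, keepdims=True))
    return exp / np.sum(exp, axis=-1, keepdims=True)

logits = np.array([[0.1, 0.1, 0.4]])    # z = XW for one sample, three classes
probs = softmax(logits)
print(probs.round(2))                   # ~[[0.3 0.3 0.4]]

y = np.array([[0, 0, 1]])               # one-hot target for class 2
loss = -np.sum(y * np.log(probs))       # cross-entropy loss
print(round(loss, 3))                   # ~0.909
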
8 changes: 4 additions & 4 deletions notebooks/basic_ml/06_Multilayer_Perceptrons.ipynb
@@ -137,7 +137,7 @@
"colab_type": "text"
},
"source": [
"1. Randomly initialize the model's weights $W$ (we'll cover more effective initalization strategies later in this lesson).\n",
"1. Randomly initialize the model's weights $W$ (we'll cover more effective initialization strategies later in this lesson).\n",
"2. Feed inputs $X$ into the model to do the forward pass and receive the probabilities.\n",
" * $z_1 = XW_1$\n",
" * $a_1 = f(z_1)$\n",
@@ -242,7 +242,7 @@
"colab": {}
},
"source": [
"# Set seed for reproducability\n",
"# Set seed for reproducibility\n",
"np.random.seed(SEED)\n",
"tf.random.set_seed(SEED)"
],
@@ -1321,7 +1321,7 @@
"colab_type": "text"
},
"source": [
"The ReLU activation function ($max(0,z)$) is by far the most widely used activation function for neural networks. But as you can see, each activation function has it's own contraints so there are circumstances where you'll want to use different ones. For example, if we need to constrain our outputs between 0 and 1, then the sigmoid activation is the best choice."
"The ReLU activation function ($max(0,z)$) is by far the most widely used activation function for neural networks. But as you can see, each activation function has its own constraints so there are circumstances where you'll want to use different ones. For example, if we need to constrain our outputs between 0 and 1, then the sigmoid activation is the best choice."
]
},
{
@@ -1417,7 +1417,7 @@
"colab_type": "text"
},
"source": [
"1. Randomly initialize the model's weights $W$ (we'll cover more effective initalization strategies later in this lesson)."
"1. Randomly initialize the model's weights $W$ (we'll cover more effective initialization strategies later in this lesson)."
]
},
{
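
For context, the cells above walk through the MLP forward pass (z1 = XW1, a1 = f(z1)) and compare activation functions; a minimal NumPy sketch with ReLU (layer sizes are illustrative):

import numpy as np

np.random.seed(1234)
X = np.random.randn(4, 2)     # 4 samples, 2 input features
W1 = np.random.randn(2, 10)   # hidden-layer weights
W2 = np.random.randn(10, 3)   # output-layer weights (3 classes)

z1 = X.dot(W1)                # z1 = XW1
a1 = np.maximum(0, z1)        # a1 = ReLU(z1) = max(0, z1)
z2 = a1.dot(W2)               # logits for the output layer
print(z2.shape)               # (4, 3)
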
4 changes: 2 additions & 2 deletions notebooks/basic_ml/07_Data_and_Models.ipynb
@@ -157,7 +157,7 @@
"colab": {}
},
"source": [
"# Set seed for reproducability\n",
"# Set seed for reproducibility\n",
"np.random.seed(SEED)\n",
"tf.random.set_seed(SEED)"
],
@@ -372,7 +372,7 @@
"colab_type": "text"
},
"source": [
"We want to choose features that have strong predictive signal for our task. If you want to improve performance, you need to continuously do feature engineering by collecting and adding new signals. So you may run into a new feature that has high correlation (orthogonal signal) with your existing features but it may still possess som unique signal to boost your predictive performance. "
"We want to choose features that have strong predictive signal for our task. If you want to improve performance, you need to continuously do feature engineering by collecting and adding new signals. So you may run into a new feature that has high correlation (orthogonal signal) with your existing features but it may still possess some unique signal to boost your predictive performance. "
]
},
{
4 changes: 2 additions & 2 deletions notebooks/basic_ml/08_Utilities.ipynb
@@ -163,7 +163,7 @@
"colab": {}
},
"source": [
"# Set seed for reproducability\n",
"# Set seed for reproducibility\n",
"np.random.seed(SEED)\n",
"tf.random.set_seed(SEED)"
],
@@ -1067,7 +1067,7 @@
"source": [
"<img height=\"45\" src=\"http://bestanimations.com/HomeOffice/Lights/Bulbs/animated-light-bulb-gif-29.gif\" align=\"left\" vspace=\"5px\" hspace=\"10px\">\n",
"\n",
"Callbacks are a great way to customize your training, evaluation and inference loops. View the full list of available callbacks in the [official documentation](https://www.tensorflow.org/versions/r2.0/api_docs/python/tf/keras/callbacks). We will also be implementing a custom callback in the next seciton."
"Callbacks are a great way to customize your training, evaluation and inference loops. View the full list of available callbacks in the [official documentation](https://www.tensorflow.org/versions/r2.0/api_docs/python/tf/keras/callbacks). We will also be implementing a custom callback in the next section."
]
},
{
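
For context, the cell above points to Keras callbacks and a custom callback in the next section; a minimal sketch of a custom callback, assuming TensorFlow 2.x (class name and usage are illustrative):

import tensorflow as tf

class LossLogger(tf.keras.callbacks.Callback):
    """Print the training loss at the end of every epoch."""
    def on_epoch_end(self, epoch, logs=None):
        logs = logs or {}
        print(f"epoch {epoch}: loss={logs.get('loss')}")

# Passed to fit() alongside built-in callbacks, e.g.:
# model.fit(X_train, y_train, epochs=5,
#           callbacks=[LossLogger(), tf.keras.callbacks.EarlyStopping(patience=2)])
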
4 changes: 2 additions & 2 deletions notebooks/basic_ml/09_Preprocessing.ipynb
@@ -156,7 +156,7 @@
"colab": {}
},
"source": [
"# Set seed for reproducability\n",
"# Set seed for reproducibility\n",
"np.random.seed(SEED)\n",
"tf.random.set_seed(SEED)"
],
@@ -335,7 +335,7 @@
" text = re.sub(r\"([?.!,¿])\", r\" \\1 \", text)\n",
" text = re.sub(r'[\" \"]+', \" \", text)\n",
"\n",
" # Remove whitepsaces\n",
" # Remove whitespaces\n",
" text = text.rstrip().strip()\n",
"\n",
" return text"
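
For context, the hunk above comes from a text-cleaning helper; a minimal standalone sketch of that kind of function (the re.sub lines mirror the cell, the wrapper and lowercasing are assumptions):

import re

def preprocess(text):
    text = text.lower().strip()
    text = re.sub(r"([?.!,¿])", r" \1 ", text)  # pad punctuation with spaces
    text = re.sub(r'[" "]+', " ", text)         # collapse repeated spaces/quotes
    text = text.rstrip().strip()                # remove surrounding whitespace
    return text

print(preprocess("Global warming is real!"))    # global warming is real !
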
8 changes: 4 additions & 4 deletions notebooks/basic_ml/10_Convolutional_Neural_Networks.ipynb
@@ -180,7 +180,7 @@
"colab": {}
},
"source": [
"# Set seed for reproducability\n",
"# Set seed for reproducibility\n",
"np.random.seed(SEED)\n",
"tf.random.set_seed(SEED)"
],
@@ -452,7 +452,7 @@
"colab_type": "text"
},
"source": [
"We're going to process our input text at the character level and then one-hot encode each character. Each character has a token id and the one hot representation for each character is an array of zeros except for the <token_id> index which is a 1. So in the example below, the letter e is the token index 2 and it's one hot encoded form is an array of zeros except for a 1 at index 2 ( `[0. 0. 1. 0. ... 0.]` )."
"We're going to process our input text at the character level and then one-hot encode each character. Each character has a token id and the one hot representation for each character is an array of zeros except for the <token_id> index which is a 1. So in the example below, the letter e is the token index 2 and its one hot encoded form is an array of zeros except for a 1 at index 2 ( `[0. 0. 1. 0. ... 0.]` )."
]
},
{
@@ -2115,7 +2115,7 @@
"source": [
"# Inputs\n",
"texts = [\"Roger Federer wins the Wimbledon tennis tournament once again.\",\n",
" \"Scientist warn global warming is a serious scientific phenomenom.\"]\n",
" \"Scientist warn global warming is a serious scientific phenomenon.\"]\n",
"num_samples = len(texts)\n",
"X_infer = np.array(X_tokenizer.texts_to_sequences(texts))\n",
"print (f\"X_infer[0] seq:\\n{X_infer[0]}\")\n",
@@ -2225,7 +2225,7 @@
" }\n",
" },\n",
" {\n",
" \"raw_input\": \"Scientist warn global warming is a serious scientific phenomenom.\",\n",
" \"raw_input\": \"Scientist warn global warming is a serious scientific phenomenon.\",\n",
" \"preprocessed_input\": \"s c i e n t i s t w a r n g l o b a l w a r m i n g i s a s e r i o u s s c i e n t i f i c p h e n o m e n o m .\",\n",
" \"probabilities\": {\n",
" \"Sci/Tech\": 0.7040546536445618,\n",
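
For context, the cell above describes character-level one-hot encoding where the letter e has token id 2; a minimal NumPy sketch of that lookup (vocabulary is illustrative):

import numpy as np

char_to_id = {"a": 0, "b": 1, "e": 2, "f": 3}   # toy character-to-token-id map
token_id = char_to_id["e"]                      # 2

one_hot = np.zeros(len(char_to_id))
one_hot[token_id] = 1.0
print(one_hot)                                  # [0. 0. 1. 0.]
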
6 changes: 3 additions & 3 deletions notebooks/basic_ml/11_Embeddings.ipynb
@@ -194,7 +194,7 @@
"colab": {}
},
"source": [
"# Set seed for reproducability\n",
"# Set seed for reproducibility\n",
"np.random.seed(SEED)\n",
"tf.random.set_seed(SEED)"
],
@@ -220,7 +220,7 @@
"source": [
"The main idea of embeddings is to have fixed length representations for the tokens in a text regardless of the number of tokens in the vocabulary. So instead of each token representation having the shape [1 X V] where V is vocab size, each token now has the shape [1 X D] where D is the embedding size (usually 50, 100, 200, 300). The numbers in the representation will no longer be 0s and 1s but rather floats that represent that token in a D-dimensional latent space. If the embeddings really did capture the relationship between tokens, then we should be able to inspect this latent space and confirm known relationships (we'll do this soon).\n",
"\n",
"But how do we learn the embeddings the first place? The intuition behind embeddings is that the definition of a token depends on the token itself but on it's context. There are several different ways of doing this:\n",
"But how do we learn the embeddings the first place? The intuition behind embeddings is that the definition of a token depends on the token itself but on its context. There are several different ways of doing this:\n",
"\n",
"1. Given the word in the context, predict the target word (CBOW - continuous bag of words).\n",
"2. Given the target word, predict the context word (skip-gram).\n",
@@ -544,7 +544,7 @@
"colab_type": "text"
},
"source": [
"What happen's when a word doesn't exist in our vocabulary? We could assign an UNK token which is used for all OOV (out of vocabulary) words or we could use [FastText](https://radimrehurek.com/gensim/models/fasttext.html), which uses character-level n-grams to embed a word. This helps embed rare words, mispelled words, and also words that don't exist in our corpus but are similar to words in our corpus."
"What happen's when a word doesn't exist in our vocabulary? We could assign an UNK token which is used for all OOV (out of vocabulary) words or we could use [FastText](https://radimrehurek.com/gensim/models/fasttext.html), which uses character-level n-grams to embed a word. This helps embed rare words, misspelled words, and also words that don't exist in our corpus but are similar to words in our corpus."
]
},
{
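
For context, the cells above describe mapping each token to a fixed-length [1 X D] vector; a minimal sketch with a Keras Embedding layer, assuming TensorFlow 2.x (vocabulary and embedding sizes are illustrative):

import tensorflow as tf

vocab_size, embedding_dim = 5000, 100    # V and D from the text
embedding = tf.keras.layers.Embedding(input_dim=vocab_size, output_dim=embedding_dim)

token_ids = tf.constant([[12, 7, 431, 0]])   # one padded sequence of token ids
vectors = embedding(token_ids)
print(vectors.shape)                         # (1, 4, 100): each token becomes a 100-dim float vector
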
4 changes: 2 additions & 2 deletions notebooks/basic_ml/12_Recurrent_Neural_Networks.ipynb
@@ -197,7 +197,7 @@
"colab": {}
},
"source": [
"# Set seed for reproducability\n",
"# Set seed for reproducibility\n",
"np.random.seed(SEED)\n",
"tf.random.set_seed(SEED)"
],
@@ -2005,7 +2005,7 @@
"colab_type": "text"
},
"source": [
"While our simple RNNs so far are great for sequentially processing our inputs, they have quite a few disadvantages. They commonly suffer from exploding or vanishing gradients as a result using the same set of weights ($W_{xh}$ and $W_{hh}$) with each timestep's input. During backpropagation, this can cause gradients to explode (>1) or vanish (<1). If you multiply any number greater than 1 with itself over and over, it moves towards infinity (exploding gradients) and similarily, If you multiply any number less than 1 with itself over and over, it moves towards zero (vanishing gradients). To mitigate this issue, gated RNNs were devised to selectively retrain information. If you're interested in learning more of the specifics, this [post](http://colah.github.io/posts/2015-08-Understanding-LSTMs/) is a must-read.\n",
"While our simple RNNs so far are great for sequentially processing our inputs, they have quite a few disadvantages. They commonly suffer from exploding or vanishing gradients as a result using the same set of weights ($W_{xh}$ and $W_{hh}$) with each timestep's input. During backpropagation, this can cause gradients to explode (>1) or vanish (<1). If you multiply any number greater than 1 with itself over and over, it moves towards infinity (exploding gradients) and similarly, If you multiply any number less than 1 with itself over and over, it moves towards zero (vanishing gradients). To mitigate this issue, gated RNNs were devised to selectively retrain information. If you're interested in learning more of the specifics, this [post](http://colah.github.io/posts/2015-08-Understanding-LSTMs/) is a must-read.\n",
"\n",
"There are two popular types of gated RNNs: Long Short-term Memory (LSTMs) units and Gated Recurrent Units (GRUs).\n",
"\n",
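
For context, the cell above explains exploding and vanishing gradients via repeated multiplication; a tiny numeric illustration of that claim:

# Repeatedly multiplying by a factor > 1 blows up; by a factor < 1 it shrinks toward zero.
grow, shrink = 1.1, 0.9
print(grow ** 100)      # ~13780.6  -> "exploding"
print(shrink ** 100)    # ~2.66e-05 -> "vanishing"
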
