
Commit

Merge pull request mbadry1#167 from dotslash21/master
 Fixed minor formatting issues
mbadry1 authored Aug 3, 2019
2 parents cdcadc0 + e02a3cb commit fe6dafe
Showing 2 changed files with 6 additions and 6 deletions.
8 changes: 4 additions & 4 deletions 1- Neural Networks and Deep Learning/Readme.md
@@ -227,13 +227,13 @@
- Let's say we have these variables:

```
X1 First input feature.
X2 Second input feature.
W1 Weight of the first feature.
W2 Weight of the second feature.
B Logistic Regression parameter (the bias).
M Number of training examples.
Y(i) Expected output of example i.
```

- So we have:
@@ -246,7 +246,7 @@
```
d(z) = d(l)/d(z) = a - y
d(W1) = X1 * d(z)
d(W2) = X2 * d(z)
d(B) = d(z)
```

- From the above we can conclude the logistic regression pseudocode; one possible NumPy rendering of a single gradient step is sketched below.
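The sketch assumes vectorized inputs `X` (features by examples), labels `Y`, a weight column `w` (W1, W2 stacked), a scalar bias `b`, and a learning rate `alpha`; it is an illustration under those assumptions, not the course's exact code.

```
import numpy as np

def sigmoid(z):
    # a = g(z) = 1 / (1 + np.exp(-z))
    return 1.0 / (1.0 + np.exp(-z))

def gradient_step(X, Y, w, b, alpha=0.01):
    # X: (n_features, m) inputs, Y: (1, m) expected outputs,
    # w: (n_features, 1) weights, b: scalar bias (B above)
    m = X.shape[1]
    A = sigmoid(np.dot(w.T, X) + b)   # a = g(w.T x + b), shape (1, m)
    dz = A - Y                        # d(z) = a - y
    dw = np.dot(X, dz.T) / m          # d(W1), d(W2): x * d(z), averaged over m examples
    db = np.sum(dz) / m               # d(B) = d(z), averaged over m examples
    return w - alpha * dw, b - alpha * db
```

Looping this step for a fixed number of iterations gives the training procedure the notes go on to describe.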
@@ -472,7 +472,7 @@
- Derivation of the Sigmoid activation function's derivative (a numerical sanity check is sketched below):

```
g(z) = 1 / (1 + np.exp(-z))
g'(z) = (1 / (1 + np.exp(-z))) * (1 - (1 / (1 + np.exp(-z))))
g'(z) = g(z) * (1 - g(z))
```
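As a quick sanity check of the identity `g'(z) = g(z) * (1 - g(z))`, the sketch below compares it with a central finite-difference approximation; the test points and the step size `eps` are arbitrary choices for illustration.

```
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def sigmoid_derivative(z):
    g = sigmoid(z)        # analytical form from the derivation above
    return g * (1 - g)

z = np.array([-2.0, -0.5, 0.0, 0.5, 2.0])
eps = 1e-6
numeric = (sigmoid(z + eps) - sigmoid(z - eps)) / (2 * eps)
print(np.allclose(sigmoid_derivative(z), numeric))   # expected: True
```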
4 changes: 2 additions & 2 deletions 2- Improving Deep Neural Networks/Readme.md
Original file line number Diff line number Diff line change
@@ -299,7 +299,7 @@
```
np.random.randn(shape) * np.sqrt(2 / n[l-1])
```
- The 1 or 2 in the numerator can also be a hyperparameter to tune (but not the first one to start with).
- This (ReLU plus weight initialization with this variance) is one of the best partial solutions to vanishing / exploding gradients; it helps gradients avoid vanishing or exploding too quickly.
- The initialization in this video is called "He Initialization / Xavier Initialization" and was published in a 2015 paper; a minimal NumPy sketch of it follows below.
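The sketch below applies this style of initialization to a whole network; the helper name `initialize_parameters_he` and the layer sizes are assumptions for the example, and replacing the factor 2 with 1 gives the Xavier-style variant mentioned above.

```
import numpy as np

def initialize_parameters_he(layer_dims):
    # layer_dims is a list of layer sizes, e.g. [n_x, n_h1, n_h2, n_y]
    parameters = {}
    for l in range(1, len(layer_dims)):
        # zero-mean Gaussian weights scaled so Var(W[l]) = 2 / n[l-1]
        parameters["W" + str(l)] = (np.random.randn(layer_dims[l], layer_dims[l - 1])
                                    * np.sqrt(2.0 / layer_dims[l - 1]))
        parameters["b" + str(l)] = np.zeros((layer_dims[l], 1))
    return parameters

params = initialize_parameters_he([5, 4, 3, 1])
print(params["W1"].shape)   # (4, 5)
```

Scaling only the weights and zeroing the biases keeps the variance of each layer's activations roughly constant, which is what keeps gradients from vanishing or exploding too quickly.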
@@ -605,7 +605,7 @@
6. Learning rate decay.
7. Regularization lambda.
8. Activation functions.
9. Adam `beta1`, `beta2` & `epsilon`.
- It's hard to decide which hyperparameter is the most important; it depends a lot on your problem.
- One way to tune is to sample a grid of `N` hyperparameter settings and then try all of the combinations on your problem.
- Try random values instead: don't use a grid (a sampling sketch follows below).
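To make the random-search advice concrete, here is a small sketch that samples the learning rate on a log scale and the momentum `beta` through `1 - 10^r`; the ranges, the mini-batch choices, and the number of trials are assumptions for illustration.

```
import numpy as np

def sample_hyperparameters(num_trials=25):
    # Random sampling instead of a grid: every trial draws fresh values for each hyperparameter.
    settings = []
    for _ in range(num_trials):
        r = -4 * np.random.rand()             # r in (-4, 0]
        alpha = 10 ** r                       # learning rate on a log scale, roughly 1e-4 .. 1
        s = -3 + 2 * np.random.rand()         # s in [-3, -1)
        beta = 1 - 10 ** s                    # momentum beta, roughly 0.9 .. 0.999
        batch_size = int(np.random.choice([32, 64, 128, 256]))
        settings.append({"alpha": alpha, "beta": beta, "mini_batch_size": batch_size})
    return settings

for cfg in sample_hyperparameters(3):
    print(cfg)
```

Sampling on a log scale matters because drawing `alpha` uniformly between 0.0001 and 1 would spend almost all trials between 0.1 and 1.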
