Added intuition behind exponentially weighted averages
Kaushal28 authored Jul 30, 2019
1 parent 62e6aa0 commit 91203d0
Showing 1 changed file with 3 additions and 0 deletions.
2- Improving Deep Neural Networks/Readme.md
@@ -460,6 +460,9 @@ Implications of L2-regularization on:
- `beta = 0.98` will average last 50 entries
- `beta = 0.5` will average last 2 entries
- The best `beta` value for our case is usually between 0.9 and 0.98
- Intuition: Exponentially weighted averages are useful for optimizing gradient descent because they weight the data points (`theta`) differently depending on the value of `beta`. If `beta` is high (around 0.9), the average smooths out noisy data points (oscillations, in gradient-descent terms). This reduces the oscillations of gradient descent and so gives a faster, smoother path towards the minima (see the sketch after this list).
- Another visual example:
![](Images/Nasdaq1_small.png)
_(taken from [investopedia.com](https://www.investopedia.com/))_
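- A minimal sketch of the idea in Python (the `ewma` helper and the noisy toy signal are illustrative assumptions, not part of the course notes):

  ```python
  import numpy as np

  # Exponentially weighted average: v_t = beta * v_{t-1} + (1 - beta) * theta_t.
  # Higher beta weights the history more, so the result averages
  # roughly the last 1 / (1 - beta) data points.
  def ewma(thetas, beta):
      v = 0.0
      out = []
      for theta in thetas:
          v = beta * v + (1 - beta) * theta
          out.append(v)
      return np.array(out)

  # Noisy toy signal standing in for oscillating gradient-descent steps.
  data = np.sin(np.linspace(0, 3 * np.pi, 200)) + 0.3 * np.random.randn(200)
  smooth_90 = ewma(data, beta=0.9)   # averages ~10 recent points
  smooth_98 = ewma(data, beta=0.98)  # averages ~50 points: smoother, but lags more
  ```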