Merge pull request mbadry1#127 from VladKha/patch-7
Edits in "GloVe word vectors"
mbadry1 authored Jun 3, 2018
2 parents 8e9d4ef + 5c5a4d3 commit 3a584e3
Showing 1 changed file with 15 additions and 15 deletions: 5- Sequence Models/Readme.md
![](Images/43.png)

#### GloVe word vectors
- GloVe is another algorithm for learning word embeddings. It's the simplest of them.
- It is not used as much as word2vec or skip-gram models, but it has some enthusiasts because of its simplicity.
- GloVe stands for Global vectors for word representation.
- Let's use our previous example: "I want a glass of orange juice to go along with my cereal".
- We will choose a context and a target from the choices we have mentioned in the previous sections.
- Then we will calculate this for every pair: X<sub>ct</sub> = # times `t` appears in the context of `c`
- X<sub>ct</sub> = X<sub>tc</sub> if we choose a symmetrical window around the word, but they will not be equal if, for example, we take only the previous words as the context. GloVe uses a symmetrical window, so they are equal.
- The model is defined like this:
![](Images/44.png)
- f(x) - the weighting term - is used for several reasons, including:
- The `log(0)` problem, which might occur if there are no pairs for the given target and context values (f(x) = 0 whenever X<sub>ct</sub> = 0, so those pairs are skipped).
- Giving not too much weight to stop words like "is", "the", and "this", which occur many times.
- Giving not too little weight to infrequent words.
- **Theta** and **e** are symmetric, which helps in getting the final word embedding: since they play the same role, the final embedding of a word is taken as the average of its **theta** and **e** vectors. A minimal sketch of the whole procedure follows this list.
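
Not from the notes, just to make the steps concrete: a minimal numpy sketch of the procedure above, assuming a symmetric ±2-word window and the capped weighting function from the GloVe paper (x_max = 100, alpha = 0.75). All variable names are illustrative, and plain full-batch gradient descent stands in for the paper's AdaGrad optimizer.

```python
import numpy as np

corpus = "I want a glass of orange juice to go along with my cereal".lower().split()
vocab = sorted(set(corpus))
idx = {w: i for i, w in enumerate(vocab)}
V = len(vocab)

# X[c, t] = number of times t appears within `window` words of c.
# A symmetric window makes X symmetric: X[c, t] == X[t, c].
window = 2
X = np.zeros((V, V))
for i, c in enumerate(corpus):
    for j in range(max(0, i - window), min(len(corpus), i + window + 1)):
        if j != i:
            X[idx[c], idx[corpus[j]]] += 1

def f(x, x_max=100, alpha=0.75):
    """Weighting term: 0 when x == 0 (kills the log(0) terms), capped at 1
    so very frequent pairs (stop words) don't dominate, and smaller for
    infrequent pairs."""
    return np.where(x < x_max, (x / x_max) ** alpha, 1.0) * (x > 0)

# Objective: minimize  sum over (c, t) of
#   f(X_ct) * (theta_c . e_t + b_c + b'_t - log(X_ct))^2
dim, lr = 50, 0.05
rng = np.random.default_rng(0)
theta = rng.normal(scale=0.1, size=(V, dim))
e = rng.normal(scale=0.1, size=(V, dim))
b, b_prime = np.zeros(V), np.zeros(V)

log_X = np.log(np.where(X > 0, X, 1.0))  # placeholder 1 -> log = 0; f gives those entries weight 0 anyway

for _ in range(500):  # full-batch gradient descent; the factor of 2 is folded into lr
    err = f(X) * (theta @ e.T + b[:, None] + b_prime[None, :] - log_X)
    grad_theta, grad_e = err @ e, err.T @ theta  # both computed before either update
    theta -= lr * grad_theta
    e -= lr * grad_e
    b -= lr * err.sum(axis=1)
    b_prime -= lr * err.sum(axis=0)

# theta and e play symmetric roles, so the final embedding is their average.
embeddings = (theta + e) / 2.0
```

The dense loop over all of X is only for readability; an efficient version would iterate over the nonzero entries of X, as the paper does.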

- _Conclusions on word embeddings:_
- If this is your first try, you should download a pre-trained model that has already been made; in practice it works best.
- If you have enough data, you can try to implement one of the available algorithms.
- Because word embeddings are very computationally expensive to train, most ML practitioners will load a pre-trained set of embeddings.
- A final note: you can't guarantee that the axes used to represent the features will be well-aligned with humanly interpretable axes like gender, royalty, and age (a short identity below shows why).
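
Not in the original notes, but a one-line identity makes this concrete: the objective only depends on the inner products θ<sub>t</sub><sup>T</sup>e<sub>c</sub>, and for any invertible matrix A, (Aθ<sub>t</sub>)<sup>T</sup>(A<sup>-T</sup>e<sub>c</sub>) = θ<sub>t</sub><sup>T</sup>A<sup>T</sup>A<sup>-T</sup>e<sub>c</sub> = θ<sub>t</sub><sup>T</sup>e<sub>c</sub>. Any invertible linear transformation of the embedding axes fits the data equally well, so nothing forces the learned axes to line up with interpretable directions.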

### Applications using Word Embeddings
