Merge pull request mbadry1#127 from VladKha/patch-7
Edits in "GloVe word vectors"
mbadry1 authored Jun 3, 2018
2 parents 8e9d4ef + 5c5a4d3 commit 3a584e3
Showing 1 changed file with 15 additions and 15 deletions: 5- Sequence Models/Readme.md
![](Images/43.png)

#### GloVe word vectors
- GloVe is another algorithm for learning word embeddings. It's the simplest of them.
- It is not used as much as word2vec or skip-gram models, but it has some enthusiasts because of its simplicity.
- GloVe stands for Global vectors for word representation.
- Let's use our previous example: "I want a glass of orange juice to go along with my cereal".
- We will choose a context and a target from the choices we have mentioned in the previous sections.
- Then we will calculate this for every pair: X<sub>ct</sub> = # times `t` appears in the context of `c`
- X<sub>ct</sub> = X<sub>tc</sub> if we choose a symmetrical window around the word, but they will not be equal if, for example, we take only the previous words as the context. GloVe uses a symmetrical window, so they are equal.
- The model is defined like this:
![](Images/44.png)
- f(x) - the weighting term - is used for several reasons, including:
- The `log(0)` problem, which might occur if there are no pairs for the given target and context values (f(x) = 0 whenever X<sub>ct</sub> = 0, so those pairs are skipped).
- Giving not too much weight to stop words like "is", "the", and "this", which occur many times.
- Giving not too little weight to infrequent words.
- **Theta** and **e** are symmetric, which helps in getting the final word embedding: since they play the same role, the final embedding of a word is taken as the average of its **theta** and **e** vectors. A minimal sketch of the whole procedure follows this list.
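
Not from the notes, just to make the steps concrete: a minimal numpy sketch of the procedure above, assuming a symmetric ±2-word window and the capped weighting function from the GloVe paper (x_max = 100, alpha = 0.75). All variable names are illustrative, and plain full-batch gradient descent stands in for the paper's AdaGrad optimizer.

```python
import numpy as np

corpus = "I want a glass of orange juice to go along with my cereal".lower().split()
vocab = sorted(set(corpus))
idx = {w: i for i, w in enumerate(vocab)}
V = len(vocab)

# X[c, t] = number of times t appears within `window` words of c.
# A symmetric window makes X symmetric: X[c, t] == X[t, c].
window = 2
X = np.zeros((V, V))
for i, c in enumerate(corpus):
    for j in range(max(0, i - window), min(len(corpus), i + window + 1)):
        if j != i:
            X[idx[c], idx[corpus[j]]] += 1

def f(x, x_max=100, alpha=0.75):
    """Weighting term: 0 when x == 0 (kills the log(0) terms), capped at 1
    so very frequent pairs (stop words) don't dominate, and smaller for
    infrequent pairs."""
    return np.where(x < x_max, (x / x_max) ** alpha, 1.0) * (x > 0)

# Objective: minimize  sum over (c, t) of
#   f(X_ct) * (theta_c . e_t + b_c + b'_t - log(X_ct))^2
dim, lr = 50, 0.05
rng = np.random.default_rng(0)
theta = rng.normal(scale=0.1, size=(V, dim))
e = rng.normal(scale=0.1, size=(V, dim))
b, b_prime = np.zeros(V), np.zeros(V)

log_X = np.log(np.where(X > 0, X, 1.0))  # placeholder 1 -> log = 0; f gives those entries weight 0 anyway

for _ in range(500):  # full-batch gradient descent; the factor of 2 is folded into lr
    err = f(X) * (theta @ e.T + b[:, None] + b_prime[None, :] - log_X)
    grad_theta, grad_e = err @ e, err.T @ theta  # both computed before either update
    theta -= lr * grad_theta
    e -= lr * grad_e
    b -= lr * err.sum(axis=1)
    b_prime -= lr * err.sum(axis=0)

# theta and e play symmetric roles, so the final embedding is their average.
embeddings = (theta + e) / 2.0
```

The dense loop over all of X is only for readability; an efficient version would iterate over the nonzero entries of X, as the paper does.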

- _Conclusions on word embeddings:_
- If this is your first try, you should download a pre-trained model that has already been made; in practice it works best.
- If you have enough data, you can try to implement one of the available algorithms.
- Because word embeddings are very computationally expensive to train, most ML practitioners will load a pre-trained set of embeddings.
- A final note: you can't guarantee that the axes used to represent the features will be well-aligned with humanly interpretable axes like gender, royalty, and age (a short identity below shows why).
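
Not in the original notes, but a one-line identity makes this concrete: the objective only depends on the inner products θ<sub>t</sub><sup>T</sup>e<sub>c</sub>, and for any invertible matrix A, (Aθ<sub>t</sub>)<sup>T</sup>(A<sup>-T</sup>e<sub>c</sub>) = θ<sub>t</sub><sup>T</sup>A<sup>T</sup>A<sup>-T</sup>e<sub>c</sub> = θ<sub>t</sub><sup>T</sup>e<sub>c</sub>. Any invertible linear transformation of the embedding axes fits the data equally well, so nothing forces the learned axes to line up with interpretable directions.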

### Applications using Word Embeddings
