Commit 755e789 by VladKha, Jul 2, 2018: edits in "Debiasing word embeddings"
1 changed file: 5- Sequence Models/Readme.md (14 additions, 14 deletions)
- Also, it will generalize better, even to words that weren't in your labeled dataset. For example, take the sentence "Completely **<u>absent</u>** of good taste, good service, and good ambience". Even if the word "absent" is not in your labeled training set, if it appeared in the 1-billion or 100-billion word corpus used to train the word embeddings, the classifier can still get this example right. That is, embeddings let the model generalize to words that were in the corpus used to train the embeddings but not necessarily in the labeled training set for the sentiment classification problem. A minimal sketch of this idea follows.
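For intuition, here is a minimal sketch of that transfer effect, assuming toy stand-ins: we average pre-trained word embeddings over a sentence to get a feature vector for a sentiment classifier. The random `emb` dictionary below is a hypothetical placeholder for real pre-trained vectors (e.g. GloVe).

```python
import numpy as np

# Placeholder for real pre-trained embeddings (e.g. GloVe); random vectors here.
rng = np.random.default_rng(0)
vocab = ["completely", "absent", "of", "good", "taste", "service", "ambience", "and"]
emb = {w: rng.normal(size=300) for w in vocab}

def sentence_embedding(sentence, emb):
    """Average the embeddings of the words we know; unknown words are skipped."""
    vecs = [emb[w] for w in sentence.lower().replace(",", "").split() if w in emb]
    return np.mean(vecs, axis=0)

# "absent" may never appear in the small labeled sentiment set, but its
# pre-trained vector still contributes a useful feature here.
x = sentence_embedding("Completely absent of good taste, good service, and good ambience", emb)
print(x.shape)  # (300,) -- feed this vector to any classifier (e.g. softmax)
```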

#### Debiasing word embeddings
- We want to make sure that our word embeddings are free from undesirable forms of bias, such as gender bias, ethnicity bias, and so on.
- Some horrifying results on trained word embeddings, in the context of analogies:
- Man : Computer_programmer as Woman : **Homemaker**
- Father : Doctor as Mother : **Nurse**
- Word embeddings can reflect the gender, ethnicity, age, sexual orientation, and other biases of the text used to train the model.
- Learning algorithms are increasingly used to make important decisions, so they mustn't be biased.
- Andrew thinks we actually have better ideas for quickly reducing the bias in AI than for quickly reducing the bias in the human race, although there is still a lot of work to be done.
- Steps for addressing bias in word embeddings:
- The idea is from the paper "Man is to Computer Programmer as Woman is to Homemaker? Debiasing Word Embeddings" (Bolukbasi et al., 2016): https://arxiv.org/abs/1607.06520
- Given these learned embeddings:
![](Images/48.png)
- We need to solve the **gender bias** here. The steps we will discuss can help solve any bias problem, but we are focusing here on gender bias.
- Here are the steps (a numpy sketch of all three steps follows the list):
1. Identify the bias direction:
- Compute the differences between the embeddings of gendered word pairs, for example:
- e<sub>male</sub> - e<sub>female</sub>
- ....
- Choose some k such differences and average them.
- This will help you find the bias direction:
![](Images/49.png)
- By that we have found the bias direction, which is a 1D vector, and the non-bias direction, which is a 299D subspace (for 300D embeddings).
2. Neutralize: For every word that is not definitional, project to get rid of bias.
- Babysitter and doctor need to be neutral, so we project them onto the non-bias axis, removing their component along the bias direction:
![](Images/50.png)
- After that, they will be neutral with respect to gender.
- To do this, the authors of the paper trained a classifier to decide which words need to be neutralized.
3. Equalize pairs
- We want each pair to differ only in gender. For example:
- Grandfather - Grandmother
- He - She
- Boy - Girl
- We want to do this because, even after neutralizing babysitter, the distance between grandfather and babysitter can be bigger than the distance between grandmother and babysitter:
![](Images/51.png)
- To do that, we move grandfather and grandmother so that they are symmetric about the non-bias axis (equidistant from it).
- There are some word pairs you need to do this for; the number of such pairs is relatively small.
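Here is a minimal numpy sketch of the three steps above, assuming toy random 300D embeddings as stand-ins for real pre-trained vectors. The word list is illustrative, and the `equalize` formula is a simplified variant of the paper's (the paper additionally rescales the equalized vectors to unit length):

```python
import numpy as np

# Toy stand-ins for pre-trained 300D embeddings (random for illustration).
rng = np.random.default_rng(1)
emb = {w: rng.normal(size=300) for w in
       ["male", "female", "he", "she", "babysitter", "doctor",
        "grandfather", "grandmother"]}

def project(v, direction):
    """Component of v along `direction`."""
    return (v @ direction) / (direction @ direction) * direction

# Step 1: identify the bias direction by averaging k pair differences.
pairs = [("male", "female"), ("he", "she")]
g = np.mean([emb[a] - emb[b] for a, b in pairs], axis=0)  # 1D bias direction

# Step 2: neutralize non-definitional words by removing their bias component.
for w in ["babysitter", "doctor"]:
    emb[w] = emb[w] - project(emb[w], g)

# Step 3: equalize a definitional pair so it differs only along g
# (simplified: keep the shared non-bias part, mirror the bias part).
def equalize(w1, w2, g):
    e1, e2 = emb[w1], emb[w2]
    mu = (e1 + e2) / 2
    mu_orth = mu - project(mu, g)     # shared non-bias component
    b = project(e1 - mu, g)           # residual bias around the midpoint
    emb[w1] = mu_orth + b             # symmetric about the non-bias axis
    emb[w2] = mu_orth - b

equalize("grandfather", "grandmother", g)

# Both grandparents are now equidistant from the neutralized "babysitter".
dist = lambda a, b: np.linalg.norm(emb[a] - emb[b])
print(np.isclose(dist("grandfather", "babysitter"),
                 dist("grandmother", "babysitter")))  # True
```

The final check confirms the point of step 3: after equalizing, both grandparents sit at the same distance from the neutralized babysitter vector.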


## Sequence models & Attention mechanism