### Notation

- In this section we will discuss the notation that we will use throughout the course.
- **Motivating example**:
  - Named entity recognition example:
    - X: "Harry Potter and Hermione Granger invented a new spell."
    - Y: 1 1 0 1 1 0 0 0 0
    - Both sequences have length 9. A 1 means the word is part of a name, while a 0 means it is not.
  - We will index the first element of x as x<sup><1></sup>, the second as x<sup><2></sup>, and so on:
    - x<sup><1></sup> = Harry
    - x<sup><2></sup> = Potter
  - Similarly, we will index the first element of y as y<sup><1></sup>, the second as y<sup><2></sup>, and so on:
    - y<sup><1></sup> = 1
    - y<sup><2></sup> = 1
  - In general, x<sup>\<t></sup> denotes the element of the sequence at index t.

- T<sub>x</sub> is the length of the input sequence and T<sub>y</sub> is the length of the output sequence.
  - T<sub>x</sub> = T<sub>y</sub> = 9 in the last example, although they can be different in other problems.
- x<sup>(i)\<t></sup> is the t-th element of the input sequence of the i-th training example. Similarly, y<sup>(i)\<t></sup> is the t-th element of the output sequence of the i-th training example.
- T<sub>x</sub><sup>(i)</sup> is the input sequence length for training example i. It can be different across examples. Similarly, T<sub>y</sub><sup>(i)</sup> is the output sequence length for the i-th training example.
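The notation above can be sketched in plain Python. This is a toy illustration only: the variable names `x`, `y`, `T_x`, and `T_y` mirror the course notation, and the tokenization and labels are taken from the example sentence above (note the course indexes sequences from 1, while Python lists are 0-indexed).

```python
# Toy illustration of the sequence notation (named entity recognition example).
x = ["Harry", "Potter", "and", "Hermione", "Granger",
     "invented", "a", "new", "spell"]      # input sequence x
y = [1, 1, 0, 1, 1, 0, 0, 0, 0]            # output sequence y (1 = part of a name)

T_x = len(x)   # length of the input sequence
T_y = len(y)   # length of the output sequence

# x^<t> is the t-th element (1-indexed in the course notation):
t = 1
print(x[t - 1])    # -> Harry
print(y[t - 1])    # -> 1
print(T_x, T_y)    # -> 9 9
```

For a dataset of several training examples, x<sup>(i)\<t></sup> would simply be `X[i][t - 1]` with `X` a list of such sequences, which is why T<sub>x</sub><sup>(i)</sup> can differ from example to example.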

- **Representing words**:
- We will now work in this course with **NLP**, which stands for natural language processing. One of the challenges of NLP is how to represent a word.

1. We need a **vocabulary** list that contains all the words in our target sets.
   - Example:
     - [a ... And ... Harry ... Potter ... Zulu]
   - Each word will have a unique index by which it can be represented.
   - The sorting here is in alphabetical order.
   - Vocabulary sizes in modern applications range from 30,000 to 50,000; 100,000 is not uncommon, and some of the bigger companies use a million or more.
   - To build the vocabulary list, you can read all the texts you have and keep the m most frequent words, or look up a list of the m most common words online.
2. Create a **one-hot encoding** sequence for each word in your dataset given the vocabulary you have created.
   - While converting, what if we meet a word that's not in our vocabulary?
     - We can add a token `<UNK>` to the vocabulary, which stands for unknown, and use its index when filling in the one-hot vector.
   - Full example:
     ![](Images/01.png)
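The two steps above can be sketched as follows. This is a minimal illustration, not the course's code: the tiny vocabulary and the helper name `one_hot` are invented for the example.

```python
# Minimal one-hot encoding sketch with an <UNK> token for out-of-vocabulary words.
vocab = ["a", "and", "harry", "potter", "zulu", "<UNK>"]   # tiny toy vocabulary
word_to_index = {w: i for i, w in enumerate(vocab)}

def one_hot(word, word_to_index, vocab_size):
    """Return a one-hot vector for `word`, falling back to <UNK> if unseen."""
    vec = [0] * vocab_size
    idx = word_to_index.get(word.lower(), word_to_index["<UNK>"])
    vec[idx] = 1
    return vec

print(one_hot("Harry", word_to_index, len(vocab)))   # -> [0, 0, 1, 0, 0, 0]
print(one_hot("spell", word_to_index, len(vocab)))   # unseen word maps to <UNK>
```

A sentence is then represented as the sequence of one-hot vectors of its words, each of dimension equal to the vocabulary size.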

- The goal is, given this representation for x, to learn a mapping from x to the target output y using a sequence model, framed as a supervised learning problem.

### Recurrent Neural Network Model
- Why not a standard network for sequence problems? There are two problems: