### Notation

- In this section we will discuss the notation that we will use throughout the course.
- **Motivating example**:
  - Named entity recognition example:
    - X: "Harry Potter and Hermione Granger invented a new spell."
    - Y: 1 1 0 1 1 0 0 0 0
    - Both sequences have length 9. A 1 means the word is part of a name, while a 0 means it is not.
  - We will index the first element of x as x<sup><1></sup>, the second as x<sup><2></sup>, and so on:
    - x<sup><1></sup> = Harry
    - x<sup><2></sup> = Potter
  - Similarly, we will index the first element of y as y<sup><1></sup>, the second as y<sup><2></sup>, and so on:
    - y<sup><1></sup> = 1
    - y<sup><2></sup> = 1
  - In general, x<sup>\<t></sup> denotes the element of the sequence at index t.

- T<sub>x</sub> is the length of the input sequence and T<sub>y</sub> is the length of the output sequence.
  - T<sub>x</sub> = T<sub>y</sub> = 9 in the last example, although they can be different in other problems.
- x<sup>(i)\<t></sup> is the t-th element of the input sequence of the i-th training example. Similarly, y<sup>(i)\<t></sup> is the t-th element of the output sequence of the i-th training example.
- T<sub>x</sub><sup>(i)</sup> is the input sequence length for training example i. It can be different across examples. Similarly, T<sub>y</sub><sup>(i)</sup> is the output sequence length for the i-th training example.
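The notation above can be sketched in plain Python. This is a toy illustration only: the variable names `x`, `y`, `T_x`, and `T_y` mirror the course notation, and the tokenization and labels are taken from the example sentence above (note the course indexes sequences from 1, while Python lists are 0-indexed).

```python
# Toy illustration of the sequence notation (named entity recognition example).
x = ["Harry", "Potter", "and", "Hermione", "Granger",
     "invented", "a", "new", "spell"]      # input sequence x
y = [1, 1, 0, 1, 1, 0, 0, 0, 0]            # output sequence y (1 = part of a name)

T_x = len(x)   # length of the input sequence
T_y = len(y)   # length of the output sequence

# x^<t> is the t-th element (1-indexed in the course notation):
t = 1
print(x[t - 1])    # -> Harry
print(y[t - 1])    # -> 1
print(T_x, T_y)    # -> 9 9
```

For a dataset of several training examples, x<sup>(i)\<t></sup> would simply be `X[i][t - 1]` with `X` a list of such sequences, which is why T<sub>x</sub><sup>(i)</sup> can differ from example to example.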

- **Representing words**:
- We will now work in this course with **NLP**, which stands for natural language processing. One of the challenges of NLP is how to represent a word.

1. We need a **vocabulary** list that contains all the words in our target sets.
   - Example:
     - [a ... And ... Harry ... Potter ... Zulu]
   - Each word will have a unique index by which it can be represented.
   - The sorting here is in alphabetical order.
   - Vocabulary sizes in modern applications range from 30,000 to 50,000; 100,000 is not uncommon, and some of the bigger companies use a million or more.
   - To build the vocabulary list, you can read all the texts you have and keep the m most frequent words, or look up a list of the m most common words online.
2. Create a **one-hot encoding** sequence for each word in your dataset given the vocabulary you have created.
   - While converting, what if we meet a word that's not in our vocabulary?
     - We can add a token `<UNK>` to the vocabulary, which stands for unknown, and use its index when filling in the one-hot vector.
   - Full example:
     ![](Images/01.png)
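The two steps above can be sketched as follows. This is a minimal illustration, not the course's code: the tiny vocabulary and the helper name `one_hot` are invented for the example.

```python
# Minimal one-hot encoding sketch with an <UNK> token for out-of-vocabulary words.
vocab = ["a", "and", "harry", "potter", "zulu", "<UNK>"]   # tiny toy vocabulary
word_to_index = {w: i for i, w in enumerate(vocab)}

def one_hot(word, word_to_index, vocab_size):
    """Return a one-hot vector for `word`, falling back to <UNK> if unseen."""
    vec = [0] * vocab_size
    idx = word_to_index.get(word.lower(), word_to_index["<UNK>"])
    vec[idx] = 1
    return vec

print(one_hot("Harry", word_to_index, len(vocab)))   # -> [0, 0, 1, 0, 0, 0]
print(one_hot("spell", word_to_index, len(vocab)))   # unseen word maps to <UNK>
```

A sentence is then represented as the sequence of one-hot vectors of its words, each of dimension equal to the vocabulary size.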

- The goal is, given this representation for x, to learn a mapping from x to the target output y using a sequence model, framed as a supervised learning problem.

### Recurrent Neural Network Model
- Why not a standard network for sequence problems? There are two problems: