Commit
Fix a typo in GRU page (d2l-ai#2269)
* Fix a typo in GRU page

* Fixed a typo in lstm.md
JojiJoseph authored Aug 26, 2022
1 parent 3b40fce commit e8ff88d
Showing 2 changed files with 2 additions and 2 deletions.
2 changes: 1 addition & 1 deletion chapter_recurrent-modern/gru.md
@@ -301,7 +301,7 @@ They can also skip subsequences by turning on the update gate.
## Exercises

1. Assume that we only want to use the input at time step $t'$ to predict the output at time step $t > t'$. What are the best values for the reset and update gates for each time step?
-1. Adjust the hyperparameters and analyze the their influence on running time, perplexity, and the output sequence.
+1. Adjust the hyperparameters and analyze their influence on running time, perplexity, and the output sequence.
1. Compare runtime, perplexity, and the output strings for `rnn.RNN` and `rnn.GRU` implementations with each other.
1. What happens if you implement only parts of a GRU, e.g., with only a reset gate or only an update gate?

2 changes: 1 addition & 1 deletion chapter_recurrent-modern/lstm.md
@@ -424,7 +424,7 @@ LSTMs can alleviate vanishing and exploding gradients.

## Exercises

-1. Adjust the hyperparameters and analyze the their influence on running time, perplexity, and the output sequence.
+1. Adjust the hyperparameters and analyze their influence on running time, perplexity, and the output sequence.
1. How would you need to change the model to generate proper words as opposed to sequences of characters?
1. Compare the computational cost for GRUs, LSTMs, and regular RNNs for a given hidden dimension. Pay special attention to the training and inference cost.
1. Since the candidate memory cell ensures that the value range is between $-1$ and $1$ by using the $\tanh$ function, why does the hidden state need to use the $\tanh$ function again to ensure that the output value range is between $-1$ and $1$?