tanh(C_t) is not in the final paper #2

Closed
twoletters opened this issue May 20, 2024 · 1 comment

twoletters commented May 20, 2024

h_t = self.sigmoid(o_tilda) * torch.tanh(c_hat)

On this line, c_hat is passed through tanh. In the final paper, however, h_tilda (called c_hat here) is just c_t / n_t with no tanh applied, so h_t should be:

h_t = self.sigmoid(o_tilda) * c_hat

Excerpt from Section A.2 of the final paper:

Here, the cell input activation function φ is tanh, the hidden state activation function is the identity. φ helps stabilizing the recurrence.
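To make the difference concrete, here is a minimal scalar sketch of the two variants. It is not the repository's code: the function hidden_state, its apply_tanh flag, and the input values are made up for illustration, and plain floats stand in for tensors. The only parts taken from this issue are that c_hat = c_t / n_t and that the output gate is sigmoid(o_tilda).

```python
import math


def sigmoid(x: float) -> float:
    """Logistic function, standing in for self.sigmoid on a scalar."""
    return 1.0 / (1.0 + math.exp(-x))


def hidden_state(c_t: float, n_t: float, o_tilda: float, apply_tanh: bool = False) -> float:
    """Hypothetical helper: scalar version of the output step discussed here."""
    c_hat = c_t / n_t  # h_tilda in the paper's notation
    if apply_tanh:
        # Variant on the quoted line: extra tanh on the hidden state.
        return sigmoid(o_tilda) * math.tanh(c_hat)
    # Variant matching the excerpt: hidden state activation is the identity.
    return sigmoid(o_tilda) * c_hat


# Illustrative values only (assumptions, not taken from the repository).
c_t, n_t, o_tilda = 3.0, 1.5, 0.0

with_tanh = hidden_state(c_t, n_t, o_tilda, apply_tanh=True)
identity = hidden_state(c_t, n_t, o_tilda, apply_tanh=False)
print("with tanh:", with_tanh)
print("identity: ", identity)
```

With the extra tanh, the magnitude of h_t can never exceed the output gate value, while the identity version scales with c_t / n_t. The two variants can therefore diverge noticeably whenever |c_hat| is not small.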


github-actions bot commented Sep 7, 2024

Stale issue message

github-actions bot closed this as not planned on Sep 14, 2024