Attention type #165
Comments
Actually, after double-checking it, it looks like it's the dot attention of Luong. Is there a reason to use the dot attention and not the general one?
@ratis86 thanks for pointing this out. There's no particular reason that I'm aware of. You can contact the respective contributor for that. However we're gonna be implementing the general as well as copy attention mechanisms in the coming versions. |
@pskrunner14 Also on this: whom should I contact?
@CoderINusE you're welcome to submit a PR.
@pskrunner14 should I pass an additional argument to the
@CoderINusE please see
I am not sure whether the comment in the current Attention module is a bit off: `output = tanh(w * (attn * context) + b * output)` does not match the code or the 5th equation in the paper (https://arxiv.org/pdf/1508.04025.pdf), unless b is also interpreted as a matrix. Thanks
I think there is a difference between the math written in the comments and the code.
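For reference, equation (5) of Luong et al. (2015) defines the attentional hidden state as:

```latex
\tilde{h}_t = \tanh\left( W_c \, [c_t ; h_t] \right)
```

Since multiplying a concatenation by a single matrix is the same as summing two separate matrix products, W_c [c_t ; h_t] = W_1 c_t + W_2 h_t, the docstring form only matches equation (5) if `b` is read as a second weight matrix applied to `output`, as pointed out above.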
Can somebody tell me what type of attention is used in this lib? I checked it against the Bahdanau and Luong attentions and it doesn't look like either, or maybe I'm missing something!
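Not the library's verbatim code, but a minimal sketch of Luong-style dot attention with the tanh(W_c [c_t ; h_t]) output layer that the earlier comments describe; the class and argument names here are assumptions for illustration:

```python
import torch
import torch.nn as nn

class LuongDotAttention(nn.Module):
    """Dot-product (Luong) attention followed by output = tanh(W_c [c_t ; h_t])."""
    def __init__(self, dim):
        super().__init__()
        self.linear_out = nn.Linear(2 * dim, dim)

    def forward(self, output, context):
        # output: decoder states (batch, tgt_len, dim)
        # context: encoder states (batch, src_len, dim)
        attn = torch.bmm(output, context.transpose(1, 2))   # (batch, tgt_len, src_len)
        attn = torch.softmax(attn, dim=-1)                   # attention weights over source
        mix = torch.bmm(attn, context)                       # context vectors c_t
        combined = torch.cat((mix, output), dim=2)           # [c_t ; h_t]
        return torch.tanh(self.linear_out(combined)), attn
```

Bahdanau attention would instead compute the score with a small feed-forward network over the two states, which is one quick way to tell the variants apart.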