
Transformer_Knowlegde

Understanding the Transformer from its underlying mechanisms.

Zhihu column: https://www.zhihu.com/column/c_1380571607480651776

1. Linear self-attention

1.1 Sparse attention

Walkthrough (in Chinese): https://zhuanlan.zhihu.com/p/469853664


  • 《Generating Long Sequences with Sparse Transformers》 (April 2019, OpenAI)

https://arxiv.org/abs/1904.10509



  • 《Sparse Transformer: Concentrated Attention Through Explicit Selection》 (December 2019)

https://arxiv.org/pdf/1912.11637.pdf



  • 《Longformer: The Long-Document Transformer》 (April 2020)

https://arxiv.org/abs/2004.05150



  • 《Reformer: The Efficient Transformer》 (January 2020)

https://arxiv.org/abs/2001.04451



  • 《Linformer: Self-Attention with Linear Complexity》 (June 2020)

https://arxiv.org/abs/2006.04768



  • 《Rethinking Attention with Performers》 (September 2020)

https://arxiv.org/abs/2009.14794



  • 《Luna: Linear Unified Nested Attention》 (June 2021)

https://arxiv.org/abs/2106.01540

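As a minimal, runnable companion to the papers above, the sketch below builds the banded (sliding-window) mask that local sparse attention restricts the softmax to. It is only an illustration: the dense mask materializes the full N×N score matrix, which the papers' actual sparse kernels are designed to avoid, and the function name and window size are assumptions, not taken from any specific paper.

```python
import torch
import torch.nn.functional as F

def sliding_window_attention(q, k, v, window: int = 4):
    """Toy dense implementation of local (sliding-window) attention.

    q, k, v: (seq_len, d) tensors. Each position may only attend to
    positions within `window` steps of itself; all other positions are
    masked out before the softmax. Real sparse-attention kernels skip
    building the full (seq_len, seq_len) score matrix; this sketch does not.
    """
    seq_len, d = q.shape
    scores = q @ k.T / d ** 0.5                       # (seq_len, seq_len)
    idx = torch.arange(seq_len)
    # Allowed pairs: |i - j| <= window, i.e. a band around the diagonal.
    mask = (idx[None, :] - idx[:, None]).abs() <= window
    scores = scores.masked_fill(~mask, float("-inf"))
    return F.softmax(scores, dim=-1) @ v              # (seq_len, d)

# Example: 16 tokens, 8-dim heads, each token sees +/- 4 neighbours.
q, k, v = torch.randn(16, 8), torch.randn(16, 8), torch.randn(16, 8)
print(sliding_window_attention(q, k, v, window=4).shape)  # torch.Size([16, 8])
```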

1.2 Softmax linearization (linear attention)

Walkthrough (in Chinese): https://zhuanlan.zhihu.com/p/471291695


  • 《Efficient Attention: Attention with Linear Complexities》 (December 2018)

https://arxiv.org/abs/1812.01243



  • 《Transformers are RNNs: Fast Autoregressive Transformers with Linear Attention》 (June 2020)

https://arxiv.org/abs/2006.16236



  • 《cosFormer: Rethinking Softmax in Attention》 (February 2022)

https://arxiv.org/pdf/2202.08791.pdf

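As a concrete example of the linearization idea, here is a minimal non-causal sketch in the spirit of 《Transformers are RNNs》: the softmax kernel is replaced by the feature map elu(x) + 1 (as in that paper), so the output can be computed as phi(Q) (phi(K)^T V), which is linear in sequence length. The function name and tensor shapes are illustrative assumptions.

```python
import torch
import torch.nn.functional as F

def linear_attention(q, k, v, eps: float = 1e-6):
    """Non-causal linear attention with the feature map phi(x) = elu(x) + 1.

    Computing phi(K)^T V first costs O(N * d^2) instead of the O(N^2 * d)
    of standard softmax attention over the (N, N) score matrix.
    """
    phi_q = F.elu(q) + 1                               # (N, d), strictly positive
    phi_k = F.elu(k) + 1                               # (N, d)
    kv = phi_k.T @ v                                   # (d, d_v), summed over positions
    z = phi_q @ phi_k.sum(dim=0, keepdim=True).T       # (N, 1) softmax-like normalizer
    return (phi_q @ kv) / (z + eps)                    # (N, d_v)

q, k, v = torch.randn(1024, 64), torch.randn(1024, 64), torch.randn(1024, 64)
print(linear_attention(q, k, v).shape)  # torch.Size([1024, 64])
```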

2. PostNorm/PreNorm

Content is being updated continuously...
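Until this section is written up, here is a minimal sketch of the two residual orderings being compared: Post-Norm applies LayerNorm after the residual addition, LN(x + Sublayer(x)), as in the original Transformer, while Pre-Norm applies it before the sublayer, x + Sublayer(LN(x)). The class names below are illustrative assumptions.

```python
import torch
import torch.nn as nn

class PostNormBlock(nn.Module):
    """Post-Norm: LayerNorm applied after the residual addition."""
    def __init__(self, dim, sublayer):
        super().__init__()
        self.sublayer = sublayer
        self.norm = nn.LayerNorm(dim)

    def forward(self, x):
        return self.norm(x + self.sublayer(x))

class PreNormBlock(nn.Module):
    """Pre-Norm: LayerNorm applied before the sublayer, inside the residual branch."""
    def __init__(self, dim, sublayer):
        super().__init__()
        self.sublayer = sublayer
        self.norm = nn.LayerNorm(dim)

    def forward(self, x):
        return x + self.sublayer(self.norm(x))

# Example with a feed-forward sublayer on (batch, seq, dim) inputs.
ffn = nn.Sequential(nn.Linear(64, 256), nn.GELU(), nn.Linear(256, 64))
x = torch.randn(8, 16, 64)
print(PostNormBlock(64, ffn)(x).shape, PreNormBlock(64, ffn)(x).shape)
```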
