
Transformer_Knowlegde

Understanding the Transformer from its underlying mechanisms.

Zhihu column: https://www.zhihu.com/column/c_1380571607480651776

1. Linear self-attention

1.1 Sparse attention

Walkthrough (in Chinese): https://zhuanlan.zhihu.com/p/469853664


  • 《Generating Long Sequences with Sparse Transformers》 (April 2019, OpenAI)

https://arxiv.org/abs/1904.10509



  • 《Sparse Transformer: Concentrated Attention Through Explicit Selection》 (December 2019)

https://arxiv.org/pdf/1912.11637.pdf



  • 《Longformer: The Long-Document Transformer》 (April 2020)

https://arxiv.org/abs/2004.05150



  • 《Reformer: The Efficient Transformer》 (January 2020)

https://arxiv.org/abs/2001.04451



  • 《Linformer: Self-Attention with Linear Complexity》 (June 2020)

https://arxiv.org/abs/2006.04768



  • 《Rethinking Attention with Performers》 (September 2020)

https://arxiv.org/abs/2009.14794



  • 《Luna: Linear Unified Nested Attention》 (June 2021)

https://arxiv.org/abs/2106.01540

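As a minimal, runnable companion to the papers above, the sketch below builds the banded (sliding-window) mask that local sparse attention restricts the softmax to. It is only an illustration: the dense mask materializes the full N×N score matrix, which the papers' actual sparse kernels are designed to avoid, and the function name and window size are assumptions, not taken from any specific paper.

```python
import torch
import torch.nn.functional as F

def sliding_window_attention(q, k, v, window: int = 4):
    """Toy dense implementation of local (sliding-window) attention.

    q, k, v: (seq_len, d) tensors. Each position may only attend to
    positions within `window` steps of itself; all other positions are
    masked out before the softmax. Real sparse-attention kernels skip
    building the full (seq_len, seq_len) score matrix; this sketch does not.
    """
    seq_len, d = q.shape
    scores = q @ k.T / d ** 0.5                       # (seq_len, seq_len)
    idx = torch.arange(seq_len)
    # Allowed pairs: |i - j| <= window, i.e. a band around the diagonal.
    mask = (idx[None, :] - idx[:, None]).abs() <= window
    scores = scores.masked_fill(~mask, float("-inf"))
    return F.softmax(scores, dim=-1) @ v              # (seq_len, d)

# Example: 16 tokens, 8-dim heads, each token sees +/- 4 neighbours.
q, k, v = torch.randn(16, 8), torch.randn(16, 8), torch.randn(16, 8)
print(sliding_window_attention(q, k, v, window=4).shape)  # torch.Size([16, 8])
```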

1.2 Softmax linearization (linear attention)

Walkthrough (in Chinese): https://zhuanlan.zhihu.com/p/471291695


  • 《Efficient Attention: Attention with Linear Complexities》 (December 2018)

https://arxiv.org/abs/1812.01243



  • 《Transformers are RNNs: Fast Autoregressive Transformers with Linear Attention》 (June 2020)

https://arxiv.org/abs/2006.16236



  • 《cosFormer: Rethinking Softmax in Attention》 (February 2022)

https://arxiv.org/pdf/2202.08791.pdf

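As a concrete example of the linearization idea, here is a minimal non-causal sketch in the spirit of 《Transformers are RNNs》: the softmax kernel is replaced by the feature map elu(x) + 1 (as in that paper), so the output can be computed as phi(Q) (phi(K)^T V), which is linear in sequence length. The function name and tensor shapes are illustrative assumptions.

```python
import torch
import torch.nn.functional as F

def linear_attention(q, k, v, eps: float = 1e-6):
    """Non-causal linear attention with the feature map phi(x) = elu(x) + 1.

    Computing phi(K)^T V first costs O(N * d^2) instead of the O(N^2 * d)
    of standard softmax attention over the (N, N) score matrix.
    """
    phi_q = F.elu(q) + 1                               # (N, d), strictly positive
    phi_k = F.elu(k) + 1                               # (N, d)
    kv = phi_k.T @ v                                   # (d, d_v), summed over positions
    z = phi_q @ phi_k.sum(dim=0, keepdim=True).T       # (N, 1) softmax-like normalizer
    return (phi_q @ kv) / (z + eps)                    # (N, d_v)

q, k, v = torch.randn(1024, 64), torch.randn(1024, 64), torch.randn(1024, 64)
print(linear_attention(q, k, v).shape)  # torch.Size([1024, 64])
```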

2. PostNorm/PreNorm

Content is being updated continuously...
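Until this section is written up, here is a minimal sketch of the two residual orderings being compared: Post-Norm applies LayerNorm after the residual addition, LN(x + Sublayer(x)), as in the original Transformer, while Pre-Norm applies it before the sublayer, x + Sublayer(LN(x)). The class names below are illustrative assumptions.

```python
import torch
import torch.nn as nn

class PostNormBlock(nn.Module):
    """Post-Norm: LayerNorm applied after the residual addition."""
    def __init__(self, dim, sublayer):
        super().__init__()
        self.sublayer = sublayer
        self.norm = nn.LayerNorm(dim)

    def forward(self, x):
        return self.norm(x + self.sublayer(x))

class PreNormBlock(nn.Module):
    """Pre-Norm: LayerNorm applied before the sublayer, inside the residual branch."""
    def __init__(self, dim, sublayer):
        super().__init__()
        self.sublayer = sublayer
        self.norm = nn.LayerNorm(dim)

    def forward(self, x):
        return x + self.sublayer(self.norm(x))

# Example with a feed-forward sublayer on (batch, seq, dim) inputs.
ffn = nn.Sequential(nn.Linear(64, 256), nn.GELU(), nn.Linear(256, 64))
x = torch.randn(8, 16, 64)
print(PostNormBlock(64, ffn)(x).shape, PreNormBlock(64, ffn)(x).shape)
```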
