Stars
[NeurIPS 2021] [T-PAMI] DynamicViT: Efficient Vision Transformers with Dynamic Token Sparsification
Implementation of the Universal Transformer in PyTorch
Understanding the Difficulty of Training Transformers
Transformer-based image captioning extension for pytorch/fairseq
Based on the PyTorch-Transformers library by Hugging Face. To be used as a starting point for employing Transformer models in text classification tasks. Contains code to easily train BERT, XLNet, Ro…
Inverse Compositional Spatial Transformer Networks 🎭 (CVPR 2017 oral)
ST-GAN: Spatial Transformer Generative Adversarial Networks for Image Compositing 👓 (CVPR 2018)
Universal Graph Transformer Self-Attention Networks (TheWebConf WWW 2022) (PyTorch and TensorFlow)
Transformer based on a variant of attention with linear complexity with respect to sequence length
Meshed-Memory Transformer for Image Captioning. CVPR 2020
A collection of resources to study Transformers in depth.
Code for "Text Generation from Knowledge Graphs with Graph Transformers"
[ACL'19] [PyTorch] Multimodal Transformer
PoolFormer: MetaFormer Is Actually What You Need for Vision (CVPR 2022 Oral)
A Visual Analysis Tool to Explore Learned Representations in Transformer Models
Official PyTorch implementation of "OmniNet: A unified architecture for multi-modal multi-task learning" | Authors: Subhojeet Pramanik, Priyanka Agrawal, Aman Hussain
Code for the ICML 2021 (long talk) paper: "ViLT: Vision-and-Language Transformer Without Convolution or Region Supervision"
Graph Transformer Networks (Authors' PyTorch implementation for the NeurIPS 2019 paper)
Recent Transformer-based works in computer vision and related areas.
A Unified Library for Parameter-Efficient and Modular Transfer Learning
Graphormer is a general-purpose deep learning backbone for molecular modeling.
PyTorch code for EMNLP 2019 paper "LXMERT: Learning Cross-Modality Encoder Representations from Transformers".
[CVPR 2021] Official PyTorch implementation for Transformer Interpretability Beyond Attention Visualization, a novel method to visualize classifications made by Transformer-based networks.
PyTorch library for fast Transformer implementations
Examples of using sparse attention, as in "Generating Long Sequences with Sparse Transformers"
A concise but complete full-attention transformer with a set of promising experimental features from various papers
Deformable DETR: Deformable Transformers for End-to-End Object Detection.
🐥 A PyTorch implementation of OpenAI's fine-tuned Transformer language model, with a script to import the weights pre-trained by OpenAI