Adding CCT
Adding Compact Convolutional Transformers (CCT) from "Escaping the Big Data Paradigm with Compact Transformers" by Hassani et al. (https://arxiv.org/abs/2104.05704).

stevenwalton committed Jul 1, 2021
1 parent 64a2ef6 commit 8845106
Showing 2 changed files with 409 additions and 0 deletions.
74 changes: 74 additions & 0 deletions README.md
@@ -117,6 +117,69 @@ You can also use the handy `.to_vit` method on the `DistillableViT` instance to
v = v.to_vit()
type(v) # <class 'vit_pytorch.vit_pytorch.ViT'>
```
## CCT
<img src="https://raw.githubusercontent.com/SHI-Labs/Compact-Transformers/main/images/model_sym.png" width="400px"></img>
<a href="https://arxiv.org/abs/2104.05704">CCT</a> proposes compact transformers
that use convolutions instead of patching and perform sequence pooling over the
output tokens. This allows CCT to achieve high accuracy with a low number of
parameters.

You can use it in either of two ways:
```python
import torch
from vit_pytorch.cct import CCT

model = CCT(
    img_size=224,
    embedding_dim=768,
    # convolutional tokenizer
    n_input_channels=3,
    n_conv_layers=1,
    kernel_size=7,
    stride=2,
    padding=3,
    pooling_kernel_size=3,
    pooling_stride=2,
    pooling_padding=1,
    # transformer encoder
    num_layers=12,
    num_heads=12,
    mlp_ratio=4.,
    num_classes=1000,
    dropout_rate=0.1,
    attention_dropout=0.1,
    stochastic_depth_rate=0.1,
    positional_embedding='sine', # ['sine', 'learnable', 'none']
    sequence_length=None,
)
```
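
A quick way to sanity-check the model (a minimal sketch; it assumes the forward pass returns class logits of shape `(batch, num_classes)`):

```python
img = torch.randn(1, 3, 224, 224) # (batch, channels, height, width), matching img_size=224

pred = model(img) # class logits; expected shape (1, 1000)
```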

Alternatively, you can use one of several pre-defined models `[2, 4, 6, 7, 8, 14, 16]`,
which fix the number of layers, the number of attention heads, the MLP ratio,
and the embedding dimension for you.

```python
import torch
from vit_pytorch.cct import cct_2

model = cct_2(
    img_size=224,
    n_input_channels=3,
    n_conv_layers=1,
    kernel_size=7,
    stride=2,
    padding=3,
    pooling_kernel_size=3,
    pooling_stride=2,
    pooling_padding=1,
    num_classes=1000,
    dropout_rate=0.1,
    attention_dropout=0.1,
    stochastic_depth_rate=0.1,
    positional_embedding='sine', # ['sine', 'learnable', 'none']
    sequence_length=None,
)
```
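Since each pre-defined variant only fixes those four hyper-parameters, trying a different capacity is a one-line change (a sketch; `cct_14` is one of the listed variants, and the smaller `num_classes` is assumed here purely for illustration):

```python
import torch
from vit_pytorch.cct import cct_14

model = cct_14(
    img_size=224,
    num_classes=10, # e.g. a CIFAR-10-sized label space
    positional_embedding='learnable',
)

pred = model(torch.randn(1, 3, 224, 224)) # expected shape (1, 10)
```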
<a href="https://github.com/SHI-Labs/Compact-Transformers">Official
Repository</a>


## Deep ViT

@@ -680,6 +743,17 @@ Coming from computer vision and new to transformers? Here are some resources that


## Citations
```bibtex
@article{hassani2021escaping,
title = {Escaping the Big Data Paradigm with Compact Transformers},
author = {Ali Hassani and Steven Walton and Nikhil Shah and Abulikemu Abuduweili and Jiachen Li and Humphrey Shi},
year = 2021,
url = {https://arxiv.org/abs/2104.05704},
eprint = {2104.05704},
archiveprefix = {arXiv},
primaryclass = {cs.CV}
}
```

```bibtex
@misc{dosovitskiy2020image,
    title = {An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale},
    author = {Alexey Dosovitskiy and Lucas Beyer and Alexander Kolesnikov and Dirk Weissenborn and Xiaohua Zhai and Thomas Unterthiner and Mostafa Dehghani and Matthias Minderer and Georg Heigold and Sylvain Gelly and Jakob Uszkoreit and Neil Houlsby},
    year = 2020,
    eprint = {2010.11929},
    archiveprefix = {arXiv},
    primaryclass = {cs.CV}
}
```