This notebook, part of the assignments for EPFL's Visual Intelligence course, implements a Vision Transformer for classification and a GPT model for image generation.

codamin/Vision-Transformer

This notebook is for assignment 1 of the CS-503 Visual Intelligence course at EPFL by Prof. Amir Zamir.

The goals of this assignment are to:

  • Implement a Vision Transformer for MNIST classification
  • Implement a GPT decoder model for image generation
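As a rough illustration of the first goal, a ViT typically splits each image into fixed-size, non-overlapping patches and flattens each patch into one token. The sketch below does this for a 28×28 MNIST-sized image in plain Python. The function name `patchify`, the patch size of 7, and the dummy image are illustrative assumptions, not taken from the notebook, which may tokenize differently.

```python
def patchify(image, patch_size):
    """Split a 2-D image (a list of rows) into flattened, non-overlapping patches."""
    height, width = len(image), len(image[0])
    if height % patch_size or width % patch_size:
        raise ValueError("image size must be divisible by patch_size")
    tokens = []
    for top in range(0, height, patch_size):
        for left in range(0, width, patch_size):
            # Flatten one patch in row-major order into a single token vector.
            patch = [
                image[row][col]
                for row in range(top, top + patch_size)
                for col in range(left, left + patch_size)
            ]
            tokens.append(patch)
    return tokens


# Dummy 28x28 "image" (assumption): each pixel value encodes its own position.
image = [[row * 28 + col for col in range(28)] for row in range(28)]
tokens = patchify(image, patch_size=7)
print(len(tokens), len(tokens[0]))  # 16 tokens, each of length 49
```

With a patch size of 7, a 28×28 image yields a sequence of 16 tokens of dimension 49. In a real model each token would then pass through a learned linear projection before entering the Transformer.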

Topics covered in this assignment:

  • Self-attention
  • Basic tokenization
  • Basic positional encodings
  • Encoder-only (e.g. ViT) and decoder-only (e.g. GPT) Transformer models
  • Vision Transformer (ViT)
  • Supervised training
  • Autoregressive modelling
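To connect several of these topics, here is a minimal sketch of single-head scaled dot-product self-attention in plain Python. It is not the notebook's implementation: it omits the learned query, key, and value projections and uses the tokens directly, and the names `self_attention` and `causal` are hypothetical. The `causal` flag shows the main difference between an encoder-only model, where every token attends to all tokens, and a decoder-only autoregressive model, where each token attends only to itself and earlier tokens.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(tokens, causal=False):
    """Single-head attention with queries = keys = values = tokens (no projections)."""
    dim = len(tokens[0])
    outputs, weights = [], []
    for i, query in enumerate(tokens):
        scores = []
        for j, key in enumerate(tokens):
            if causal and j > i:
                # Mask out future positions for autoregressive modelling.
                scores.append(float("-inf"))
            else:
                dot = sum(q * k for q, k in zip(query, key))
                scores.append(dot / math.sqrt(dim))
        row = softmax(scores)
        weights.append(row)
        # Each output is the attention-weighted average of the value vectors.
        outputs.append(
            [sum(w * value[d] for w, value in zip(row, tokens)) for d in range(dim)]
        )
    return outputs, weights


# Three toy 2-D tokens (assumption, for illustration only).
toy_tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
_, full_weights = self_attention(toy_tokens)
_, causal_weights = self_attention(toy_tokens, causal=True)
print(causal_weights[0])  # the first token can only attend to itself
```

In the causal case the attention weight matrix is lower triangular, so the prediction for position i never depends on later positions.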
