A treasure chest for visual classification and recognition powered by PaddlePaddle
Updated Dec 4, 2024 - Python
PASSL includes image self-supervised learning algorithms such as SimCLR, MoCo v1/v2, BYOL, CLIP, PixPro, SimSiam, SwAV, BEiT, and MAE, as well as foundational vision algorithms such as Vision Transformer, DeiT, Swin Transformer, CvT, T2T-ViT, MLP-Mixer, XCiT, ConvNeXt, and PVTv2.
HugsVision is an easy-to-use Hugging Face wrapper for state-of-the-art computer vision
Paddle Large Scale Classification Tools, supporting ArcFace, CosFace, PartialFC, and Data Parallel + Model Parallel. Models include ResNet, ViT, Swin, DeiT, CaiT, FaceViT, MoCo, MAE, ConvMAE, and CAE.
An image model zoo for PaddlePaddle.
(Unofficial) PyTorch implementation of "Training Vision Transformers for Image Retrieval" (El-Nouby, Alaaeldin, et al., 2021).
[CVPR 2024] Code for our Paper "DeiT-LT: Distillation Strikes Back for Vision Transformer training on Long-Tailed Datasets"
[CVPR'24] Once for Both: Single Stage of Importance and Sparsity Search for Vision Transformer Compression
This is a repository for DeiT-pytorch-model, which can be used to train on your own image dataset
This project analyzes several vision transformers, examines their distinctive properties, and evaluates their performance on a common dataset. The study aims to gain insight into the strengths and shortcomings of transformer designs for computer vision tasks.
Vision Transformer for TensorFlow 2
Final assignment in the NLP course at the Technion (IEM097215). In this assignment we propose a novel architecture to handle both Text-to-Image translation and Image-to-Text translation tasks on paired data, using a unified architecture of transformers and CNNs and enforcing cycle consistency.
This repository covers the downstream task of face mask classification, performed on a self-curated custom dataset with various state-of-the-art deep learning models such as ViT, BEiT, DeiT, LeViT, ConvNeXt, VGG16, EfficientNetV2, RegNet, and MobileNetV3.
Image classification with DeiT model, including data preprocessing, k-fold CV, early stopping and model saving.
Implementation of a paper related to Vision Transformers
This is a repository for SBCFormer-pytorch-model, which can be used to train on your own dataset
Image captioning with pretrained encoder on MSCOCO.