Stars
Stable Diffusion web UI
PyTorch Tutorial for Deep Learning Researchers
Implementation of Vision Transformer, a simple way to achieve SOTA in vision classification with only a single transformer encoder, in PyTorch
Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities
This is an official implementation for "Swin Transformer: Hierarchical Vision Transformer using Shifted Windows".
Semantic segmentation models with 500+ pretrained convolutional and transformer-based backbones.
Supercharge Your LLM Application Evaluations 🚀
Official implementation of Character Region Awareness for Text Detection (CRAFT)
🔥🔥🔥🔥 (Formerly YOLOv7, not the official one) YOLO with Transformers and Instance Segmentation, with TensorRT acceleration! 🔥🔥🔥
I decided to sync up this repo and self-critical.pytorch. (The old master is kept in the old master branch for archive.)
A Python framework for sequence labeling evaluation (named-entity recognition, POS tagging, etc.)
X-modaler is a versatile and high-performance codebase for cross-modal analytics (e.g., image captioning, video captioning, vision-language pre-training, visual question answering, visual commonsens…
Vision Transformer (ViT) in PyTorch
From Coarse to Fine: Robust Hierarchical Localization at Large Scale with HF-Net (https://arxiv.org/abs/1812.03506)
The code of "Mask TextSpotter v3: Segmentation Proposal Network for Robust Scene Text Spotting"
An Extendible (General) Continual Learning Framework based on PyTorch - official codebase of Dark Experience for General Continual Learning
Forward-Looking Active REtrieval-augmented generation (FLARE)
[CVPR 2022] DenseCLIP: Language-Guided Dense Prediction with Context-Aware Prompting
Code for the paper STN-OCR: A single Neural Network for Text Detection and Text Recognition
Attention OCR Based On TensorFlow
Low rank adaptation for Vision Transformer
Geometry-Aware Learning of Maps for Camera Localization (CVPR 2018)
[NeurIPS 2022] Implementation of "AdaptFormer: Adapting Vision Transformers for Scalable Visual Recognition"