[TPAMI reviewing] Towards Visual Grounding: A Survey

Shell 111 12 Updated Feb 13, 2025

[NeurIPS 2024 Best Paper][GPT beats diffusion🔥] [scaling laws in visual generation📈] Official impl. of "Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale Prediction". An *ult…

Jupyter Notebook 6,905 444 Updated Jan 12, 2025

A PyTorch implementation of the paper "Revisiting Non-Autoregressive Transformers for Efficient Image Synthesis"

Python 42 2 Updated Jun 13, 2024

Official Jax Implementation of MaskGIT

Jupyter Notebook 490 50 Updated Nov 18, 2022

Taming Transformers for High-Resolution Image Synthesis

Jupyter Notebook 6,055 1,180 Updated Jul 30, 2024

[ECCV 2024] AdaNAT: Exploring Adaptive Policy for Token-Based Image Generation

Python 33 1 Updated Sep 12, 2024
Python 30 1 Updated Jan 2, 2025

[IROS2024] Camera-Radar Fusion for BEV Map and Object Segmentation

Python 73 8 Updated Mar 6, 2025

[CVPR 2024] Tune-An-Ellipse: CLIP Has Potential to Find What You Want

Python 10 1 Updated Jan 5, 2025

Referring Expression Datasets API

Jupyter Notebook 496 79 Updated Aug 27, 2024

Flickr30K Entities Dataset

MATLAB 168 26 Updated Dec 23, 2018

[CVPR 2021] Official PyTorch implementation for Transformer Interpretability Beyond Attention Visualization, a novel method to visualize classifications by Transformer based networks.

Jupyter Notebook 1,849 247 Updated Jan 24, 2024

[ICCV 2021 - Oral] Official PyTorch implementation for Generic Attention-model Explainability for Interpreting Bi-Modal and Encoder-Decoder Transformers, a novel method to visualize any Transformer-…

Jupyter Notebook 835 109 Updated Aug 24, 2023

Code for LERF: Language Embedded Radiance Fields

Python 679 69 Updated Jul 9, 2024

PyTorch code and models for the DINOv2 self-supervised learning method.

Jupyter Notebook 9,974 898 Updated Aug 7, 2024

Official implementation of OV-DINO: Unified Open-Vocabulary Detection with Language-Aware Selective Fusion

Python 296 19 Updated Mar 12, 2025

[CVPR 2024] Code for "Improved Visual Grounding through Self-Consistent Explanations".

Python 24 1 Updated Mar 1, 2024

[CVPR 2023] Code for "Improving Visual Grounding by Encouraging Consistent Gradient-based Explanations"

Jupyter Notebook 19 2 Updated Oct 10, 2023

[CVPR 2023] DepGraph: Towards Any Structural Pruning

Python 2,903 344 Updated Mar 5, 2025

Code for ALBEF: a new vision-language pre-training method

Python 1,616 203 Updated Sep 20, 2022

The repository provides code for running inference with the Meta Segment Anything Model 2 (SAM 2), links for downloading the trained model checkpoints, and example notebooks that show how to use th…

Jupyter Notebook 14,465 1,501 Updated Dec 25, 2024

Finetuning DINOv2 (https://github.com/facebookresearch/dinov2) on your own dataset

Python 55 3 Updated Jun 8, 2023

Implementation of CoCa, Contrastive Captioners are Image-Text Foundation Models, in PyTorch

Python 1,109 89 Updated Dec 12, 2023

LAVIS - A One-stop Library for Language-Vision Intelligence

Jupyter Notebook 10,334 1,005 Updated Nov 18, 2024

(ITSC 2021) Optimising the selection of samples for robust lidar camera calibration. This package estimates the calibration parameters from camera to lidar frame.

C++ 479 109 Updated Oct 4, 2024

Cambrian-1 is a family of multimodal LLMs with a vision-centric design.

Python 1,867 128 Updated Oct 30, 2024

4M: Massively Multimodal Masked Modeling

Python 1,692 102 Updated Mar 7, 2025

The largest collection of PyTorch image encoders / backbones. Including train, eval, inference, export scripts, and pretrained weights -- ResNet, ResNeXT, EfficientNet, NFNet, Vision Transformer (V…

Python 33,454 4,865 Updated Feb 23, 2025

Object Detection toolkit based on PaddlePaddle. It supports object detection, instance segmentation, multiple object tracking and real-time multi-person keypoint detection.

Python 13,168 2,921 Updated Mar 12, 2025

OpenMMLab FewShot Learning Toolbox and Benchmark

Python 719 120 Updated Sep 5, 2023