Stars

cv

46 repositories
Python 111 3 Updated Feb 19, 2024

DPT: Deformable Patch-based Transformer for Visual Recognition (ACM MM2021)

Python 150 20 Updated Aug 18, 2021

This is the official repository of Action Progression Networks for Temporal Action Localization in Videos

Python 2 Updated Jan 5, 2024
Python 44 4 Updated May 18, 2024
Python 32 7 Updated Jan 27, 2024

Count the MACs / FLOPs of your PyTorch model.

Python 4,910 528 Updated Jul 8, 2024

We want to create a repo to illustrate the usage of transformers in Chinese

Shell 2,431 415 Updated Aug 18, 2024

OVTrack: Open-Vocabulary Multiple Object Tracking [CVPR 2023]

Jupyter Notebook 93 10 Updated Oct 14, 2024

Code for our paper "Eventful Transformers: Leveraging Temporal Redundancy in Vision Transformers"

Python 35 2 Updated Oct 4, 2023

Rich-Text-to-Image Generation

Python 769 65 Updated Oct 9, 2023

Official PyTorch Implementation of "Scalable Diffusion Models with Transformers"

Python 6,545 586 Updated May 31, 2024

Open-vocabulary Object Segmentation with Diffusion Models

Jupyter Notebook 173 8 Updated Aug 15, 2023

Let us control diffusion models!

Python 30,937 2,776 Updated Feb 25, 2024

[CVPR2023] Code Release of Aligning Bag of Regions for Open-Vocabulary Object Detection

Python 175 5 Updated Oct 25, 2023

A DETR-style framework for open-vocabulary detection (OVD). CVPR 2023

Python 179 16 Updated Apr 16, 2023

(ICCV 2023) MasQCLIP for Open-Vocabulary Universal Image Segmentation

Python 35 2 Updated Oct 18, 2023

[CVPRW'23] "A unified model for continuous conditional video prediction". Xi Ye, Guillaume-Alexandre Bilodeau.

Jupyter Notebook 13 2 Updated Apr 15, 2024

Learning Hierarchical Prompt with Structured Linguistic Knowledge for Vision-Language Models (AAAI 2024)

Python 67 4 Updated Jan 27, 2024

PyTorch implementation of SinMPI (SIGGRAPH Asia 2023)

Python 52 4 Updated Aug 23, 2024
Python 68 6 Updated Jan 9, 2024

[ICCV2023] Segment Every Reference Object in Spatial and Temporal Spaces

Python 235 15 Updated Jan 10, 2024

Implementation of Parti, Google's pure attention-based text-to-image neural network, in Pytorch

Python 524 24 Updated Dec 8, 2023

Latte: Latent Diffusion Transformer for Video Generation.

Python 1,735 180 Updated Sep 28, 2024

[ECCV 2024] The official code of paper "Open-Vocabulary SAM".

Python 967 30 Updated Jul 31, 2024

[ICML 2024] Vision Mamba: Efficient Visual Representation Learning with Bidirectional State Space Model

Python 3,073 208 Updated Nov 22, 2024

The repository for paper VPTR: Efficient Transformers for Video Prediction

Python 92 20 Updated Apr 10, 2024

[AAAI'24] "STDiff: Spatio-temporal Diffusion for Continuous Stochastic Video Prediction". Xi Ye, Guillaume-Alexandre Bilodeau

Python 16 3 Updated Apr 14, 2024

CLIP (Contrastive Language-Image Pretraining), Predict the most relevant text snippet given an image

Jupyter Notebook 26,501 3,363 Updated Jul 23, 2024

A Collection of Papers and Codes for CVPR2024/ECCV2024 AIGC

458 12 Updated Nov 13, 2024

Vision-RWKV: Efficient and Scalable Visual Perception with RWKV-Like Architectures

Python 383 15 Updated Oct 31, 2024