Stars
Official PyTorch implementation of the IEEE TETCI 2024 paper LoCATe-GAT
Open-Sora: Democratizing Efficient Video Production for All
[EMNLP 2024] Official code for "Enhancing Temporal Modeling of Video LLMs via Time Gating"
A paper list of some recent Transformer-based CV works.
[NeurIPS 2024 D&B Track] Official Repo for "LVD-2M: A Long-take Video Dataset with Temporally Dense Captions"
[ICCV'23] Official repository of paper SwiftFormer: Efficient Additive Attention for Transformer-based Real-time Mobile Vision Applications
UnetTSF: A Better Performance Linear Complexity Time Series Prediction Model
PyTorch implementation of our PRCV 2024 paper "Adapting Vision-Language Models to Open Classes via Test-Time Prompt Tuning"
[AAAI 2024 Oral] M2CLIP: A Multimodal, Multi-Task Adapting Framework for Video Action Recognition
PyTorch implementation of Diffusion Models (https://arxiv.org/pdf/2006.11239.pdf)
High-Resolution Image Synthesis with Latent Diffusion Models
The AI Scientist: Towards Fully Automated Open-Ended Scientific Discovery 🧑‍🔬
Official PyTorch Implementation for Testing of TransZero++ (TPAMI'22)
[ACM MM2023] PyTorch implementation for paper "Zero-Shot Learning by Harnessing Adversarial Samples"
The first work on zero-shot underwater gesture recognition
Learning Hierarchical Prompt with Structured Linguistic Knowledge for Vision-Language Models (AAAI 2024)
Download DeepMind's Kinetics dataset.
Code for our IJCV 2023 paper "CLIP-guided Prototype Modulating for Few-shot Action Recognition".
Foundation Models for Video Understanding: A Survey
Simple code for plotting figures and colorbars and for cropping images with Python
Tips for Writing a Research Paper using LaTeX
Myst template for submission to WACV2024