Python 21 1 Updated Dec 11, 2024

A Versatile Video-LLM for Long and Short Video Understanding with Superior Temporal Localization Ability

68 Updated Nov 28, 2024

[ECCV2024] Grounded Multimodal Large Language Model with Localized Visual Tokenization

Python 579 61 Updated Jun 7, 2024

[ECCV2024] This is an official implementation for "PSALM: Pixelwise SegmentAtion with Large Multi-Modal Model"

Python 201 10 Updated Nov 19, 2024

UniMD: Towards Unifying Moment retrieval and temporal action Detection

Python 39 1 Updated Jul 5, 2024

Implementation of "VL-Mamba: Exploring State Space Models for Multimodal Learning"

81 Updated Mar 21, 2024
Python 54 4 Updated Aug 12, 2024

CatVTON is a simple and efficient virtual try-on diffusion model with 1) Lightweight Network (899.06M parameters totally), 2) Parameter-Efficient Training (49.57M parameters trainable) and 3) Simpl…

Python 1,007 120 Updated Nov 26, 2024

[ECCV 2024] Tokenize Anything via Prompting

Jupyter Notebook 543 21 Updated Dec 11, 2024

LAVIS - A One-stop Library for Language-Vision Intelligence

Jupyter Notebook 10,055 978 Updated Nov 18, 2024

[CVPR2023] Code Release of Aligning Bag of Regions for Open-Vocabulary Object Detection

Python 175 5 Updated Oct 25, 2023

Ethereal Style for Zotero

JavaScript 3,839 122 Updated Dec 12, 2024

Official PyTorch implementation of "Multi-modal Queried Object Detection in the Wild" (accepted by NeurIPS 2023)

Python 273 13 Updated Feb 23, 2024

Codes and Models for VALOR: Vision-Audio-Language Omni-Perception Pretraining Model and Dataset

Python 265 15 Updated May 28, 2024
Python 982 128 Updated Oct 3, 2022

Paragraph-by-paragraph close reading of classic and new deep learning papers

27,491 2,468 Updated Nov 17, 2024

There can be more than Notion and Miro. AFFiNE (pronounced [ə‘fain]) is a next-gen knowledge base that brings planning, sorting and creating all together. Privacy first, open-source, customizable an…

TypeScript 43,329 2,830 Updated Dec 15, 2024

Official implementation of TMANet.

Python 123 24 Updated Sep 20, 2022

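V2rayU, a Mac client based on the v2ray core for bypassing network restrictions, written in Swift. It supports the trojan, vmess, shadowsocks, and socks5 protocols, as well as subscriptions, import via QR code or clipboard, manual configuration, and QR code sharing.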

19,010 2,899 Updated Oct 24, 2024

PromptDet: Towards Open-vocabulary Detection using Uncurated Images, ECCV2022

Python 161 7 Updated Sep 18, 2022

Scenic: A Jax Library for Computer Vision Research and Beyond

Python 3,360 444 Updated Dec 5, 2024

Grounded Language-Image Pre-training

Python 2,264 196 Updated Jan 24, 2024

PyTorch implementation of "Supervised Contrastive Learning" (and SimCLR incidentally)

Python 3,147 536 Updated Dec 26, 2023

2021-2022 International Conferences in Artificial Intelligence, Machine Learning, Computer Vision, Data Mining, Natural Language Processing and Robotics

HTML 861 125 Updated Oct 7, 2021

Official PyTorch implementation of GroupViT: Semantic Segmentation Emerges from Text Supervision, CVPR 2022.

Python 741 53 Updated May 10, 2022

PyTorch implementation of MAE https://arxiv.org/abs/2111.06377

Python 7,429 1,229 Updated Jul 23, 2024

flownet2-pytorch-module

Python 8 1 Updated Jun 8, 2021

Download papers and supplemental materials from open-access paper websites, such as AAAI, AAMAS, AISTATS, COLT, CORL, CVPR, ECCV, ICCV, ICLR, ICML, IJCAI, JMLR, NIPS, RSS, WACV.

Python 239 32 Updated Dec 3, 2024

This is an official implementation for "Swin Transformer: Hierarchical Vision Transformer using Shifted Windows".

Python 14,064 2,068 Updated Jul 24, 2024