
Python 4 1 Updated Mar 5, 2025

FlashMLA: Efficient MLA decoding kernels

C++ 11,118 769 Updated Mar 1, 2025

[CVPR2024] OneFormer3D: One Transformer for Unified Point Cloud Segmentation

Python 400 35 Updated Oct 23, 2024

LLaVA-Mini is a unified large multimodal model (LMM) that can support the understanding of images, high-resolution images, and videos in an efficient manner.

Python 396 16 Updated Jan 13, 2025

A systematic overview of the key topics in machine learning.

118 29 Updated Jan 19, 2019

Bird's Eye View Perception

525 29 Updated Mar 3, 2025

✨✨VITA-1.5: Towards GPT-4o Level Real-Time Vision and Speech Interaction

Python 2,132 164 Updated Feb 13, 2025

Dettoolchain: A new prompting paradigm to unleash detection ability of MLLM

Python 33 1 Updated Oct 12, 2024

The repository provides code for running inference with the Meta Segment Anything Model 2 (SAM 2), links for downloading the trained model checkpoints, and example notebooks that show how to use th…

Jupyter Notebook 14,362 1,481 Updated Dec 25, 2024

[ECCV 2024] The official repo for "Texture-GS: Disentangling the Geometry and Texture for 3D Gaussian Splatting Editing"

Python 158 5 Updated Nov 23, 2024

Personal Implementation of the paper: Nuvo: Neural UV Mapping for Unruly 3D Representations

Python 30 1 Updated Dec 12, 2024

Awesome OVD-OVS - A Survey on Open-Vocabulary Detection and Segmentation: Past, Present, and Future

149 7 Updated Mar 3, 2025

✨✨[CVPR 2025] Video-MME: The First-Ever Comprehensive Evaluation Benchmark of Multi-modal LLMs in Video Analysis

473 20 Updated Dec 14, 2024

[NeurIPS 2024 Best Paper][GPT beats diffusion🔥] [scaling laws in visual generation📈] Official impl. of "Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale Prediction". An *ult…

Jupyter Notebook 6,803 443 Updated Jan 12, 2025

This repo holds the official code and data for "Unveiling Parts Beyond Objects: Towards Finer-Granularity Referring Expression Segmentation", accepted by CVPR 2024.

64 Updated Jun 3, 2024

[ECCV 2024] Tokenize Anything via Prompting

Jupyter Notebook 565 25 Updated Dec 11, 2024
Python 104 2 Updated Jun 11, 2024

[CVPR 2024] Aligning and Prompting Everything All at Once for Universal Visual Perception

Python 552 41 Updated May 8, 2024

This repo contains the code for our paper "Towards Open-Ended Visual Recognition with Large Language Model"

Jupyter Notebook 93 9 Updated Jul 15, 2024

The repository for Hyperbolic Representation Learning for Computer Vision, ECCV 2022

Jupyter Notebook 62 5 Updated Oct 23, 2022

Curated list of awesome works on unsupervised object localization in 2D images.

70 2 Updated Aug 19, 2024

Set-of-Mark Prompting for GPT-4V and LMMs

Python 1,296 103 Updated Aug 19, 2024

PyTorch implementation of Diffusion Models (https://arxiv.org/pdf/2006.11239.pdf)

Python 1,263 287 Updated Sep 7, 2023
Python 8,579 505 Updated Oct 9, 2024

[ICLR'24 Spotlight] Uni3D: 3D Visual Representation from BAAI

Python 532 32 Updated Jan 17, 2024

[CVPR2024] The code of "UniPT: Universal Parallel Tuning for Transfer Learning with Efficient Parameter and Memory"

Python 67 1 Updated Oct 15, 2024

Official repo for our ICML 23 paper: "Multi-Modal Classifiers for Open-Vocabulary Object Detection"

Python 88 7 Updated Jun 22, 2023

CoRL 2024

Python 382 49 Updated Oct 29, 2024