Stars
[ECCV 2024] The official PyTorch implementation of the "Part2Object: Hierarchical Unsupervised 3D Instance Segmentation".
Official implementation of paper VideoLLM Knows When to Speak: Enhancing Time-Sensitive Video Comprehension with Video-Text Duet Interaction Format
Layer-Condensed KV cache w/ 10 times larger batch size, fewer params and less computation. Dramatic speed up with better task performance. Accepted to ACL 2024.
VideoLLM-online: Online Video Large Language Model for Streaming Video (CVPR 2024)
Retrieval and Retrieval-augmented LLMs
Awesome Online Action Detection
[ECCV 2024] The official PyTorch implementation of the "Plain-Det: A Plain Multi-Dataset Object Detector".
Vector (and Scalar) Quantization, in PyTorch
[CVPR2024] OneFormer3D: One Transformer for Unified Point Cloud Segmentation
OpenMMLab's next-generation platform for general 3D object detection.
MambaOut: Do We Really Need Mamba for Vision?
Official implementation of "Spherical Mask: Coarse-to-Fine 3D Point Cloud Instance Segmentation with Spherical Representation"
Official code for "FeatUp: A Model-Agnostic Framework for Features at Any Resolution" ICLR 2024
K-Quant: A Platform of Temporal Financial Knowledge-enhanced Quantitative Investment
A small spider that helps you fetch your own paper list from https://arxiv.org/ every day.
Mask3D predicts accurate 3D semantic instances achieving state-of-the-art on ScanNet, ScanNet200, S3DIS and STPLS3D.
Pointcept: a codebase for point cloud perception research. Latest works: PTv3 (CVPR'24 Oral), PPT (CVPR'24), OA-CNNs (CVPR'24), MSC (CVPR'23)