Stars
Scripts for batch rendering models using Blender. Tested with models from Stanford's ShapeNet library.
Repo for the book Discover three.js!
(Arxiv 2023) Optimized View and Geometry Distillation from Multi-view Diffuser
Downstream-Dino-V2: A GitHub repository featuring an easy-to-use implementation of the DINOv2 model by Facebook for downstream tasks such as Classification, Semantic Segmentation and Monocular dept…
Refine high-quality datasets and visual AI models
Two papers about robot navigation in dynamic environments
[AAAI 2023] Official source code for the paper "Cluster-guided Contrastive Graph Clustering Network".
Transferable Decoding with Visual Entities for Zero-Shot Image Captioning, ICCV 2023
The code for the paper "Instance-aware Dynamic Prompt Tuning for Pre-trained Point Cloud Models" (ICCV'23).
The implementation of our ACM MM 2023 paper "AdvCLIP: Downstream-agnostic Adversarial Examples in Multimodal Contrastive Learning"
The implementation of our ICCV 2023 paper "Downstream-agnostic Adversarial Examples"
[ACM MM 2023] Official source code for the paper "CONVERT: Contrastive Graph Clustering with Reliable Augmentation".
[ACM MM 2023] Official source code for the paper "DealMVC: Dual Contrastive Calibration for Multi-view Clustering".
(ICCV 2023) NeMF: Inverse Volume Rendering with Neural Microflake Field
EMNLP22: Multi-View Reasoning: Consistent Contrastive Learning for Math Word Problem
Data-Copilot: Bridging Billions of Data and Humans with Autonomous Workflow
[ICCV 2023 Oral]: Scaling Data Generation in Vision-and-Language Navigation
The PyTorch implementation of Grounding 3D Object Affordance from 2D Interactions in Images.
[CVPR2023] Self-supervised Implicit Glyph Attention for Text Recognition
[ICCV2023] Self-supervised Character-to-Character Distillation for Text Recognition
SkeletonMAE: Graph-based Masked Autoencoder for Skeleton Sequence Pre-training
HCPLab-SYSU / CMCIR
Forked from YangLiu9208/CMCIR
[IEEE T-PAMI 2023] Cross-Modal Causal Relational Reasoning for Event-Level Visual Question Answering
CausalVLR: A Toolbox and Benchmark for Visual-Linguistic Causal Reasoning (an open-source framework for visual-linguistic causal reasoning)
[ICCV2023] XNet: Wavelet-Based Low and High Frequency Merging Networks for Semi- and Supervised Semantic Segmentation of Biomedical Images
[BMVC2023] Spatial and Planar Consistency for Semi-Supervised Volumetric Medical Image Segmentation
[ICCV 2023] VPD is a framework that applies the high-level and low-level knowledge of a pre-trained text-to-image diffusion model to downstream visual perception tasks.