Efficient Text-to-3D Generation via Semantic-enhanced Sparse-view Prompting with Hybrid Reconstruction

Python 4 1 Updated Sep 5, 2024

EfficientSAM: Leveraged Masked Image Pretraining for Efficient Segment Anything

Jupyter Notebook 2,244 152 Updated Dec 24, 2024

Consistency Conditioned Memory Augmented Dynamic Diagnosis Model for Medical Visual Question Answering

Python 13 1 Updated Jan 12, 2024

Adapter-Enhanced Hierarchical Cross-Modal Pre-training for Lightweight Medical Report Generation

Python 10 Updated Jan 20, 2025

Observation Driven Memory Synergistic Planning for Continuous Vision-Language Navigation

Python 9 1 Updated Jun 14, 2024

A consistent Med-VQA dataset, C-SLAKE, extended by Slake for further consistency assessment.

13 Updated Jan 12, 2024

Consistency Conditioned Memory Augmented Dynamic Diagnosis Model for Medical Visual Question Answering

Python 1 Updated Jan 12, 2024

Multigranularity Contrastive cross-modal collaborative Generation (MCG) model for Video QA

Python 11 2 Updated Dec 13, 2023

[ACL 2020] PyTorch code for TVQA+: Spatio-Temporal Grounding for Video Question Answering

Python 126 24 Updated Oct 25, 2022

[EMNLP 2018] PyTorch code for TVQA: Localized, Compositional Video Question Answering

Python 171 32 Updated Oct 25, 2022

[NeurIPS 2021] Dynamic Visual Reasoning by Learning Differentiable Physics Models from Video and Language

Python 45 7 Updated Apr 11, 2023

PyTorch implementation of ICLR 2020 paper "CLEVRER: CoLlision Events for Video REpresentation and Reasoning"

Python 114 26 Updated Nov 6, 2020

88 7 Updated Oct 19, 2022

Implementation of CVPR 2023 paper "Prompting Large Language Models with Answer Heuristics for Knowledge-based Visual Question Answering".

Python 271 27 Updated May 23, 2023

VaLM: Visually-augmented Language Modeling. ICLR 2023.

Python 56 3 Updated Mar 6, 2023

PyTorch code for BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation

Jupyter Notebook 4,975 662 Updated Aug 5, 2024

The code of IJCAI2022 paper, Declaration-based Prompt Tuning for Visual Question Answering

Python 19 2 Updated May 10, 2022

[CVPR 2022] X-Trans2Cap: Cross-Modal Knowledge Transfer using Transformer for 3D Dense Captioning

Python 33 3 Updated Aug 26, 2022

Recent Advances in Vision and Language PreTrained Models (VL-PTMs)

1,148 105 Updated Aug 19, 2022

Python 191 14 Updated Apr 13, 2023

CPL: Weakly Supervised Temporal Sentence Grounding with Gaussian-based Contrastive Proposal Learning

Python 59 5 Updated Apr 3, 2024

Deep Learning papers reading roadmap for anyone who is eager to learn this amazing tech!

Python 38,645 7,336 Updated Nov 27, 2022

The official pytorch implementation of our paper "Is Space-Time Attention All You Need for Video Understanding?"

Python 1,605 220 Updated Apr 9, 2024

[ICCV 2021 Oral + TPAMI] Just Ask: Learning to Answer Questions from Millions of Narrated Videos

Jupyter Notebook 118 15 Updated Sep 29, 2023

End-to-End Object Detection with Transformers

Python 13,902 2,500 Updated Mar 12, 2024

Python 648 69 Updated Mar 4, 2024

[CVPR 2021 Best Student Paper Honorable Mention, Oral] Official PyTorch code for ClipBERT, an efficient framework for end-to-end learning on image-text and video-text tasks.

Python 1 Updated Jul 27, 2021

Implementation for the paper "Hierarchical Conditional Relation Networks for Video Question Answering" (Le et al., CVPR 2020, Oral)

Python 1 Updated May 6, 2020

Implementation for the paper "Hierarchical Conditional Relation Networks for Video Question Answering" (Le et al., CVPR 2020, Oral)

Python 132 26 Updated Jul 25, 2024