Predictive-Dynamic-Fusion

This is the official implementation for Predictive Dynamic Fusion (ICML 2024) by Bing Cao, Yinan Xia, Yi Ding, Changqing Zhang, and Qinghua Hu.

Abstract

Multimodal fusion is crucial in joint decision-making systems for rendering holistic judgments. Since multimodal data changes in open environments, dynamic fusion has emerged and achieved remarkable progress in numerous applications. However, most existing dynamic multimodal fusion methods lack theoretical guarantees and easily fall into suboptimal problems, yielding unreliability and instability. To address this issue, we propose a predictive dynamic fusion (PDF) framework for multimodal learning. We proceed to reveal the multimodal fusion from a generalization perspective and theoretically derive the predictable Collaborative Belief (Co-Belief) with Mono- and Holo-Confidence, which provably reduces the upper bound of generalization error. Accordingly, we further propose a relative regularization strategy to calibrate the predicted Co-Belief for potential uncertainty. Extensive experiments on multiple benchmarks confirm our superiority.
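In spirit, confidence-aware late fusion weights each modality's prediction by a normalized confidence score. The sketch below illustrates that generic idea only; it is not the paper's exact Co-Belief computation, and the softmax weighting is an assumption for illustration.

```python
import math

def confidence_weighted_fusion(logits_per_modality, confidences):
    """Combine per-modality logits, weighting each modality by its
    softmax-normalized confidence. Generic dynamic-fusion illustration,
    not the Co-Belief formulation from the paper."""
    exps = [math.exp(c) for c in confidences]
    total = sum(exps)
    weights = [e / total for e in exps]
    n_classes = len(logits_per_modality[0])
    return [
        sum(w * logits[k] for w, logits in zip(weights, logits_per_modality))
        for k in range(n_classes)
    ]
```

With equal confidences this reduces to plain averaging; a modality with much higher confidence dominates the fused prediction.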

Environment Installation

numpy==1.21.6
Pillow==9.4.0
pytorch_pretrained_bert==0.6.2
scikit_learn==1.0.2
torch==1.11.0+cu113
torchvision==0.12.0+cu113
tqdm==4.65.0
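One way to install the pinned versions above (a sketch: the `+cu113` wheels are served from the PyTorch CUDA 11.3 index, so torch/torchvision are installed separately; the index URL is the standard PyTorch one, not taken from this repo):

```shell
# torch/torchvision CUDA 11.3 builds come from the PyTorch wheel index
pip install torch==1.11.0+cu113 torchvision==0.12.0+cu113 \
    --extra-index-url https://download.pytorch.org/whl/cu113
# remaining pinned dependencies from PyPI
pip install numpy==1.21.6 Pillow==9.4.0 pytorch_pretrained_bert==0.6.2 \
    scikit_learn==1.0.2 tqdm==4.65.0
```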

Dataset preparation

Step 1: Download food101 and MVSA_Single and put them in the folder datasets.

Step 2: Prepare the train/dev/test split .jsonl files. We follow the QMF settings and provide them in the corresponding folders.

Step 3 (optional): If you want to use GloVe embeddings for the BoW model, download glove.840B.300d.txt and put it in the folder datasets/glove_embeds. For the BERT model, download bert-base-uncased and put it in the root folder bert-base-uncased/.

Train

bash ./shells/batch_train_latefusion_pdf.sh

Tip: early in training, the confidence predictor's output can be very small when the batch size is small, so taking its log may produce NaN. This can usually be resolved by reducing the learning rate or increasing the weight decay.
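A common numerical guard for the NaN issue described above is to clamp the predicted confidence away from zero before taking the log. A minimal stdlib sketch (the epsilon floor is illustrative, not a value from this repo; in PyTorch the equivalent would be `conf.clamp(min=eps).log()`):

```python
import math

def safe_log(confidence, eps=1e-8):
    """Floor the confidence at `eps` before taking the log, so a
    near-zero predictor output yields a large negative value
    instead of -inf/NaN."""
    return math.log(max(confidence, eps))
```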

Test

bash ./shells/batch_test_latefusion_pdf.sh

Citation

@article{cao2024predictive,
  title={Predictive Dynamic Fusion},
  author={Cao, Bing and Xia, Yinan and Ding, Yi and Zhang, Changqing and Hu, Qinghua},
  journal={arXiv preprint arXiv:2406.04802},
  year={2024}
}

Acknowledgement

The code is inspired by Provable Dynamic Fusion for Low-Quality Multimodal Data.
