This is the source code of ConVSE, from the paper "Regularizing Visual Semantic Embedding with Contrastive Learning for Image-Text Matching", which has been accepted by IEEE Signal Processing Letters (SPL). It is built on top of VSE$\infty$ in PyTorch.
We recommend the following dependencies:
- Python 3.6+
- PyTorch 1.9.0+
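A quick way to confirm that an environment meets these minimums is sketched below. The helper names `version_tuple`, `meets_minimum`, and `check_environment` are illustrative and not part of this repository; the sketch assumes version strings begin with dot-separated integers, as in `1.9.0+cu111`.

```python
import re
import sys


def version_tuple(version):
    """Parse the leading dot-separated integers, e.g. '1.9.0+cu111' -> (1, 9, 0)."""
    match = re.match(r"\d+(?:\.\d+)*", version)
    if match is None:
        raise ValueError("unrecognized version string: {!r}".format(version))
    return tuple(int(part) for part in match.group(0).split("."))


def meets_minimum(installed, minimum):
    """Return True if `installed` is at least `minimum`."""
    a, b = version_tuple(installed), version_tuple(minimum)
    # Pad the shorter tuple with zeros so that '1.9' compares equal to '1.9.0'.
    width = max(len(a), len(b))
    a = a + (0,) * (width - len(a))
    b = b + (0,) * (width - len(b))
    return a >= b


def check_environment():
    """Return a list of problems; an empty list means every check passed."""
    problems = []
    if sys.version_info < (3, 6):
        problems.append("Python 3.6+ is required")
    try:
        import torch  # imported lazily so the helpers above work without PyTorch
    except ImportError:
        problems.append("PyTorch is not installed")
    else:
        if not meets_minimum(torch.__version__, "1.9.0"):
            problems.append(
                "PyTorch 1.9.0+ is required, found {}".format(torch.__version__)
            )
    return problems


if __name__ == "__main__":
    for problem in check_environment():
        print(problem)
```

Running the script prints one line per unmet requirement and nothing when the environment looks fine.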
Download the dataset files. We use the image features created by SCAN, available at https://github.com/kuanghuei/SCAN.
Run train.py:
python train.py --data_path "$DATA_PATH" --data_name "$DATA_NAME" --vocab_path "$VOCAB_PATH" --model_name "runs/convse/model/" --use_contrastive
To evaluate a trained model, run the following in Python:
from vocab import Vocabulary
import evaluation
evaluation.evalrank("$PATH/model_best.pth.tar", data_path="$DATA_PATH", split="test")
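In the SCAN and VSE$\infty$ code bases that this project builds on, `evalrank` reports Recall@K for image-to-text and text-to-image retrieval; that this repository reports the same metric is an assumption here. As an illustration of the metric only, not of the repository's implementation, the sketch below computes Recall@K from a similarity matrix under the simplifying assumption that query `i` has exactly one correct match, candidate `i`. The function name `recall_at_k` is hypothetical.

```python
def recall_at_k(sims, k):
    """Fraction of queries whose ground-truth candidate ranks in the top k.

    sims[i][j] is the similarity between query i and candidate j;
    candidate i is assumed to be the only correct match for query i.
    """
    hits = 0
    for i, row in enumerate(sims):
        # Zero-based rank: number of candidates scored strictly higher
        # than the ground-truth candidate.
        rank = sum(1 for score in row if score > row[i])
        if rank < k:
            hits += 1
    return hits / len(sims)


# Toy similarity matrix for three queries and three candidates.
sims = [
    [0.9, 0.1, 0.3],  # query 0: correct candidate ranked 1st
    [0.8, 0.5, 0.6],  # query 1: correct candidate ranked 3rd
    [0.2, 0.7, 0.4],  # query 2: correct candidate ranked 2nd
]
print(recall_at_k(sims, 1))
print(recall_at_k(sims, 2))
```

In this toy example one of three queries is matched at rank 1 and two of three within the top 2. Datasets with several captions per image need a ranking rule that accounts for multiple correct matches, which this sketch leaves out.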
If you find this code useful, please cite the following paper:
@article{liu2022regularizing,
title={Regularizing Visual Semantic Embedding with Contrastive Learning for Image-Text Matching},
author={Liu, Yang and Liu, Hong and Wang, Huaqiu and Liu, Mengyuan},
journal={IEEE Signal Processing Letters},
year={2022},
publisher={IEEE}
}