Code for paper: Tag2Text: Guiding Vision-Language Model via Image Tagging

jingzhengli/Tag2Text

Tag2Text: Guiding Vision-Language Model via Image Tagging

Official PyTorch implementation of the Tag2Text paper. The code and data will be fully open-sourced in the future.

You are welcome to try the Tag2Text web demo 🤗, which includes both tagging and captioning.

Abstract

Tag2Text achieves superior image tag recognition by exploiting fine-grained text information. With tagging as guidance, Tag2Text improves the performance of vision-language models on both generation-based and alignment-based tasks.

Citation

If you find our work useful for your research, please consider citing it.

@article{huang2023tag2text,
  title={Tag2Text: Guiding Vision-Language Model via Image Tagging},
  author={Huang, Xinyu and Zhang, Youcai and Ma, Jinyu and Tian, Weiwei and Feng, Rui and Zhang, Yuejie and Li, Yaqian and Guo, Yandong and Zhang, Lei},
  journal={arXiv preprint arXiv:2303.05657},
  year={2023}
}
