jingzhengli / Tag2Text Public

forked from xinyu1205/recognize-anything

Notifications You must be signed in to change notification settings
Fork 0
Star 0

Code for paper: Tag2Text: Guiding Vision-Language Model via Image Tagging

tag2text.github.io

MIT license

0 stars 291 forks Branches Tags Activity

Star

Notifications

Branches Tags

Name		Name	Last commit message	Last commit date
Latest commit History 6 Commits
README.md		README.md
framework.png		framework.png

Repository files navigation

Tag2Text: Guiding Vision-Language Model via Image Tagging

Official PyTorch Implementation of the Tag2Text paper. We will fully open ource the code and data in the future.

Welcome to try out Tag2Text Web demo! It is integrated into Huggingface Spaces 🤗 using Gradio.

Abstract

Tag2Text achieves a superior image tag recognition ability by exploiting fine-grained text information. By leveraging tagging guidance, Tag2Text effectively enhances the performance of vision-language models on both generation-based and alignment-based tasks.