This repository holds the implementation of a chatbot that lets the user enter a message and returns the best matching image from the given image dataset:
Dataset: jmhessel/newyorker_caption_contest (Hugging Face Datasets)
All code is implemented in Python and tested on a single NVIDIA RTX A5000 machine.
-
Note that a pretrained vision-language model is adopted for the text-to-image retrieval task. Here, we use BLIP [blog]. This repository is built upon the official PyTorch implementation of BLIP [https://github.com/salesforce/BLIP/tree/main]. After forking the BLIP repository, the Python script 'chatbot.py' implementing the ChatBot program was added. The code has been tested on PyTorch 1.10.
-
Once again, note that only the 'chatbot.py' script in this folder is newly implemented.
To run the model, please refer to the Requirements.
To install dependencies for running the application, please execute the following line.
pip install -r requirements.txt
A direct link to the 'newyorker_caption_contest' dataset is here. The dataset is composed of 'train', 'test', and 'validation' sets, containing 2,340, 131, and 130 images, respectively. Note that only the 'train' set is used for parsing images.
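For reference, the 'train' split can be loaded with the Hugging Face datasets library roughly as follows. This is only a minimal sketch; the config name ('explanation') and the 'image' column are assumptions and may need to be adapted to how 'chatbot.py' actually reads the data.

```python
# Minimal sketch of loading the 'train' split with Hugging Face datasets.
# The config name ("explanation") and the "image" column are assumptions.
from datasets import load_dataset

dataset = load_dataset("jmhessel/newyorker_caption_contest", "explanation", split="train")
print(len(dataset))               # expected: 2,340 examples
pil_image = dataset[0]["image"]   # a PIL.Image, assuming an 'image' column
```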
We use the pretrained version of BLIP with a ViT-B backbone, finetuned on the COCO dataset. The checkpoint can be found here.
The entire implementation of the BLIP model is borrowed from the official PyTorch implementation of BLIP. Using BLIP as a feature extractor for both images and text, the ChatBot performs the text-to-image retrieval task, as sketched below.
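As an illustration of this feature-extraction step, here is a minimal sketch using the blip_itm model from the official BLIP repository. The checkpoint path, image size, and tokenizer settings are assumptions and are not necessarily identical to what 'chatbot.py' uses.

```python
# Minimal sketch of BLIP feature extraction (assumes the official repo layout,
# i.e. models/blip_itm.py is importable; checkpoint path and image size are assumptions).
import torch
import torch.nn.functional as F
from PIL import Image
from torchvision import transforms
from models.blip_itm import blip_itm

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = blip_itm(pretrained="checkpoints/model_base_retrieval_coco.pth",
                 image_size=384, vit="base").to(device).eval()

preprocess = transforms.Compose([
    transforms.Resize((384, 384)),
    transforms.ToTensor(),
    transforms.Normalize((0.48145466, 0.4578275, 0.40821073),
                         (0.26862954, 0.26130258, 0.27577711)),
])

@torch.no_grad()
def image_feature(pil_image: Image.Image) -> torch.Tensor:
    """Embed one image into BLIP's shared image-text space (L2-normalized)."""
    image = preprocess(pil_image.convert("RGB")).unsqueeze(0).to(device)
    embeds = model.visual_encoder(image)                              # patch embeddings
    return F.normalize(model.vision_proj(embeds[:, 0, :]), dim=-1)    # [CLS] token -> shared space

@torch.no_grad()
def text_feature(prompt: str) -> torch.Tensor:
    """Embed one text prompt into the same shared space (L2-normalized)."""
    text = model.tokenizer(prompt, padding="max_length", truncation=True,
                           max_length=35, return_tensors="pt").to(device)
    out = model.text_encoder(text.input_ids, attention_mask=text.attention_mask,
                             return_dict=True, mode="text")
    return F.normalize(model.text_proj(out.last_hidden_state[:, 0, :]), dim=-1)
```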
- Note that running BLIP to extract image features requires about 3,500 MiB of GPU memory, and extracting features from each image takes approximately 2 seconds. Considering the cost of the feature extraction process, I prepared preprocessed data in advance, composed of the feature vectors extracted from the whole 'train' set (2,340 images). The preprocessed data is in the folder, under the name 'newyorker_caption_contest.pt'.
However, the user can always decide whether or not to use this preprocessed data. If the user chooses not to, the feature extraction process will run when the program starts (see the sketch below).
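For completeness, precomputing and reloading the feature file could look roughly like this, reusing image_feature and dataset from the sketches above; the key stored inside 'newyorker_caption_contest.pt' is an assumption.

```python
# Sketch: precompute image features for the whole 'train' set and cache them.
# The dictionary key is an assumption about the layout of 'newyorker_caption_contest.pt'.
import torch

image_feats = torch.cat([image_feature(ex["image"]) for ex in dataset])  # (2340, embed_dim)
torch.save({"image_feats": image_feats.cpu()}, "newyorker_caption_contest.pt")

# Later runs can skip extraction and load the cached features instead.
image_feats = torch.load("newyorker_caption_contest.pt", map_location=device)["image_feats"]
```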
To run the application, please run:
python chatbot.py
-
Upon running, the program will ask whether to use the preprocessed dataset or not. If answered 'no', the program will extract features from the dataset itself; if answered 'yes', it will skip the feature extraction step and load the preprocessed features. In either case, the program then asks the user for a prompt.
-
The user will be asked to input a message. Upon receiving a message, the program will promptly find the best matching image in the dataset and display it. At the same time, the retrieved image will be saved to the './chatbot_results' folder, named after the user input.
-
After returning the image, the program will ask the user whether to continue or not. If answered 'yes', the user is asked for a new message; if answered 'no', the program ends. This interaction loop is sketched below.
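Putting the pieces together, the interaction described above can be sketched as follows, reusing text_feature, image_feats, and dataset from the earlier sketches; the prompt wording and the filename scheme are assumptions.

```python
# Sketch of the interactive loop: prompt -> retrieve best image -> save -> continue?
import os

os.makedirs("./chatbot_results", exist_ok=True)

while True:
    message = input("Enter a message: ").strip()
    # Cosine similarity between the text feature and all precomputed image features.
    scores = (text_feature(message).cpu() @ image_feats.cpu().T).squeeze(0)
    best = scores.argmax().item()
    image = dataset[best]["image"]
    image.show()
    image.save(os.path.join("./chatbot_results", f"{message}.png"))
    if input("Continue? (yes/no): ").strip().lower() != "yes":
        break
```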
The running screen would look like this:
Some examples of images retrieved from user prompts (the prompt is written at the top of each image):