
💬 Receipts_VQA 📖

Introduction

In this project, we use the LLaVA LLM together with RAG (retrieval-augmented generation) to build a VQA chatbot.

Results are available on Jira.

Overview of the app

Installation and Requirements

  1. Clone the project:
git clone ...
cd Receipts_VQA/
  2. Create a conda environment:
conda create --name <ENV_NAME> python=3.11 -y
conda activate <ENV_NAME>
  3. Install PyTorch (pick the command for your platform and CUDA version from https://pytorch.org).

  4. Run this command to install the dependencies in the requirements.txt file:

pip install -r requirements.txt
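
Before launching the app, you can optionally verify the PyTorch install with a quick check like the one below (a small sketch, not part of the repository); it confirms that torch is importable and reports whether a CUDA GPU is visible:

```python
# Optional sanity check for the PyTorch install.
import torch

print(torch.__version__)          # installed PyTorch version
print(torch.cuda.is_available())  # True if a CUDA GPU can be used
```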

Run Project

  1. Run the Streamlit server:
streamlit run app.py
  2. Access the application in your browser at http://localhost:8501.

  3. Start chatting with the assistant!

How it works

The app works as follows:

  1. The user uploads an image in the image upload field.

  2. The user enters a question about the uploaded image.

  3. The user's messages are sent to OCR and the LLaVA model for processing.

  4. The user's input, along with the chat history, is used to generate a response.

  5. The LLaVA model generates a response based on the patterns it learned during training.
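
The sketch below illustrates this flow in Python. It is a minimal, hypothetical example rather than the repository's app.py: it assumes pytesseract for OCR and the llava-hf/llava-1.5-7b-hf checkpoint loaded through Hugging Face transformers, and the `answer` helper is an illustrative name.

```python
# Minimal sketch of the OCR + LLaVA flow described above (assumed libraries and checkpoint).
import pytesseract
import torch
from PIL import Image
from transformers import AutoProcessor, LlavaForConditionalGeneration

MODEL_ID = "llava-hf/llava-1.5-7b-hf"  # assumed LLaVA checkpoint
processor = AutoProcessor.from_pretrained(MODEL_ID)
model = LlavaForConditionalGeneration.from_pretrained(
    MODEL_ID, torch_dtype=torch.float16, device_map="auto"
)

def answer(image_path: str, question: str, history: list[str]) -> str:
    image = Image.open(image_path).convert("RGB")

    # 1) OCR the receipt so the extracted text can be added to the prompt.
    ocr_text = pytesseract.image_to_string(image)

    # 2) Build a LLaVA-1.5 style prompt from the chat history, OCR text, and question.
    context = "\n".join(history)
    prompt = (
        "USER: <image>\n"
        f"OCR text from the receipt:\n{ocr_text}\n"
        f"Previous conversation:\n{context}\n"
        f"Question: {question} ASSISTANT:"
    )

    # 3) Let LLaVA generate the response.
    inputs = processor(text=prompt, images=image, return_tensors="pt")
    inputs = inputs.to(model.device, torch.float16)
    output = model.generate(**inputs, max_new_tokens=256)
    reply = processor.decode(output[0], skip_special_tokens=True)
    return reply.split("ASSISTANT:")[-1].strip()
```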
