🤖BotPDF

A simple Large Language Model (LLM) chatbot project: users upload PDF files and receive responses generated directly from the document contents. It is built with open-source tools and runs entirely in a local environment, since many people do not have access to OpenAI API keys or other paid options. Running locally also removes the reliance on cloud services and aims for a simple setup that should just work.

What I learned

How to give an LLM forms of input other than text prompts
How to customize the LLM's context via Retrieval Augmented Generation (RAG) without having to train your own LLM
How RAG works by utilizing vector embeddings, which represent the semantics of text in numerical form
How to build a RAG pipeline using open-source tools and technologies
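
To make the retrieval idea above concrete, here is a minimal, dependency-free sketch. It is not the code used in this project: the `embed` function is a toy bag-of-words counter standing in for a real embedding model, and the names `retrieve` and `build_prompt` as well as the sample chunks are made up for illustration. It only shows the general shape of RAG: embed the question, rank the stored chunks by similarity, and put the best match into the prompt.

```python
import math
from collections import Counter


def embed(text):
    """Toy embedding (assumption): word counts instead of a learned vector."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[word] * b[word] for word in a if word in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(question, chunks, top_k=1):
    """Return the top_k chunks most similar to the question."""
    query_vector = embed(question)
    ranked = sorted(chunks, key=lambda c: cosine(query_vector, embed(c)), reverse=True)
    return ranked[:top_k]


def build_prompt(question, chunks, top_k=1):
    """Place the retrieved chunks in front of the question as context."""
    context = "\n".join(retrieve(question, chunks, top_k))
    return f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"


# Hypothetical document chunks, as if extracted from an uploaded PDF.
chunks = [
    "The warranty covers parts and labor for two years.",
    "Returns are accepted within thirty days of purchase.",
    "Shipping is free for orders over fifty dollars.",
]

if __name__ == "__main__":
    print(build_prompt("How long does the warranty last?", chunks))
```

In a real pipeline the word counts would be replaced by vectors from an embedding model, and the sorted list by a lookup in a vector database such as chromadb.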

Tech Stack

streamlit - frontend
ollama - run open-source LLMs locally
- llama2 - LLM
llama_index - RAG framework
chromadb - vector database

Setup

After cloning the repository, follow the steps below to install the dependencies and run the app:

Run pip install -r requirements.txt
Download and install ollama
Run ollama pull llama2
Run ollama serve and leave it running
Run streamlit run frontend_chatbot.py in a separate terminal
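
If the app cannot reach the model, it can help to confirm that the Ollama server is actually up before starting Streamlit. The following is a small sketch, not part of this repository: it assumes Ollama is listening on its default address, http://localhost:11434, and the function name `ollama_is_running` is made up for illustration.

```python
import urllib.error
import urllib.request

# Assumption: Ollama's default address. Change it if your server runs elsewhere.
OLLAMA_URL = "http://localhost:11434"


def ollama_is_running(base_url=OLLAMA_URL, timeout=2.0):
    """Return True if an HTTP server answers with 200 at base_url, else False."""
    try:
        with urllib.request.urlopen(base_url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or an HTTP error status.
        return False


if __name__ == "__main__":
    if ollama_is_running():
        print("Ollama is reachable.")
    else:
        print("Ollama is not reachable. Is `ollama serve` running?")
```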

Note: If you want to quickly run the app with an empty knowledge base (i.e. forget the previously uploaded PDFs), run reset.bat on Windows or reset.sh on Unix-like systems. Alternatively, you can manually delete the contents of the data folder and delete the chroma_db_data folder entirely.
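
For reference, the manual reset described in the note can be sketched in Python as below. This is an illustration of those two steps only, not a copy of reset.bat or reset.sh; the function name `reset_knowledge_base` is hypothetical, and it assumes the data and chroma_db_data folders sit directly under the project root.

```python
import shutil
from pathlib import Path


def reset_knowledge_base(project_root="."):
    """Empty the data folder and remove the chroma_db_data folder."""
    root = Path(project_root)
    data_dir = root / "data"
    chroma_dir = root / "chroma_db_data"

    # Delete the contents of data/ but keep the folder itself.
    if data_dir.is_dir():
        for entry in data_dir.iterdir():
            if entry.is_dir() and not entry.is_symlink():
                shutil.rmtree(entry)
            else:
                entry.unlink()

    # Delete chroma_db_data/ entirely.
    if chroma_dir.is_dir():
        shutil.rmtree(chroma_dir)
```

Calling `reset_knowledge_base()` from the repository root would then clear the uploaded PDFs and the stored embeddings. Note that the deletion is permanent.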

Demo

Upload your PDF

pdf_upload.mp4

Ask away

pdf_query.mp4

Multiple uploads

multi_upload.mp4

Cross-reference your PDFs

crossref_pdf.mp4

Feedback

All manner of feedback is highly appreciated. I am relatively new to this and I would love to hear your comments and suggestions as part of the learning experience. Thank you!

casie-aviles/botpdf-llama2-chatbot
