Retrieval-Augmented Generation (RAG) is now the mainstream Generative AI technique for using your private content with a public LLM to answer questions about your documents. The idea is to send 'context' - a relevant snippet of your overall document collection - to the LLM along with your question, helping it answer your query. This works in the following fashion (a minimal code sketch of the whole flow follows the list):
- Prep
- Identify a webpage with a list of links to PDF documents. (I used one such page for this demo.)
- Break the documents apart into chunks. You need to do this so that you do not exceed the LLM's input capacity.
- Create vector embeddings - a numeric representation of the text - for each chunk. You create these vectors using dedicated models known as 'embedding models', which convert text into vectors of numbers.
- Load these vector embeddings into a vector database. In our case we also loaded each vector's corresponding text chunk.
- Search the vector database for the information you want. (Vector databases are used because they are very good at similarity search.)
- The vector database will return either the text chunks that are relevant to your question or pointers to the documents you will want to use.
- These are the text chunks/documents you will send to the LLM with your question to help it compose an answer for you.
- Send a prompt containing the question and the context (search results) to the LLM.
- The LLM will return an answer that is relevant to your question.
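To make this flow concrete, here is a minimal end-to-end sketch using FAISS and the OpenAI Python SDK. The chunking strategy, the model names (`text-embedding-3-small`, `gpt-4o-mini`), and the number of retrieved chunks are illustrative assumptions rather than the exact settings used in the notebooks.

```python
import numpy as np
import faiss
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# 1. Break the documents into chunks (naive fixed-size chunking for illustration).
documents = ["...full text of document one...", "...full text of document two..."]
chunks = [doc[i:i + 1000] for doc in documents for i in range(0, len(doc), 1000)]

# 2. Create a vector embedding for each chunk.
emb = client.embeddings.create(model="text-embedding-3-small", input=chunks)
vectors = np.array([d.embedding for d in emb.data], dtype="float32")

# 3. Load the embeddings into an in-memory FAISS index.
index = faiss.IndexFlatL2(vectors.shape[1])
index.add(vectors)

# 4. Embed the question and search the index for the closest chunks.
question = "What do these documents say about X?"
q = client.embeddings.create(model="text-embedding-3-small", input=[question])
q_vec = np.array([q.data[0].embedding], dtype="float32")
_, ids = index.search(q_vec, min(3, len(chunks)))
context = "\n\n".join(chunks[i] for i in ids[0])

# 5. Send the question plus the retrieved context to the LLM.
prompt = f"Answer the question using only this context:\n\n{context}\n\nQuestion: {question}"
answer = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": prompt}],
)
print(answer.choices[0].message.content)
```

Weaviate plays the same role later in the tutorial and also stores each chunk's text alongside its vector, as noted above; the shape of the pipeline is the same.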
This tutorial is based on a conglomeration of content from the following sources:
- ChatGPT helped me write chapters 1 and 2.
- Weaviate's documentation.
- Pixegami's YouTube tutorial.
- Ollama's blog post about local embedding.
This tutorial was written with Python 3.12.7 and tested on Fedora Linux Release 41 and macOS Sonoma 14.7. You will need:
- A collection of documents - PDFs are good
- Install
- FAISS - an in-memory vector database from Meta
- Docker
- Weaviate (depends on Docker)
- PyTorch
- Ollama (for running a local embedding model)
- Jupyter Lab (for running my Python examples)
- An OpenAI API (not ChatGPT) account with a funding source. I spent $2 in total on this demo.
- Set up an environment variable for your OpenAI API key under the name `OPENAI_API_KEY`. (A quick check that Python can see the key is sketched after this list.)
- If you need help setting up an environment variable, look here.
- I rely on `wget` to download documents.
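Before starting the notebooks, it is worth confirming that the key is visible from Python and that you can pull a document down. This is a minimal sketch: the URL and the `docs/` directory are placeholder assumptions, and it uses `urllib` instead of `wget` only to keep everything in one notebook cell.

```python
import os
import urllib.request

# Confirm the OpenAI key is visible to the Python process running Jupyter.
assert os.environ.get("OPENAI_API_KEY"), "OPENAI_API_KEY is not set in this environment"

# Placeholder URL - substitute a PDF link from the page you identified earlier.
url = "https://example.com/sample.pdf"
os.makedirs("docs", exist_ok=True)
urllib.request.urlretrieve(url, "docs/sample.pdf")
print("Downloaded", os.path.getsize("docs/sample.pdf"), "bytes")
```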
I created the demo on Fedora Linux, and installing all of these was very simple. What is the world coming to?
- Use pip to install Jupyter or Jupyter-ai.
- Install the Python dependencies using `pip install -r requirements.txt`.
- Install Docker. On Fedora this is as complex as `sudo dnf install docker`; installation differs for other OSes.
- Our setup requires a minimally customized Weaviate Docker container. For that reason you need to be sure Docker Compose is installed.
- Installing Compose on Fedora with `dnf` did not work, but installing it as a plugin did.
- With Docker ready, you will need to use the `docker-compose.yml` file I have in the project, which enables seamless use of OpenAI with Weaviate.
- This can save you time and get you up and running much faster.
- You can use another LLM provider, but your mileage may vary.
- From the command line, run `jupyter lab`.
- For the sections that discuss local execution (chapters 4 and 5):
- Install Ollama.
- Download and install the Nomic embedding model using the Ollama command-line tool (a quick check that local embedding works is sketched after this list).
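Once Ollama is installed and the model has been pulled (typically `ollama pull nomic-embed-text`), you can check that local embedding works with a request to Ollama's local REST API. This sketch assumes Ollama is running on its default port, 11434, and that the model is available under the name `nomic-embed-text`.

```python
import requests

# Ask the locally running Ollama server for an embedding of a test sentence.
resp = requests.post(
    "http://localhost:11434/api/embeddings",
    json={"model": "nomic-embed-text", "prompt": "A quick local embedding test."},
)
resp.raise_for_status()
embedding = resp.json()["embedding"]
print(f"Got a {len(embedding)}-dimensional embedding")
```

If this prints a dimension, the local embedding path used in chapters 4 and 5 is ready.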
Follow the tutorial from rag1 onward! I hope you enjoy it!