🥤 RAGLite is a Python toolkit for Retrieval-Augmented Generation (RAG) with PostgreSQL or SQLite

zitro677/raglite

Open in Dev Containers Open in GitHub Codespaces

🥤 RAGLite

RAGLite is a Python package for Retrieval-Augmented Generation (RAG) with PostgreSQL or SQLite.

Features

  1. ❤️ Only lightweight and permissive open source dependencies (e.g., no PyTorch, LangChain, or PyMuPDF)
  2. 🧠 Choose any LLM provider with LiteLLM, including local llama-cpp-python models
  3. 💾 Either PostgreSQL or SQLite as a keyword & vector search database
  4. 🚀 Acceleration with Metal on macOS, and CUDA on Linux and Windows
  5. 📖 PDF to Markdown conversion on top of pdftext and pypdfium2
  6. 🧬 Multi-vector chunk embedding with late chunking and contextual chunk headings
  7. ✂️ Optimal level 4 semantic chunking by solving a binary integer programming problem
  8. 🌀 Optimal closed-form linear query adapter by solving an orthogonal Procrustes problem
  9. 🔍 Hybrid search that combines the database's built-in keyword search (tsvector in PostgreSQL, FTS5 in SQLite) with their native vector search extensions (pgvector in PostgreSQL, sqlite-vec in SQLite)
  10. ✍️ Optional: conversion of any input document to Markdown with Pandoc
  11. ✅ Optional: evaluation of retrieval and generation performance with Ragas

Installing

To install this package (including Metal acceleration if on macOS), run:

pip install raglite

To add CUDA 12.4 support, use the cuda124 extra:

pip install raglite[cuda124]

To add support for filetypes other than PDF, use the pandoc extra:

pip install raglite[pandoc]

To add support for evaluation, use the ragas extra:

pip install raglite[ragas]

Using

Overview

  1. Configuring RAGLite
  2. Inserting documents
  3. Searching and Retrieval-Augmented Generation (RAG)
  4. Computing and using an optimal query adapter
  5. Evaluation of retrieval and generation

1. Configuring RAGLite

Tip

🧠 RAGLite extends LiteLLM with support for llama.cpp models using llama-cpp-python. To select a llama.cpp model (e.g., from bartowski's collection), use a model identifier of the form "llama-cpp-python/<hugging_face_repo_id>/<filename>@<n_ctx>", where n_ctx is an optional parameter that specifies the context size of the model.

Tip

💾 You can create a PostgreSQL database for free in a few clicks at neon.tech (not sponsored).

First, configure RAGLite with your preferred PostgreSQL or SQLite database and any LLM supported by LiteLLM:

from raglite import RAGLiteConfig

# Example 'remote' config with a PostgreSQL database and an OpenAI LLM:
my_config = RAGLiteConfig(
    db_url="postgresql://my_username:my_password@my_host:5432/my_database",
    llm="gpt-4o-mini",  # Or any LLM supported by LiteLLM.
    embedder="text-embedding-3-large",  # Or any embedder supported by LiteLLM.
)

# Example 'local' config with a SQLite database and a llama.cpp LLM:
my_config = RAGLiteConfig(
    db_url="sqlite:///raglite.sqlite",
    llm="llama-cpp-python/bartowski/Meta-Llama-3.1-8B-Instruct-GGUF/*Q4_K_M.gguf@8192",
    embedder="llama-cpp-python/lm-kit/bge-m3-gguf/*F16.gguf",
)
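To make the llama.cpp identifier format from the tip above concrete, here is a minimal sketch that splits such a string into its parts. The names parse_llama_cpp_model_id and LlamaCppModelId are hypothetical and are not part of RAGLite; the sketch also assumes the Hugging Face repo id always has the form <owner>/<name>. RAGLite's own parsing may differ.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class LlamaCppModelId:
    """Parts of a 'llama-cpp-python/<repo_id>/<filename>@<n_ctx>' identifier."""

    repo_id: str
    filename: str
    n_ctx: Optional[int]  # None when the optional '@<n_ctx>' suffix is absent.


def parse_llama_cpp_model_id(model: str) -> LlamaCppModelId:
    """Split a model identifier into its parts (hypothetical helper, not RAGLite API)."""
    prefix = "llama-cpp-python/"
    if not model.startswith(prefix):
        raise ValueError(f"Expected an identifier starting with {prefix!r}, got {model!r}")
    # Separate the optional context size from the rest of the identifier.
    path, sep, n_ctx_text = model[len(prefix):].partition("@")
    n_ctx = int(n_ctx_text) if sep else None
    # Assumption: the repo id is '<owner>/<name>'; the remainder is the filename pattern.
    parts = path.split("/")
    if len(parts) < 3:
        raise ValueError(f"Expected '<owner>/<name>/<filename>', got {path!r}")
    return LlamaCppModelId(
        repo_id="/".join(parts[:2]),
        filename="/".join(parts[2:]),
        n_ctx=n_ctx,
    )
```

Applied to the 'local' config above, the llm identifier would split into the repo bartowski/Meta-Llama-3.1-8B-Instruct-GGUF, the filename pattern *Q4_K_M.gguf, and a context size of 8192, while the embedder identifier has no context size.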

2. Inserting documents

Tip

✍️ To insert documents other than PDF, install the pandoc extra with pip install raglite[pandoc].

Next, insert some documents into the database. RAGLite will take care of the conversion to Markdown, optimal level 4 semantic chunking, and multi-vector embedding with late chunking:

# Insert documents:
from pathlib import Path
from raglite import insert_document

insert_document(Path("On the Measure of Intelligence.pdf"), config=my_config)
insert_document(Path("Special Relativity.pdf"), config=my_config)
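For readers unfamiliar with late chunking, the general idea is to embed the whole document first, so that every token embedding sees the full context, and only then pool the token embeddings that fall inside each chunk. The sketch below illustrates that pooling step in plain Python. It is not RAGLite's implementation: the function name late_chunk_pool, the use of mean pooling, and the half-open token spans are all assumptions made for illustration.

```python
from typing import List, Sequence, Tuple

Vector = List[float]


def late_chunk_pool(
    token_embeddings: Sequence[Sequence[float]],
    chunk_spans: Sequence[Tuple[int, int]],
) -> List[Vector]:
    """Mean-pool contextualised token embeddings over each [start, end) chunk span."""
    pooled: List[Vector] = []
    for start, end in chunk_spans:
        if not 0 <= start < end <= len(token_embeddings):
            raise ValueError(f"Invalid chunk span: ({start}, {end})")
        span = token_embeddings[start:end]
        # Average each embedding dimension across the tokens of this chunk.
        pooled.append([sum(column) / len(span) for column in zip(*span)])
    return pooled
```

In this picture, token_embeddings would come from a single pass of the embedder over the full document, and chunk_spans from the semantic chunker.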

3. Searching and Retrieval-Augmented Generation (RAG)

Now, you can search for chunks with keyword search, vector search, or a hybrid of the two. You can also answer questions with RAG and the search method of your choice (hybrid is the default):

# Search for chunks:
from raglite import hybrid_search, keyword_search, vector_search

prompt = "How is intelligence measured?"
results_vector = vector_search(prompt, num_results=5, config=my_config)
results_keyword = keyword_search(prompt, num_results=5, config=my_config)
results_hybrid = hybrid_search(prompt, num_results=5, config=my_config)

# Answer questions with RAG:
from raglite import rag

prompt = "What does it mean for two events to be simultaneous?"
stream = rag(prompt, search=hybrid_search, config=my_config)
for update in stream:
    print(update, end="")
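To give an intuition for how a hybrid search can merge a keyword ranking with a vector ranking, the sketch below uses reciprocal rank fusion (RRF), a common fusion technique. This README does not state which fusion method hybrid_search uses, so treat this as an illustration of the idea only; the function name reciprocal_rank_fusion and the constant k=60 are assumptions.

```python
from collections import defaultdict
from typing import Dict, Hashable, List, Sequence


def reciprocal_rank_fusion(
    rankings: Sequence[Sequence[Hashable]], k: int = 60
) -> List[Hashable]:
    """Fuse several ranked lists of ids into a single list, best first."""
    scores: Dict[Hashable, float] = defaultdict(float)
    for ranking in rankings:
        for rank, item_id in enumerate(ranking, start=1):
            # Items near the top of any ranking contribute the most.
            scores[item_id] += 1.0 / (k + rank)
    return sorted(scores, key=lambda item_id: scores[item_id], reverse=True)


# Hypothetical chunk ids returned by a keyword search and a vector search.
keyword_ranking = ["a", "b", "c"]
vector_ranking = ["b", "d", "a"]
fused = reciprocal_rank_fusion([keyword_ranking, vector_ranking])
print(fused)  # ['b', 'a', 'd', 'c']
```

Chunks that appear high in both rankings ("b" and "a" here) rise to the top, while chunks found by only one method are still kept.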

4. Computing and using an optimal query adapter

RAGLite can compute and apply an optimal closed-form query adapter to the prompt embedding to improve the output quality of RAG. To benefit from this, first generate a set of evals with insert_evals and then compute and store the optimal query adapter with update_query_adapter:

# Improve RAG with an optimal query adapter:
from raglite import insert_evals, update_query_adapter

insert_evals(num_evals=100, config=my_config)
update_query_adapter(config=my_config)
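The "closed-form" part refers to the orthogonal Procrustes problem mentioned in the feature list: given paired source and target vectors, the orthogonal matrix that best maps one onto the other can be read off a single SVD. The NumPy sketch below shows that textbook solution. It is not RAGLite's code, and how update_query_adapter builds its source and target matrices from the evals is not described here; the function name orthogonal_procrustes and the row-wise layout are assumptions.

```python
import numpy as np


def orthogonal_procrustes(source: np.ndarray, target: np.ndarray) -> np.ndarray:
    """Return the orthogonal matrix A minimising ||source @ A - target||_F.

    Both inputs have shape (n_samples, dim); row i of `source` is paired
    with row i of `target`.
    """
    # Closed form: with U, S, Vt = svd(source.T @ target), the minimiser is U @ Vt.
    u, _, vt = np.linalg.svd(source.T @ target)
    return u @ vt


# Illustration with synthetic data: recover a known 2D rotation.
rng = np.random.default_rng(0)
angle = 0.3
rotation = np.array(
    [[np.cos(angle), -np.sin(angle)], [np.sin(angle), np.cos(angle)]]
)
queries = rng.normal(size=(10, 2))  # Stand-in for query embeddings.
targets = queries @ rotation  # Stand-in for the embeddings they should match.
adapter = orthogonal_procrustes(queries, targets)
```

In this sketch, a new query embedding q would be adapted as q @ adapter before running the vector search.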

5. Evaluation of retrieval and generation

If you installed the ragas extra, you can use RAGLite to answer the evals and then evaluate the quality of both the retrieval and generation steps of RAG using Ragas:

# Evaluate retrieval and generation:
from raglite import answer_evals, evaluate, insert_evals

insert_evals(num_evals=100, config=my_config)
answered_evals_df = answer_evals(num_evals=10, config=my_config)
evaluation_df = evaluate(answered_evals_df, config=my_config)

Contributing

Prerequisites

1. Set up Git to use SSH
  1. Generate an SSH key and add the SSH key to your GitHub account.
  2. Configure SSH to automatically load your SSH keys:
    cat << EOF >> ~/.ssh/config
    
    Host *
      AddKeysToAgent yes
      IgnoreUnknown UseKeychain
      UseKeychain yes
      ForwardAgent yes
    EOF
2. Install Docker
  1. Install Docker Desktop.
3. Install VS Code or PyCharm
  1. Install VS Code and VS Code's Dev Containers extension. Alternatively, install PyCharm.
  2. Optional: install a Nerd Font such as FiraCode Nerd Font and configure VS Code or configure PyCharm to use it.

Development environments

The following development environments are supported:

  1. ⭐️ GitHub Codespaces: click on Code and select Create codespace to start a Dev Container with GitHub Codespaces.
  2. ⭐️ Dev Container (with container volume): click on Open in Dev Containers to clone this repository in a container volume and create a Dev Container with VS Code.
  3. Dev Container: clone this repository, open it with VS Code, and run Ctrl/⌘ + ⇧ + P → Dev Containers: Reopen in Container.
  4. PyCharm: clone this repository, open it with PyCharm, and configure Docker Compose as a remote interpreter with the dev service.
  5. Terminal: clone this repository, open it with your terminal, and run docker compose up --detach dev to start a Dev Container in the background, and then run docker compose exec dev zsh to open a shell prompt in the Dev Container.

Developing

  • This project follows the Conventional Commits standard to automate Semantic Versioning and Keep A Changelog with Commitizen.
  • Run poe from within the development environment to print a list of Poe the Poet tasks available to run on this project.
  • Run poetry add {package} from within the development environment to install a runtime dependency and add it to pyproject.toml and poetry.lock. Add --group test or --group dev to install a CI or development dependency, respectively.
  • Run poetry update from within the development environment to upgrade all dependencies to the latest versions allowed by pyproject.toml.
  • Run cz bump to bump the package's version, update the CHANGELOG.md, and create a git tag.
