This repository showcases an advanced RAG (Retrieval-Augmented Generation) solution designed to tackle complex questions that simple semantic similarity-based retrieval cannot solve. The solution uses a sophisticated deterministic graph, which acts as the "brain" 🧠 of a highly controllable autonomous agent capable of answering non-trivial questions from your own data. The only elements treated as black boxes are the outputs of LLM-based (Large Language Model) functions, and these are closely monitored. Given that the data is yours, it is crucial to ensure that the answers provided are solely based on this data, avoiding any hallucinations.
To achieve this, the algorithm was tested using a familiar use case: the first book of Harry Potter. This choice allows for monitoring whether the model relies on its pre-trained knowledge or strictly on the retrieved information from vector stores.
An indicator of the algorithm's reliability is its ability to fail on questions whose answers are not found in the context: rather than hallucinating, it simply does not answer. When it does answer a question derived from the context, this confirms that the solution is grounded in the actual data provided; when it fails, it is because the answer could not be deduced from the available data.
Consider, for example, the question: "How did the protagonist defeat the villain's assistant?" To solve this question, the following steps are necessary (a sketch of such a decomposed plan follows the list):
- Identify the protagonist of the plot.
- Identify the villain.
- Identify the villain's assistant.
- Search for confrontations or interactions between the protagonist and the villain.
- Deduce the reason that led the protagonist to defeat the assistant.
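For illustration, such a decomposed plan might be represented as a simple list of typed tasks. The structure and wording below are hypothetical, not the agent's actual planner output:

```python
# A hypothetical decomposed plan for the example question above; each task
# is either a retrieval step or a reasoning step over the gathered context.
plan = [
    {"task": "Identify the protagonist of the plot", "type": "retrieve"},
    {"task": "Identify the villain", "type": "retrieve"},
    {"task": "Identify the villain's assistant", "type": "retrieve"},
    {"task": "Find confrontations between the protagonist and the villain", "type": "retrieve"},
    {"task": "Deduce why the protagonist defeated the assistant", "type": "answer"},
]
```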
- PDF Loading and Processing: Load PDF documents and split them into chapters.
- Text Preprocessing: Clean and preprocess the text for better summarization and encoding.
- Summarization: Generate extensive summaries of each chapter using large language models.
- Book Quotes Database Creation: Create a database for specific questions that will need access to quotes from the book.
- Vector Store Encoding: Encode the book content and chapter summaries into vector stores for efficient retrieval (a minimal ingestion sketch follows this list).
- Define Various LLM-Based Functions: Define functions for the graph pipeline, including planning, retrieval, answering, replanning, content distillation, hallucination checking, etc.
- Graph-Based Workflow: Utilize a state graph to manage the workflow of tasks.
- Performance Evaluation: Evaluate how well the whole solution solved complicated questions.
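As a rough illustration of the ingestion half of this pipeline, here is a minimal sketch assuming LangChain with OpenAI embeddings and a FAISS vector store; the file name, chunking parameters, and store layout are placeholders, and the notebook's actual implementation may differ:

```python
# A minimal ingestion sketch, assuming LangChain + OpenAI embeddings + FAISS.
from langchain_community.document_loaders import PyPDFLoader
from langchain_community.vectorstores import FAISS
from langchain_openai import OpenAIEmbeddings
from langchain_text_splitters import RecursiveCharacterTextSplitter

# 1. Load the PDF into per-page documents (file name is illustrative).
pages = PyPDFLoader("harry_potter_book_1.pdf").load()

# 2. Split the text into overlapping chunks for retrieval.
splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=200)
chunks = splitter.split_documents(pages)

# 3. Encode the chunks into a vector store for similarity search.
embeddings = OpenAIEmbeddings()
chunks_db = FAISS.from_documents(chunks, embeddings)

# The LLM-generated chapter summaries and the book quotes would be encoded
# into their own vector stores in the same way, e.g.:
# summaries_db = FAISS.from_documents(chapter_summaries, embeddings)
# quotes_db = FAISS.from_documents(book_quotes, embeddings)
```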
- Start with the user's question.
- Anonymize the question by replacing named entities with variables and storing the mapping.
- Generate a high-level plan to answer the anonymized question using a language model.
- De-anonymize the plan by replacing the variables with the original named entities.
- Break down the plan into individual tasks that involve either retrieving relevant information or answering a question based on a given context.
- For each task in the plan:
  - a. Decide whether to retrieve information from the chunks DB, the chapter summaries DB, or the book quotes DB, or to answer a question based on the task and the current context.
  - b. If retrieving information:
    - i. Retrieve relevant information from one of the vector stores based on the task.
    - ii. Distill the retrieved information to keep only the relevant content.
    - iii. Verify that the distilled content is grounded in the original context. If not, distill the content again.
  - c. If answering a question:
    - i. Answer the question based on the current context using a language model with chain-of-thought prompting.
    - ii. Verify that the generated answer is grounded in the context. If not, answer the question again.
  - d. After retrieving or answering, re-plan the remaining steps based on the updated context.
  - e. Check whether the original question can be answered with the current context. If so, proceed to the final answer step; otherwise, continue with the next task in the plan.
- Generate the final answer to the original question based on the accumulated context, using chain of thought.
- Verify that the final answer is grounded in the context. If not, generate the final answer again.
- Output the final answer to the user. (A minimal sketch of this loop as a state graph appears below.)
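The loop above maps naturally onto a state graph. Below is a minimal sketch using LangGraph, with stub node functions standing in for the LLM-based planning, retrieval, and verification functions; the node set and state fields are illustrative, not the repository's exact graph:

```python
# A minimal sketch of the control flow as a LangGraph state graph; node
# bodies are stubs standing in for the LLM-based functions described above.
from typing import List, TypedDict

from langgraph.graph import END, StateGraph


class AgentState(TypedDict):
    question: str        # the (anonymized, then de-anonymized) user question
    plan: List[str]      # remaining tasks, re-planned after every step
    context: List[str]   # distilled, verified evidence accumulated so far
    answer: str


def plan_step(state: AgentState) -> dict:
    # Generate a high-level plan for the anonymized question (stub).
    return {"plan": ["identify protagonist", "identify villain"]}


def execute_task(state: AgentState) -> dict:
    # Retrieve-and-distill or answer the next task, verify grounding,
    # then re-plan the remaining steps (stub).
    task, *rest = state["plan"]
    return {"plan": rest, "context": state["context"] + [f"result of: {task}"]}


def final_answer(state: AgentState) -> dict:
    # Answer the original question from the accumulated context (stub).
    return {"answer": "grounded answer built from state['context']"}


def can_answer(state: AgentState) -> str:
    # Route to the final answer once the context suffices (here: plan empty).
    return "final_answer" if not state["plan"] else "execute_task"


graph = StateGraph(AgentState)
graph.add_node("plan", plan_step)
graph.add_node("execute_task", execute_task)
graph.add_node("final_answer", final_answer)
graph.set_entry_point("plan")
graph.add_edge("plan", "execute_task")
graph.add_conditional_edges("execute_task", can_answer,
                            {"execute_task": "execute_task",
                             "final_answer": "final_answer"})
graph.add_edge("final_answer", END)
app = graph.compile()

print(app.invoke({"question": "...", "plan": [], "context": [], "answer": ""}))
```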
- Encoding the book content in chunks, LLM-generated chapter summaries, and quotes from the book into separate vector stores.
- Anonymizing the question so the plan is generated without bias from the pre-trained knowledge of any LLM involved.
- Breaking down each task from the plan to be executed by custom functions with full control.
- Distilling retrieved content so LLM generations are more accurate and hallucinations are minimized.
- Answering questions based on context using chain-of-thought prompting, which includes both positive and negative examples, to arrive at a well-reasoned answer rather than just a straightforward response.
- Verifying that generated content is grounded in the context to keep answers hallucination-free, as suggested in "Self-RAG: Learning to Retrieve, Generate, and Critique through Self-Reflection" - https://arxiv.org/abs/2310.11511.
- Utilizing a continuously updated plan made by an LLM to solve complicated questions. Some ideas are derived from "Plan-and-Solve Prompting" - https://arxiv.org/abs/2305.04091 and the "babyagi" project - https://github.com/yoheinakajima/babyagi.
- Evaluating the model's performance using Ragas metrics such as answer correctness, faithfulness, answer relevancy, context recall, and answer similarity to ensure high-quality answers (a minimal evaluation sketch follows this list).
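For example, the evaluation step might look like the following minimal sketch, assuming the `ragas` and `datasets` packages; the sample record and ground truth are illustrative only:

```python
# A minimal sketch of the evaluation step, assuming ragas + datasets.
from datasets import Dataset
from ragas import evaluate
from ragas.metrics import (answer_correctness, answer_relevancy,
                           answer_similarity, context_recall, faithfulness)

# One illustrative evaluation record; real runs would use a full test set.
eval_data = Dataset.from_dict({
    "question": ["Who is the villain's assistant?"],
    "answer": ["Professor Quirrell."],
    "contexts": [["...retrieved passages used to produce the answer..."]],
    "ground_truth": ["Professor Quirrell"],
})

results = evaluate(eval_data, metrics=[answer_correctness, faithfulness,
                                       answer_relevancy, context_recall,
                                       answer_similarity])
print(results)
```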
- Clone the repository:
  ```bash
  git clone https://github.com/NirDiamant/Deterministic-RAG-Agent.git
  ```
- Install the required Python packages:
  ```bash
  pip install -r requirements.txt
  ```
- Set up your environment variables: create a `.env` file in the root directory and add your API keys (the models I used were chosen to minimize cost, but this is just a suggestion; a sketch of loading these keys in Python follows this list):
  ```
  OPENAI_API_KEY=your_openai_api_key
  GROQ_API_KEY=your_groq_api_key
  ```
- Step-by-step tutorial notebook: the step-by-step tutorial is in `sophisticated_rag_agent_harry_potter.ipynb`.
- Run a real-time visualization of the agent: open a terminal and type:
  ```bash
  streamlit run simulate_agent.py
  ```
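For reference, here is a minimal sketch of loading those keys at runtime, assuming the `python-dotenv` package; the project code may load them differently:

```python
# Load API keys from the .env file into the process environment.
# Assumes python-dotenv is installed: pip install python-dotenv
import os

from dotenv import load_dotenv

load_dotenv()  # reads the .env file from the current working directory
openai_key = os.getenv("OPENAI_API_KEY")
groq_key = os.getenv("GROQ_API_KEY")
```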
Contributions are welcome! Please feel free to submit a pull request or open an issue for any suggestions or improvements.
Special thanks to Elad Levi for the great advising and ideas.
This project is licensed under the Apache-2.0 License.