forked from cxcscmu/RAGViz

Official repository for RAGVIZ: Diagnose and Visualize Retrieval-Augmented Generation


techthiyanes/RAGViz

 
 

RAGViz

RAGViz (Retrieval Augmented Generation Visualization) is a tool that visualizes both document-level and token-level attention on the retrieved context fed to the LLM to ground answer generation.

  • RAGViz provides add/remove document functionality to compare the generated tokens when certain documents are excluded from the context.
  • Combining attention visualization with add/remove lets you diagnose how effective and influential specific retrieved documents or sections of text are on the LLM's answer generation.
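To illustrate how token-level attention can be rolled up to the document level, here is a minimal sketch. It assumes each retrieved document occupies a known `[start, end)` token span in the context and that document-level attention is the sum of its tokens' attention; both the function name `document_attention` and the choice of summation are assumptions for illustration, not taken from the RAGViz code.

```python
from typing import Dict, List, Tuple


def document_attention(
    token_attention: List[float],
    doc_spans: Dict[str, Tuple[int, int]],
) -> Dict[str, float]:
    """Sum token-level attention inside each document's [start, end) span.

    Hypothetical helper: RAGViz may aggregate differently (e.g. averaging).
    """
    return {
        doc_id: sum(token_attention[start:end])
        for doc_id, (start, end) in doc_spans.items()
    }


# Toy example: 6 context tokens, two documents of 3 tokens each.
attention = [0.05, 0.10, 0.05, 0.30, 0.40, 0.10]
spans = {"doc_a": (0, 3), "doc_b": (3, 6)}
scores = document_attention(attention, spans)
for doc_id, score in scores.items():
    print(f"{doc_id}: {score:.2f}")
```

Removing a document and regenerating, as the add/remove feature does, is a complementary check: a document with high attention whose removal does not change the answer may be less influential than its score suggests.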

Demo Video

A basic demonstration of RAGViz is available here.

Configuration

The following are the system configurations of our RAGViz demonstration:

  • The Pile-CC English documents are used for retrieval
  • Documents are partitioned into 4 DiskANN indexes on separate nodes, each with ~20 million documents
  • Documents are embedded into feature vectors using AnchorDR
  • Llama 2 generation and attention outputs are produced with vLLM and the HuggingFace transformers library
  • Frontend UI is adapted from Lepton search engine
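With documents split across four indexes, a query has to be answered by combining per-index results. The README does not describe how the demo does this; the sketch below is one plausible approach, assuming each node returns `(score, doc_id)` pairs where a higher score means a closer match. The name `merge_shard_hits` and the sample ids are hypothetical.

```python
import heapq
from typing import Iterable, List, Tuple

Hit = Tuple[float, str]  # (similarity score, document id)


def merge_shard_hits(shard_hits: Iterable[List[Hit]], k: int) -> List[Hit]:
    """Return the k highest-scoring hits across all shards, best first."""
    all_hits = (hit for hits in shard_hits for hit in hits)
    return heapq.nlargest(k, all_hits, key=lambda hit: hit[0])


# Made-up results from four index nodes.
shard_results = [
    [(0.91, "n0-d7"), (0.40, "n0-d2")],
    [(0.88, "n1-d3")],
    [(0.95, "n2-d9"), (0.10, "n2-d1")],
    [(0.50, "n3-d4")],
]
top3 = merge_shard_hits(shard_results, k=3)
print(top3)
```

Scores are only comparable across shards if every index uses the same embedding model and similarity measure, which holds here since all documents are embedded with AnchorDR.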

Customization

Snippets:

You can change how snippets are built for the RAG context by adding a new file and class in backend/snippet, then registering it in backend/ragviz.py and frontend/src/app/components/search.tsx. We currently offer the following snippets:

  • Naive First:
    • Represent a document with its first 128 tokens
  • Sliding Window
    • Compute inner product similarity between windows of 128 tokens and the query; use the most similar window to the query to represent a document
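The two snippet strategies above can be sketched as follows. This is not the code in backend/snippet: the function names, the `embed` callback, the `stride` parameter, and the toy embedding are all assumptions for illustration (the real system embeds text with AnchorDR, and the README does not state how far the window moves at each step).

```python
from typing import Callable, List, Sequence

Vector = List[float]


def naive_first(tokens: Sequence[str], size: int = 128) -> List[str]:
    """Represent a document by its first `size` tokens."""
    return list(tokens[:size])


def sliding_window(
    tokens: Sequence[str],
    query_vec: Vector,
    embed: Callable[[Sequence[str]], Vector],
    size: int = 128,
    stride: int = 64,  # assumed step between windows
) -> List[str]:
    """Return the window whose embedding has the largest inner product with the query."""
    last_start = max(len(tokens) - size, 0)
    starts = list(range(0, last_start + 1, stride))
    if starts[-1] != last_start:
        starts.append(last_start)  # make sure the tail of the document is covered

    best_window: List[str] = []
    best_score = float("-inf")
    for start in starts:
        window = list(tokens[start:start + size])
        score = sum(q * w for q, w in zip(query_vec, embed(window)))
        if score > best_score:
            best_window, best_score = window, score
    return best_window


# Toy embedding: counts of three vocabulary terms (stand-in for a real encoder).
VOCAB = ["attention", "retrieval", "index"]


def toy_embed(tokens: Sequence[str]) -> Vector:
    return [float(list(tokens).count(term)) for term in VOCAB]


doc = ["intro"] * 4 + ["retrieval", "index", "retrieval", "filler"] + ["attention"] * 2
query_vec = toy_embed(["retrieval", "index"])

first = naive_first(doc, size=4)
best = sliding_window(doc, query_vec, toy_embed, size=4, stride=2)
print(first)
print(best)
```

In this toy document the opening tokens are irrelevant to the query, so Naive First returns filler while Sliding Window picks the passage that mentions the query terms.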

Datasets:

New datasets for retrieval can be added using a new file and class in backend/search, and modifying backend/ragviz.py accordingly.

We currently have implementations for the following datasets:

  • ClueWeb22-B English documents
  • Pile-CC dataset

LLMs:

Any model supported by the HuggingFace transformers library can be used as the LLM backbone.

To use vLLM for fast inference, the LLM backbone must also be supported by vLLM. A list of vLLM-supported models is available here.

You can set the path of the model used for RAG in backend/.env.example. We used meta-llama/Llama-2-7b-chat-hf for the demo.
