A Retrieval-Augmented Generation (RAG) service for document search and generation, providing a simple and efficient way to build a question-answering system powered by your own documentation. Key features:
- Vector-based document search using OpenAI embeddings and FAISS
- LLM-powered document question-answering with context retrieval
- Simple CLI interface for document indexing and testing
- Interactive web UI built with Gradio
- No project-specific code; works with any documentation structure
Prerequisites:

- Python 3.9+
- OpenAI API key
To install:

```bash
# Clone the repository
git clone https://github.com/lablup/RAGModelService.git
cd RAGModelService

# Install the package in development mode
pip install -e .
pip install -r requirements.txt
```
Runtime dependencies such as `python-dotenv` and `gradio` are installed from `requirements.txt`.
Create a `.env` file in the root directory based on `.env_example`:
```
OPENAI_API_KEY=your_openai_api_key_here
OPENAI_MODEL=gpt-4o  # or another OpenAI model
TEMPERATURE=0.2
MAX_RESULTS=5
```
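The service presumably reads these settings at startup. A minimal sketch of how that might look with `python-dotenv` (the variable names come from the example above; the loading code itself is an assumption, not the project's actual implementation):

```python
# Minimal sketch: load configuration from .env with python-dotenv.
import os

from dotenv import load_dotenv

load_dotenv()  # reads .env from the current working directory

OPENAI_API_KEY = os.environ["OPENAI_API_KEY"]       # required
OPENAI_MODEL = os.getenv("OPENAI_MODEL", "gpt-4o")  # falls back to a default
TEMPERATURE = float(os.getenv("TEMPERATURE", "0.2"))
MAX_RESULTS = int(os.getenv("MAX_RESULTS", "5"))
```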
Index your documents to create a vector store:
```bash
# Using the CLI tool with the installed package
rag-vectorize create --docs-path /path/to/your/documents

# Or run the module directly
python -m vectordb_manager.cli_vectorizer create --docs-path /path/to/your/documents
```
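Conceptually, indexing embeds document chunks with OpenAI and stores the vectors in a FAISS index. A hand-rolled sketch of that idea (the chunking, file name, and embedding model are illustrative assumptions, not the project's actual code):

```python
# Sketch: embed text chunks with OpenAI and persist them in a FAISS index.
import faiss  # pip install faiss-cpu
import numpy as np
from openai import OpenAI

client = OpenAI()  # uses OPENAI_API_KEY from the environment

chunks = ["First documentation chunk...", "Second documentation chunk..."]
resp = client.embeddings.create(model="text-embedding-3-small", input=chunks)
vectors = np.array([item.embedding for item in resp.data], dtype="float32")

index = faiss.IndexFlatL2(vectors.shape[1])  # exact (brute-force) L2 search
index.add(vectors)
faiss.write_index(index, "vector_store.faiss")  # output path is illustrative
```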
Test the RAG system with a simple command-line interface:
```bash
# Using the installed CLI tool
rag-chat

# Or run the module directly
python -m app.rag_chatbot
```
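The core RAG loop retrieves the chunks most similar to the question and asks the LLM to answer from them. A stripped-down sketch of the generation step (the prompt wording is an assumption; the model and temperature mirror the `.env` defaults above):

```python
# Sketch: answer a question grounded in retrieved documentation chunks.
from openai import OpenAI

client = OpenAI()

def answer(question: str, context_chunks: list[str]) -> str:
    # Pack the retrieved chunks into the prompt so the model stays grounded.
    context = "\n\n".join(context_chunks)
    resp = client.chat.completions.create(
        model="gpt-4o",
        temperature=0.2,
        messages=[
            {"role": "system",
             "content": "Answer using only the provided documentation context."},
            {"role": "user",
             "content": f"Context:\n{context}\n\nQuestion: {question}"},
        ],
    )
    return resp.choices[0].message.content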
Launch the Gradio web interface:
```bash
# Using the installed CLI tool
rag-web

# Or run the module directly
python -m app.gradio_app
```
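For reference, a minimal Gradio chat UI looks like the sketch below; the project's actual `app/gradio_app.py` will differ, and the placeholder response stands in for the retrieval and generation steps sketched earlier:

```python
# Sketch: wire a RAG pipeline into a Gradio chat interface.
import gradio as gr

def respond(message: str, history) -> str:
    # Placeholder pipeline: the real app would query FAISS for matching
    # chunks and call the LLM (see the sketches above).
    chunks = ["<retrieved documentation chunk>"]
    return f"Answer based on {len(chunks)} retrieved chunk(s) for: {message}"

gr.ChatInterface(respond, title="Documentation Q&A").launch()
```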
Test the vector database functionality:
```bash
# Run the VectorDBManager test interface
python -m vectordb_manager.vectordb_manager
```
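Retrieval itself is a nearest-neighbor query against the stored index. A sketch of what such a test exercises (the file name and result count mirror the earlier examples and are assumptions):

```python
# Sketch: embed a query and fetch its nearest document chunks from FAISS.
import faiss
import numpy as np
from openai import OpenAI

client = OpenAI()
index = faiss.read_index("vector_store.faiss")

query = "How do I set the OpenAI API key?"
resp = client.embeddings.create(model="text-embedding-3-small", input=[query])
q = np.array([resp.data[0].embedding], dtype="float32")

distances, ids = index.search(q, 5)  # 5 = MAX_RESULTS
print(ids[0])  # row positions of the matching chunks in the indexed corpus
```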
Project layout:

- `vectordb_manager/`: handles document collection, vectorization, and storage
- `app/rag_chatbot.py`: implements the core RAG functionality
- `app/gradio_app.py`: provides the web interface using Gradio
- `app/document_filter.py`: simple document filtering utility
Licensed under the MIT License.