Welcome to the StackUp Query Assistant repository! This project aims to help users with their questions about the StackUp website by providing quick and relevant answers. The assistant uses information gathered from the StackUp Zendesk help center to ensure that users get the support they need. This help reduce the load on the Operations team and the time req. to resolve Simple Issues.
- Data Scraping: Extracts data from the StackUp Zendesk page.
- Data Cleaning: Processes and cleans the scraped data for embedding.
- Data Embedding: Stores the cleaned data in MongoDB Atlas for efficient retrieval.
- Query Handling: Uses RAG-based LLM to generate responses based on retrieved data.
- Programming Query Rejection: If users ask technical or programming-related questions, the LLM will deny answering to ensure the integrity of StackUp's platform policies.
├── get_and_clean_data_programs
│ ├── get-data.py
│ ├── clean-data.py
├── create-embedding-main.ipynb
├── data
│ ├── cleaned_data.txt
├── StackUp_Bot_Chatflow.json
├── requirements.txt
├── index.html
└── README.md
- get_and_clean_data_programs/get-data.py: Scrapes data from the StackUp Zendesk page and saves it as a JSON file.
- get_and_clean_data_programs/clean-data.py: Cleans the scraped JSON data and converts it into a text file.
- create-embedding-main.ipynb: Embeds the cleaned text data into MongoDB Atlas using the
togethercomputer/m2-bert-80M-8k-retrieval
model. - StackUp_Bot_Chatflow.json: Contains a pipeline for Flowise AI, which integrates the model into the website.
- Python 3.x
- MongoDB Atlas account
- Required Python packages (listed in
requirements.txt
)
-
Clone the repository:
git clone https://github.com/yourusername/stackup-query-assistant.git cd stackup-query-assistant
-
Install the required packages:
pip install -r requirements.txt
-
Create a .env file and store ATLAS_URI & TOGETHER_API_KEY.
Run the get-data.py
script to scrape data from the StackUp Zendesk page:
python get_and_clean_data_programs/get-data.py
Run the clean-data.py
script to clean the scraped data:
python get_and_clean_data_programs/clean-data.py
-
Open and run the
create-embedding-main.ipynb
notebook to embed the cleaned data into MongoDB Atlas. -
Navigate to Altas Search and Create a new Vector Search with name "idx_embedding" and the following body:
{
"fields": [
{
"numDimensions": 768,
"path": "embedding",
"similarity": "cosine",
"type": "vector"
}
]
}
- Review and Create Vector Search.
-
Install Flowise locally using NPM:
npm install -g flowise
-
Start Flowise:
npx flowise start
-
Open Flowise at http://localhost:3000.
-
Navigate to Workflows and click on the "Add New" button.
-
Click on the gear icon, load the workflow by uploading
StackUp_Bot_Chatflow.json
. -
Input all required fields such as OpenAI API Key, MongoDB connection URI, database name, collection name, and index name, then save it.
-
Click on the code icon ("</>") and insert the generated script code into the
<body>
tag of your website.
StackUpRAG-demo.mp4
The StackUp Query Assistant is designed to maintain the integrity of StackUp's learn-to-earn platform. If users ask any technical or programming-related questions, the LLM will deny answering them. Similarly, if users ask about content not present in the knowledge base, it will also deny answering. This ensures that users complete their coding challenges and quests independently.