This repo implements a chatbot focused on question answering over the LangChain documentation.
For a high-level overview of why we built this, see our blog post.
There are two components: ingestion and question-answering.
Ingestion has the following steps (a rough sketch follows the list):
- Pull html from documentation site
- Parse html with BeautifulSoup
- Split documents with LangChain's TextSplitter
- Create a vectorstore of embeddings, using LangChain's vectorstore wrapper (with OpenAI's embeddings and Weaviate's vectorstore)
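
A minimal sketch of the ingestion flow, assuming the classic LangChain API; the documentation URL, chunk sizes, and Weaviate endpoint here are illustrative placeholders, not the repo's actual configuration.

```python
import requests
import weaviate
from bs4 import BeautifulSoup
from langchain.docstore.document import Document
from langchain.embeddings import OpenAIEmbeddings
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.vectorstores import Weaviate

# 1. Pull HTML from the documentation site (a single page shown for brevity).
html = requests.get("https://python.langchain.com/docs/get_started/introduction").text

# 2. Parse the HTML with BeautifulSoup to extract the readable text.
text = BeautifulSoup(html, "html.parser").get_text()

# 3. Split the text into overlapping chunks with a LangChain text splitter.
splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=200)
docs = [Document(page_content=chunk) for chunk in splitter.split_text(text)]

# 4. Embed the chunks with OpenAI and store them in a Weaviate vectorstore.
client = weaviate.Client("http://localhost:8080")  # placeholder endpoint
vectorstore = Weaviate.from_documents(
    docs, OpenAIEmbeddings(), client=client, by_text=False
)
```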
Question-answering has the following steps (a rough sketch follows the list):
- Given the chat history and new user input, determine what a standalone question would be (using GPT-3)
- Given that standalone question, look up relevant documents from the vectorstore
- Pass the standalone question and relevant documents to GPT-3 to generate a final answer
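
One way to wire these steps together is with LangChain's `ConversationalRetrievalChain`, which condenses the chat history and new input into a standalone question, retrieves relevant documents, and then generates an answer. The repo defines its own prompts, so treat this as a rough equivalent rather than the exact chain used.

```python
from langchain.chains import ConversationalRetrievalChain
from langchain.llms import OpenAI

# GPT-3 completion model via LangChain's OpenAI wrapper.
llm = OpenAI(temperature=0)

# `vectorstore` is the Weaviate store built during ingestion (see sketch above).
qa_chain = ConversationalRetrievalChain.from_llm(
    llm,
    retriever=vectorstore.as_retriever(),
)

chat_history = []  # list of (question, answer) tuples from earlier turns
result = qa_chain({"question": "What is a text splitter?", "chat_history": chat_history})
print(result["answer"])
```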
Coming soon.