🦜️🔗 ChatLangChain

This repo is an implementation of a locally hosted chatbot specifically focused on question answering over the LangChain documentation. Built with LangChain and FastAPI.

The app leverages LangChain's streaming support and async API to update the page in real time for multiple users.

Vantiq Notes:

This is a fork of the original ChatLangChain project. It provides a good example of using LangChain to solve a "semantic search" problem over specific content. The following changes have been made:

  1. To run any of the code, you will want to create a .env file at the root of the repo. This file should have the following contents: OPENAI_API_KEY=<Vantiq OpenAI key>. The location of this key has been sent to the Vantiq AI Teams channel.
  2. ingest.py has been changed to ingest the rules.md file from our documentation (see the sketch after this list). The code assumes that this repo was cloned as a peer to our docs repo. It creates a file called rules.pkl that is used by the app. This file is not checked in, but a pointer to a copy was sent to the Teams channel (generating the file does cost money, though it is unclear how much at this point).
  3. Some of the code has been changed to address LangChain API changes since the project was originally published.
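
As a rough illustration of the modified ingestion step, here is a minimal sketch; the docs path, splitter settings, and use of python-dotenv are assumptions for illustration, not the exact fork code:

```python
# Minimal sketch of the modified ingest step: read rules.md from a peer
# docs checkout, split it, embed it with OpenAI, and pickle the FAISS index.
# The path and chunking parameters below are assumptions.
import pickle
from pathlib import Path

from dotenv import load_dotenv
from langchain.embeddings import OpenAIEmbeddings
from langchain.text_splitter import CharacterTextSplitter
from langchain.vectorstores import FAISS

load_dotenv()  # picks up OPENAI_API_KEY from the .env file described above

raw_text = Path("../docs/rules.md").read_text()  # assumed peer-repo layout

splitter = CharacterTextSplitter(chunk_size=1000, chunk_overlap=200)
chunks = splitter.split_text(raw_text)

vectorstore = FAISS.from_texts(chunks, OpenAIEmbeddings())
with open("rules.pkl", "wb") as f:
    pickle.dump(vectorstore, f)  # the app loads this pickle at startup
```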

We recommend using a virtual environment for running this code (or any Python project, really). There are a few ways to set one up, so if you'd like some setup help, please reach out to the Vantiq AI team. The instructions below rely on make, but you can also just run main.py from your IDE if you prefer.

✅ Running locally

  1. Install dependencies: pip install -r requirements.txt
  2. Run ingest.sh to ingest the docs data into the vectorstore (this only needs to be done once; in this fork, that is the rules.md content described above).
    1. You can use other Document Loaders to load your own data into the vectorstore; see the sketch after this list.
  3. Run the app: make start
    1. To enable tracing, make sure langchain-server is running locally and pass tracing=True to get_chain in main.py. See the LangChain tracing documentation for more details.
  4. Open localhost:9000 in your browser.
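
For step 2.1, any LangChain Document Loader can feed the vectorstore. A minimal sketch, where TextLoader is a real LangChain class and my_notes.txt is a hypothetical file name:

```python
# Hypothetical example of step 2.1: load your own data with another
# Document Loader, then split and embed it exactly as ingest.py does.
from langchain.document_loaders import TextLoader

documents = TextLoader("my_notes.txt").load()  # "my_notes.txt" is illustrative
```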

🚀 Important Links

Deployed version (to be updated soon): chat.langchain.dev

Hugging Face Space (to be updated soon): huggingface.co/spaces/hwchase17/chat-langchain

📚 Technical description

There are two components: ingestion and question-answering.

Ingestion has the following steps:

  1. Pull HTML from the documentation site.
  2. Load the HTML with LangChain's ReadTheDocsLoader.
  3. Split the documents with LangChain's TextSplitter.
  4. Create a vectorstore of embeddings, using LangChain's vectorstore wrapper (with OpenAI's embeddings and the FAISS vectorstore); a sketch of this pipeline follows the list.
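
In code, this pipeline looks roughly like the following; the docs path and chunking parameters are illustrative assumptions:

```python
# Rough sketch of the ingestion pipeline described above. The path and
# chunk settings are assumptions; the classes are LangChain components.
from langchain.document_loaders import ReadTheDocsLoader
from langchain.embeddings import OpenAIEmbeddings
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.vectorstores import FAISS

# Steps 1-2: load HTML previously pulled from the documentation site
raw_documents = ReadTheDocsLoader("langchain.readthedocs.io/en/latest/").load()

# Step 3: split the documents into overlapping chunks
splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=200)
documents = splitter.split_documents(raw_documents)

# Step 4: embed the chunks and index them in a FAISS vectorstore
vectorstore = FAISS.from_documents(documents, OpenAIEmbeddings())
```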

Question-Answering has the following steps, all handled by ChatVectorDBChain (a sketch follows the list):

  1. Given the chat history and new user input, determine what a standalone question would be (using GPT-3).
  2. Given that standalone question, look up relevant documents from the vectorstore.
  3. Pass the standalone question and relevant documents to GPT-3 to generate a final answer.
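
A minimal sketch of wiring this up, assuming the pickled vectorstore from ingestion; the model choice and sample question are illustrative:

```python
# Minimal sketch of the question-answering chain described above. It
# assumes rules.pkl from ingestion; model and question are illustrative.
import pickle

from langchain.chains import ChatVectorDBChain
from langchain.llms import OpenAI

with open("rules.pkl", "rb") as f:
    vectorstore = pickle.load(f)

chain = ChatVectorDBChain.from_llm(OpenAI(temperature=0), vectorstore)

chat_history = []  # (question, answer) tuples from earlier turns
result = chain({"question": "How are rules triggered?", "chat_history": chat_history})
print(result["answer"])
```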
