🦜️🔗 Chat LangChain

This repo is an implementation of a locally hosted chatbot specifically focused on question answering over the LangChain documentation. Built with LangChain and FastAPI.

The app leverages LangChain's streaming support and async API to update the page in real time for multiple users.

✅ Running locally

  1. Install dependencies: pip install -r requirements.txt
  2. Run python ingest.py to ingest LangChain docs data into the Weaviate vectorstore (only needs to be done once).
    1. You can use other Document Loaders to load your own data into the vectorstore (see the sketch after this list).
  3. Run the app: make start for the backend and npm run dev for the frontend (cd into chat-langchain first)
    1. Make sure to set the following environment variables to configure the application:
    export OPENAI_API_KEY=
    export WEAVIATE_URL=
    export WEAVIATE_API_KEY=
    
    # for tracing
    export LANGCHAIN_TRACING_V2=true
    export LANGCHAIN_ENDPOINT="https://api.smith.langchain.com"
    export LANGCHAIN_API_KEY=
    export LANGCHAIN_PROJECT=
    
  4. Open localhost:3000 in your browser.
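
A minimal sketch of step 2.1, assuming you swap the loader used by ingest.py for one that reads your own data. DirectoryLoader, TextLoader, the ./my_docs path, and the chunking parameters are illustrative choices, not what ingest.py ships with, and import paths may differ between LangChain versions.

```python
from langchain.document_loaders import DirectoryLoader, TextLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter

# Load local markdown files instead of crawling the LangChain docs site.
docs = DirectoryLoader("./my_docs", glob="**/*.md", loader_cls=TextLoader).load()

# Chunk the documents the same way before embedding them into the Weaviate vectorstore.
chunks = RecursiveCharacterTextSplitter(chunk_size=4000, chunk_overlap=200).split_documents(docs)
```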

🚀 Important Links

Deployed version: chat.langchain.com

📚 Technical description

There are two components: ingestion and question-answering.

Ingestion has the following steps (a minimal sketch of the pipeline follows this list):

  1. Pull HTML from the documentation site as well as the GitHub codebase
  2. Load HTML with LangChain's RecursiveUrlLoader
  3. Transform HTML to text with LangChain's Html2TextTransformer
  4. Split documents with LangChain's RecursiveCharacterTextSplitter
  5. Create a vectorstore of embeddings, using LangChain's Weaviate vectorstore wrapper (with OpenAI's embeddings).
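
The ingest script in this repo implements these steps; the sketch below walks through the same pipeline under some assumptions: the langchain, weaviate-client, html2text, and openai packages are installed, the environment variables listed above are set, and the "LangChainDocs" index name, docs URL, and chunk sizes are illustrative. Exact class locations and signatures vary between LangChain versions.

```python
import os

import weaviate
from langchain.document_loaders import RecursiveUrlLoader
from langchain.document_transformers import Html2TextTransformer
from langchain.embeddings import OpenAIEmbeddings
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.vectorstores import Weaviate

# 1-2. Pull and load HTML from the documentation site.
raw_docs = RecursiveUrlLoader(url="https://python.langchain.com/docs/").load()

# 3. Transform HTML into plain text.
docs = Html2TextTransformer().transform_documents(raw_docs)

# 4. Split documents into overlapping chunks.
chunks = RecursiveCharacterTextSplitter(chunk_size=4000, chunk_overlap=200).split_documents(docs)

# 5. Embed the chunks with OpenAI and store them in Weaviate.
client = weaviate.Client(
    url=os.environ["WEAVIATE_URL"],
    auth_client_secret=weaviate.auth.AuthApiKey(os.environ["WEAVIATE_API_KEY"]),
)
Weaviate.from_documents(
    chunks,
    OpenAIEmbeddings(),
    client=client,
    index_name="LangChainDocs",  # illustrative index name
    by_text=False,
)
```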

Question-Answering has the following steps, all handled by OpenAIFunctionsAgent (a simplified sketch of steps 1-3 follows this list):

  1. Given the chat history and new user input, determine what a standalone question would be (using GPT-3.5).
  2. Given that standalone question, look up relevant documents from the vectorstore.
  3. Pass the standalone question and relevant documents to GPT-4 to generate and stream the final answer.
  4. Generate a trace URL for the current chat session, as well as the endpoint to collect feedback.
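
In the app these steps run inside the agent with streaming callbacks and LangSmith tracing; the sketch below only illustrates the three core model/retrieval calls under some assumptions: it reconnects to the illustrative "LangChainDocs" index from the ingestion sketch, the chat_history and question values stand in for what the frontend supplies, and prompts, model names, and k are placeholders rather than the repo's actual configuration.

```python
import os

import weaviate
from langchain.chat_models import ChatOpenAI
from langchain.embeddings import OpenAIEmbeddings
from langchain.vectorstores import Weaviate

# Connect to the same Weaviate index that ingestion populated.
client = weaviate.Client(
    url=os.environ["WEAVIATE_URL"],
    auth_client_secret=weaviate.auth.AuthApiKey(os.environ["WEAVIATE_API_KEY"]),
)
vectorstore = Weaviate(client, "LangChainDocs", "text", embedding=OpenAIEmbeddings(), by_text=False)

# Hypothetical inputs standing in for what the frontend sends.
chat_history = "Human: What is LangChain?\nAI: A framework for building LLM apps."
question = "How do I install it?"

# 1. Condense the chat history + new input into a standalone question (GPT-3.5).
condense_llm = ChatOpenAI(model_name="gpt-3.5-turbo", temperature=0)
standalone = condense_llm.predict(
    "Rephrase the follow-up question as a standalone question.\n\n"
    f"Chat history:\n{chat_history}\n\nFollow-up question: {question}\nStandalone question:"
)

# 2. Look up relevant documents from the vectorstore.
relevant_docs = vectorstore.as_retriever(search_kwargs={"k": 6}).get_relevant_documents(standalone)

# 3. Pass the standalone question and documents to GPT-4 for the final answer
#    (the real app streams this token by token to the UI).
context = "\n\n".join(doc.page_content for doc in relevant_docs)
answer_llm = ChatOpenAI(model_name="gpt-4", temperature=0, streaming=True)
print(answer_llm.predict(
    f"Answer the question using only the context below.\n\nContext:\n{context}\n\nQuestion: {standalone}"
))
```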

Deprecated Links

Hugging Face Space (to be updated soon): huggingface.co/spaces/hwchase17/chat-langchain

Blog Posts:
