forked from langchain-ai/langchain
more query analysis docs (langchain-ai#18358)
Showing 10 changed files with 1,797 additions and 29 deletions.
190 changes: 190 additions & 0 deletions
docs/docs/use_cases/query_analysis/how_to/constructing-filters.ipynb
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,190 @@ | ||
{ | ||
"cells": [ | ||
{ | ||
"cell_type": "raw", | ||
"id": "df7d42b9-58a6-434c-a2d7-0b61142f6d3e", | ||
"metadata": {}, | ||
"source": [ | ||
"---\n", | ||
"sidebar_position: 6\n", | ||
"---" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"id": "f2195672-0cab-4967-ba8a-c6544635547d", | ||
"metadata": {}, | ||
"source": [ | ||
"# Construct Filters\n", | ||
"\n", | ||
"We may want to do query analysis to extract filters to pass into retrievers. One way to have the LLM represent these filters is as a Pydantic model. That model then has to be converted into a filter the retriever accepts.\n", | ||
"\n", | ||
"This can be done manually, but LangChain also provides \"Translators\" that translate from a common syntax into filters specific to each retriever. Here, we cover how to use those translators." | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": 13, | ||
"id": "8ca446a0", | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [ | ||
"from typing import Optional\n", | ||
"\n", | ||
"from langchain.chains.query_constructor.ir import (\n", | ||
"    Comparator,\n", | ||
"    Comparison,\n", | ||
"    Operation,\n", | ||
"    Operator,\n", | ||
"    StructuredQuery,\n", | ||
")\n", | ||
"from langchain.retrievers.self_query.chroma import ChromaTranslator\n", | ||
"from langchain.retrievers.self_query.elasticsearch import ElasticsearchTranslator\n", | ||
"from langchain_core.pydantic_v1 import BaseModel" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"id": "bc1302ff", | ||
"metadata": {}, | ||
"source": [ | ||
"In this example, `start_year` and `author` are both attributes to filter on." | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": 11, | ||
"id": "64055006", | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [ | ||
"class Search(BaseModel):\n", | ||
"    query: str\n", | ||
"    start_year: Optional[int]\n", | ||
"    author: Optional[str]" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": 12, | ||
"id": "44eb6d98", | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [ | ||
"search_query = Search(query=\"RAG\", start_year=2022, author=\"LangChain\")" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": 15, | ||
"id": "e8ba6705", | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [ | ||
"def construct_comparisons(query: Search):\n", | ||
"    comparisons = []\n", | ||
"    if query.start_year is not None:\n", | ||
"        comparisons.append(\n", | ||
"            Comparison(\n", | ||
"                comparator=Comparator.GT,\n", | ||
"                attribute=\"start_year\",\n", | ||
"                value=query.start_year,\n", | ||
"            )\n", | ||
"        )\n", | ||
"    if query.author is not None:\n", | ||
"        comparisons.append(\n", | ||
"            Comparison(\n", | ||
"                comparator=Comparator.EQ,\n", | ||
"                attribute=\"author\",\n", | ||
"                value=query.author,\n", | ||
"            )\n", | ||
"        )\n", | ||
"    return comparisons" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": 16, | ||
"id": "6a79c9da", | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [ | ||
"comparisons = construct_comparisons(search_query)" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": 17, | ||
"id": "2d0e9689", | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [ | ||
"_filter = Operation(operator=Operator.AND, arguments=comparisons)" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": 18, | ||
"id": "e4c0b2ce", | ||
"metadata": {}, | ||
"outputs": [ | ||
{ | ||
"data": { | ||
"text/plain": [ | ||
"{'bool': {'must': [{'range': {'metadata.start_year': {'gt': 2022}}},\n", | ||
" {'term': {'metadata.author.keyword': 'LangChain'}}]}}" | ||
] | ||
}, | ||
"execution_count": 18, | ||
"metadata": {}, | ||
"output_type": "execute_result" | ||
} | ||
], | ||
"source": [ | ||
"ElasticsearchTranslator().visit_operation(_filter)" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": 19, | ||
"id": "d75455ae", | ||
"metadata": {}, | ||
"outputs": [ | ||
{ | ||
"data": { | ||
"text/plain": [ | ||
"{'$and': [{'start_year': {'$gt': 2022}}, {'author': {'$eq': 'LangChain'}}]}" | ||
] | ||
}, | ||
"execution_count": 19, | ||
"metadata": {}, | ||
"output_type": "execute_result" | ||
} | ||
], | ||
"source": [ | ||
"ChromaTranslator().visit_operation(_filter)" | ||
] | ||
} | ||
], | ||
"metadata": { | ||
"kernelspec": { | ||
"display_name": "Python 3 (ipykernel)", | ||
"language": "python", | ||
"name": "python3" | ||
}, | ||
"language_info": { | ||
"codemirror_mode": { | ||
"name": "ipython", | ||
"version": 3 | ||
}, | ||
"file_extension": ".py", | ||
"mimetype": "text/x-python", | ||
"name": "python", | ||
"nbconvert_exporter": "python", | ||
"pygments_lexer": "ipython3", | ||
"version": "3.10.1" | ||
} | ||
}, | ||
"nbformat": 4, | ||
"nbformat_minor": 5 | ||
} |
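
To make the translator idea in the notebook concrete, here is a dependency-free sketch of what "translating from a common syntax into retriever-specific filters" can look like. It is not LangChain's implementation: `Cmp`, `And`, `build_filter`, `to_mongo_style`, and `to_es_style` are hypothetical names invented for this illustration, and the output shapes simply mimic the two results shown in the notebook. The sketch also assumes you want to skip the `AND` wrapper when there are fewer than two comparisons, which the notebook itself does not do.

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Optional, Union


@dataclass
class Cmp:
    """Hypothetical stand-in for one comparison: attribute <op> value."""
    op: str  # e.g. "eq" or "gt"
    attribute: str
    value: Any


@dataclass
class And:
    """Hypothetical stand-in for a logical AND over comparisons."""
    arguments: List[Cmp]


Node = Union[Cmp, And]


def build_filter(comparisons: List[Cmp]) -> Optional[Node]:
    """Combine comparisons, avoiding an AND with zero or one argument."""
    if not comparisons:
        return None
    if len(comparisons) == 1:
        return comparisons[0]
    return And(arguments=comparisons)


def to_mongo_style(node: Optional[Node]) -> Optional[Dict[str, Any]]:
    """Render the structure as a {'$and': [...]} / {'field': {'$op': v}} dict."""
    if node is None:
        return None
    if isinstance(node, Cmp):
        return {node.attribute: {f"${node.op}": node.value}}
    return {"$and": [to_mongo_style(arg) for arg in node.arguments]}


def to_es_style(node: Optional[Node]) -> Optional[Dict[str, Any]]:
    """Render the structure as a bool/must query over 'metadata.*' fields."""
    if node is None:
        return None
    if isinstance(node, Cmp):
        field = f"metadata.{node.attribute}"
        if node.op == "eq":
            # Assumption: string equality targets the '.keyword' sub-field,
            # mirroring the Elasticsearch output shown in the notebook.
            if isinstance(node.value, str):
                field += ".keyword"
            return {"term": {field: node.value}}
        return {"range": {field: {node.op: node.value}}}
    return {"bool": {"must": [to_es_style(arg) for arg in node.arguments]}}


comparisons = [
    Cmp(op="gt", attribute="start_year", value=2022),
    Cmp(op="eq", attribute="author", value="LangChain"),
]
combined = build_filter(comparisons)
print(to_mongo_style(combined))
print(to_es_style(combined))
```

The point of the design is that the query-analysis step only has to produce one intermediate structure; each backend gets its own small rendering function, so supporting another retriever means adding a renderer rather than changing how filters are extracted.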