Vector search in Python (Azure AI Search)

This repository contains multiple notebooks that demonstrate how to use Azure AI Search for vector and non-vector content in RAG patterns and in traditional search solutions.

| Sample | Description |
| --- | --- |
| basic-vector-workflow | Start here. Basic vector indexing and queries using push model APIs. The code reads the data/text-sample.json file, which contains the input strings for which embeddings are generated. Output is a combination of human-readable text and embeddings that's pushed into a search index. |
| community-integration/hugging-face | Hugging Face integration using the E5-small-V2 embedding model. |
| community-integration/langchain | LangChain integration using the Azure AI Search vector store integration module. |
| community-integration/llamaindex | LlamaIndex integration using llama_index.vector_stores.azureaisearch. |
| community-integration/cohere | Cohere integration using the Cohere Embed API. |
| custom-vectorizer | Use an open-source embedding model, such as the Hugging Face sentence-transformers all-MiniLM-L6-v2 model, to vectorize content and queries. This sample uses azd and Bicep to deploy Azure resources for a fully operational solution. It uses a custom skill with a function app that calls an embedding model. |
| data-chunking | Examples used in the Chunk documents article in the documentation. |
| index-backup-restore | Back up retrievable index fields and restore them in a new index on a different search service. |
| integrated-vectorization | Demonstrates integrated data chunking and vectorization (preview) using skills to split text and call an Azure OpenAI embedding model. |
| multimodal | Vectorize images using Azure AI Vision multimodal embeddings. In contrast with the multimodal-custom-skill example, this notebook uses the push API (no indexers or skillsets) for indexing. It calls the embedding model directly for a pure image vector search. |
| multimodal-custom-skill | End-to-end text-to-image sample that creates and calls a custom embedding model using a custom skill. Includes source code for an Azure function that calls the Azure AI Vision Image Retrieval REST API for text-to-image vectorization, and an azure-search-vector-image notebook covering all steps, from deployment to queries. |
| vector-quantization-and-storage-options | Shows how to use vector quantization and other storage options to reduce the storage consumed by vector fields. |
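To illustrate what the push-model samples do, the sketch below builds documents in roughly the shape that basic-vector-workflow pushes to an index (the field names `content` and `contentVector` and the toy 3-dimensional embeddings are assumptions for illustration; real embeddings from text-embedding-ada-002 have 1536 dimensions), then ranks them locally with cosine similarity to mimic the service's nearest-neighbor query. In the actual notebook, `SearchClient.upload_documents` and a vectorized query against Azure AI Search replace the local ranking.

```python
from math import sqrt

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Documents shaped like the payload pushed to a vector index:
# human-readable text plus one embedding per document.
documents = [
    {"id": "1", "content": "Azure AI Search overview", "contentVector": [0.9, 0.1, 0.0]},
    {"id": "2", "content": "Pricing tiers",            "contentVector": [0.1, 0.9, 0.2]},
    {"id": "3", "content": "Vector query syntax",      "contentVector": [0.8, 0.2, 0.1]},
]

def vector_search(query_vector, docs, k=2):
    """Local stand-in for a k-nearest-neighbors vector query."""
    ranked = sorted(
        docs,
        key=lambda d: cosine_similarity(query_vector, d["contentVector"]),
        reverse=True,
    )
    return ranked[:k]

top = vector_search([1.0, 0.0, 0.0], documents, k=2)
print([d["id"] for d in top])  # → ['1', '3']
```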

Prerequisites

To run the Python samples in this folder, you should have:

  • An Azure subscription, with access to Azure OpenAI.
  • Azure AI Search, any tier, although we strongly recommend Basic or higher for this workload.
  • Azure OpenAI. Most samples require a deployment of the text-embedding-ada-002 embedding model.
  • Python (these instructions were tested with version 3.11.x)

You can use Visual Studio Code with the Python extension for these demos.
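Most notebooks generate embeddings by calling the deployed model through the openai package's Azure client. A minimal sketch of that call pattern is shown below; the environment variable names and the deployment name are assumptions and may differ from those used in the samples.

```python
# Real usage requires the `openai` package and a live Azure OpenAI resource:
#   import os
#   from openai import AzureOpenAI
#   client = AzureOpenAI(
#       azure_endpoint=os.environ["AZURE_OPENAI_ENDPOINT"],  # assumed variable name
#       api_key=os.environ["AZURE_OPENAI_KEY"],              # assumed variable name
#       api_version="2024-02-01",
#   )

def embed(client, texts, deployment="text-embedding-ada-002"):
    """Return one embedding vector per input string.

    `deployment` must match the name of your Azure OpenAI deployment,
    which is not necessarily the model name.
    """
    response = client.embeddings.create(model=deployment, input=texts)
    return [item.embedding for item in response.data]
```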

Set up your environment

  1. Clone this repository.

  2. Create a .env file based on the code/.env-sample file. Copy your new .env file to the folder containing your notebook and update the variables.

  3. If you're using Visual Studio Code with the Python extension, make sure you also have the Jupyter extension.
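For reference, a .env file for these notebooks typically pairs search-service settings with Azure OpenAI settings. The variable names below are illustrative only; use the exact names from code/.env-sample.

```
AZURE_SEARCH_SERVICE_ENDPOINT=https://<your-search-service>.search.windows.net
AZURE_SEARCH_ADMIN_KEY=<your-search-admin-key>
AZURE_SEARCH_INDEX=<your-index-name>
AZURE_OPENAI_ENDPOINT=https://<your-openai-resource>.openai.azure.com
AZURE_OPENAI_KEY=<your-openai-key>
AZURE_OPENAI_EMBEDDING_DEPLOYMENT=<your-embedding-deployment>
```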

Run the code

  1. Open the code folder and a sample subfolder, and then open an .ipynb file in Visual Studio Code.

  2. Optionally, create a virtual environment so that you can control which package versions are used. Press Ctrl+Shift+P to open the Command Palette, search for Python: Create Environment, and select Venv to create an environment within the current workspace.

  3. Copy the .env file to the subfolder containing the notebook.

  4. Execute the cells one by one, or select Run All. Shift+Enter runs the current cell.

Troubleshoot errors

If you get a 429 error from Azure OpenAI, the resource is over capacity:

  • Check the Activity Log of the Azure OpenAI service to see what else might be running.

  • Check the Tokens Per Minute (TPM) on the deployed model. On a system that isn't running other jobs, a TPM of 33K or higher should be sufficient to generate vectors for the sample data. You can try a model with more capacity if 429 errors persist.

  • Review these articles for information on rate limits: Understanding rate limits and A Guide to Azure OpenAI Service's Rate Limits and Monitoring.
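If the 429 errors are transient, retrying with exponential backoff usually resolves them. The helper below is a minimal sketch, not code from the samples (in real code you would catch a specific exception such as openai.RateLimitError rather than matching on the message; the samples may instead rely on the SDK's built-in retry policy).

```python
import time

def with_backoff(fn, retries=5, base_delay=2.0, is_throttled=None, sleep=time.sleep):
    """Call fn(), retrying with exponential backoff when throttled.

    is_throttled(exc) decides whether an exception is a 429-style error;
    sleep is injectable so the backoff can be tested without waiting.
    """
    if is_throttled is None:
        is_throttled = lambda exc: "429" in str(exc)
    for attempt in range(retries):
        try:
            return fn()
        except Exception as exc:
            if not is_throttled(exc) or attempt == retries - 1:
                raise
            sleep(base_delay * (2 ** attempt))  # 2s, 4s, 8s, ...

# Example: a fake embedding call that is throttled twice, then succeeds.
calls = {"n": 0}
def flaky_embed():
    calls["n"] += 1
    if calls["n"] < 3:
        raise RuntimeError("429: rate limit exceeded")
    return [0.1, 0.2]

result = with_backoff(flaky_embed, sleep=lambda s: None)
print(result)  # → [0.1, 0.2] after two retries
```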