page_type | languages | name | products | description | urlFragment
---|---|---|---|---|---
sample | | Vector search in Java | | Using azure-search-documents and Java, index and query vectors in a RAG pattern or a traditional search solution. | vector-search-java
The Java demo in this repository creates vectorized data on Azure AI Search and runs a series of queries, sending output to the terminal window.
- Create an index schema
- Load the sample data
- Embed the documents in-memory
- Index the vector and nonvector fields
- Run a series of vector and hybrid queries
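To make the last step more concrete, here is a minimal, self-contained sketch of what a pure vector query does conceptually: it ranks stored document vectors by their similarity to a query vector. This is not the azure-search-documents API and not the demo's code. The class name `VectorRankSketch`, the three-dimensional vectors, and the choice of cosine similarity are assumptions made for illustration; the service uses its own nearest-neighbor algorithms and whatever metric the index is configured with.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class VectorRankSketch {

    // Cosine similarity between two vectors of equal length.
    public static double cosine(double[] a, double[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("Vectors must have the same length");
        }
        double dot = 0.0, normA = 0.0, normB = 0.0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    // Returns the titles of the k documents most similar to the query, best first.
    public static List<String> topK(Map<String, double[]> docs, double[] query, int k) {
        List<Map.Entry<String, double[]>> entries = new ArrayList<>(docs.entrySet());
        entries.sort(Comparator.comparingDouble(
                (Map.Entry<String, double[]> e) -> cosine(e.getValue(), query)).reversed());
        List<String> titles = new ArrayList<>();
        for (int i = 0; i < Math.min(k, entries.size()); i++) {
            titles.add(entries.get(i).getKey());
        }
        return titles;
    }

    public static void main(String[] args) {
        // Made-up 3-dimensional "embeddings" (assumption); text-embedding-ada-002
        // vectors have 1536 dimensions.
        Map<String, double[]> docs = new LinkedHashMap<>();
        docs.put("Azure DevOps", new double[] {0.9, 0.1, 0.0});
        docs.put("Azure Blob Storage", new double[] {0.1, 0.9, 0.1});
        docs.put("Azure Pipelines", new double[] {0.8, 0.2, 0.1});
        double[] query = {1.0, 0.0, 0.0};
        System.out.println(topK(docs, query, 2)); // prints [Azure DevOps, Azure Pipelines]
    }
}
```

In the demo, the embedding model produces the vectors and the search service does the ranking; the sketch only shows why documents whose vectors point in a similar direction to the query score higher.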
The sample data is a JSON file of 108 descriptions of various Azure services. The descriptions are short, which makes data chunking unnecessary.
The queries are articulated as strings. An Azure OpenAI embedding model converts the strings to vectors at run time. Queries include a pure vector query, a vector query with filters, a hybrid query, a hybrid query with semantic ranking, and a multivector query.
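A hybrid query runs a keyword search and a vector search and then merges the two ranked result lists. Azure AI Search documents Reciprocal Rank Fusion (RRF) as the merging method. The following is a minimal sketch of the RRF idea only, not the service's implementation: the class name `RrfSketch`, the document IDs, and the constant `k = 60` (a commonly cited default) are assumptions for illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class RrfSketch {

    // Fuses several ranked lists of document IDs (best first) into one ranking.
    // Each appearance at 1-based rank r adds 1 / (k + r) to that document's score.
    public static List<String> fuse(List<List<String>> rankings, int k) {
        Map<String, Double> scores = new HashMap<>();
        for (List<String> ranking : rankings) {
            for (int i = 0; i < ranking.size(); i++) {
                scores.merge(ranking.get(i), 1.0 / (k + i + 1), Double::sum);
            }
        }
        List<String> fused = new ArrayList<>(scores.keySet());
        // Highest fused score first; ties are broken by ID so the output is deterministic.
        fused.sort((a, b) -> {
            int byScore = Double.compare(scores.get(b), scores.get(a));
            return byScore != 0 ? byScore : a.compareTo(b);
        });
        return fused;
    }

    public static void main(String[] args) {
        // Hypothetical result lists from a keyword query and a vector query.
        List<String> keywordHits = List.of("devops", "pipelines", "functions");
        List<String> vectorHits = List.of("pipelines", "app-service", "devops");
        // prints [pipelines, devops, app-service, functions]
        System.out.println(fuse(List.of(keywordHits, vectorHits), 60));
    }
}
```

Documents that appear near the top of both lists accumulate the highest fused score, which is why "pipelines" (ranks 2 and 1) edges out "devops" (ranks 1 and 3) in this example.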
For further exploration and chat interaction, connect to your index using Azure OpenAI Studio.
- An Azure subscription, with access to Azure OpenAI. You must have the Azure OpenAI service name and an API key.
- A deployment of the text-embedding-ada-002 embedding model.
- Azure AI Search, any tier, but choose a service that has sufficient capacity for your vector index. We recommend Basic or higher.
- A Java IDE. We used Visual Studio Code with the Java Extension for Visual Studio Code to test this sample.
- A Java JDK. We used openjdk version 17.0.7 to test this sample.
- Maven (installed locally, with %MAVEN_PATH% system variable assigned to the path). We used apache-maven-3.9.6 to test this sample.
- Clone this repository or download the folder.
- Rename the `.env-sample` file in the /demo-java/demo-vectors/src/resources directory to `.env`.
- Update the environment variables to point to your deployments.
- In Visual Studio Code, open the demo-vectors folder.
- Start a new terminal session, and type `java -version` and `mvn -version` to confirm that both programs are available.
- Run the following command: `mvn compile exec:java`
A successful execution should have output similar to this example:
    PS C:\test\azure-search-vector-samples\demo-java\demo-vectors> mvn compile exec:java
    [INFO] --- exec:3.1.1:java (default-cli) @ vectordemo ---
    Created index
    Embedding documents...
    Pausing after uploading documents...
    ===================================
    Single Vector Search from Embedding Results:
    ===================================
    Score: 0.829682, Title: Azure DevOps: Content: Azure DevOps is a suite of services that help you plan, build, and deploy applications. It includes Azure Boards for work item tracking, Azure Repos for source code management, Azure Pipelines for continuous integration and continuous deployment, Azure Test Plans for manual and automated testing, and Azure Artifacts for package management. DevOps supports a wide range of programming languages, frameworks, and platforms, making it easy to integrate with your existing development tools and processes. It also integrates with other Azure services, such as Azure App Service and Azure Functions.
    . . .
- Sign in to the Azure portal to confirm you have an index on Azure AI Search.
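The setup steps above rely on a `.env` file of environment variables. How the sample itself loads that file isn't described here; as a rough sketch under stated assumptions, a file of `KEY=VALUE` lines could be read with only the JDK as shown below. The class name `EnvFileSketch` and the variable names are hypothetical and are not the names used in `.env-sample`.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class EnvFileSketch {

    // Parses KEY=VALUE lines, skipping blank lines and lines that start with '#'.
    public static Map<String, String> parse(List<String> lines) {
        Map<String, String> values = new LinkedHashMap<>();
        for (String raw : lines) {
            String line = raw.trim();
            if (line.isEmpty() || line.startsWith("#")) {
                continue;
            }
            int eq = line.indexOf('=');
            if (eq <= 0) {
                continue; // not a KEY=VALUE line
            }
            String key = line.substring(0, eq).trim();
            String value = line.substring(eq + 1).trim();
            // Strip optional surrounding double quotes.
            if (value.length() >= 2 && value.startsWith("\"") && value.endsWith("\"")) {
                value = value.substring(1, value.length() - 1);
            }
            values.put(key, value);
        }
        return values;
    }

    // Convenience wrapper that reads the lines from a file on disk.
    public static Map<String, String> load(Path file) throws IOException {
        return parse(Files.readAllLines(file));
    }

    public static void main(String[] args) {
        // Hypothetical variable names and placeholder values, for illustration only.
        List<String> lines = List.of(
                "# search settings",
                "EXAMPLE_SEARCH_ENDPOINT=https://example.invalid",
                "EXAMPLE_INDEX_NAME=\"demo\"");
        // prints {EXAMPLE_SEARCH_ENDPOINT=https://example.invalid, EXAMPLE_INDEX_NAME=demo}
        System.out.println(parse(lines));
    }
}
```

A check like this can help confirm that every variable has a value before running `mvn compile exec:java`. Avoid printing real API keys to the terminal.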
- Sign in to Azure OpenAI Studio.
- On the left nav pane, under Playground, select Chat.
- In the chat playground, select Add your data, and then select Add a data source.
- Choose Azure AI Search.
- On the wizard's first page, select your search service and the Java demo index you just created.
- Select Add vector search to this search resource and acknowledge the billing effect of using Azure AI Search.
- Select an embedding model on your Azure OpenAI resource and acknowledge the billing effect.
- Important: Select the Use custom field mapping checkbox, which adds a page for data field mapping. This step matters if you have multiple vector fields in the index.
- Verify that both vector fields are listed in the data field mapping page.
- On the wizard's next page, choose the query type and, if you're using semantic ranking, acknowledge the billing effect. If you aren't sure whether semantic ranker is enabled, confirm it first.
- On the wizard's last page, review and create.
- On the Configuration tab to the right, choose gpt-35-turbo for the deployment.
- Start your first chat with "How many Azure services store data" and continue from there. You can modify settings on the chat playground's data source or in the Configuration tab to change the query behavior.