Folder name, readme updates for Java samples
HeidiSteen committed Feb 28, 2024
1 parent 2f4eacd commit d7ede8a
Showing 13 changed files with 122 additions and 25 deletions.
3 changes: 2 additions & 1 deletion README.md
@@ -36,7 +36,8 @@ Vector support consists of generally available features and preview features.

| Sample | Description | Status |
| ------ | ------------|--------|
| [integrated-vectorization](demo-java/integrated-vectorization/readme.md) | Demonstrates the integrated vectorization capabilities currently in preview, but also includes steps for indexing and querying vectors on an Azure AI Search service. | GA and preview |
| [demo-vectors](demo-java/demo-vectors/readme.md) | Basic workflow of vector indexing and querying on an Azure AI Search service. | GA |
| [demo-integrated-vectorization](demo-java/demo-integrated-vectorization/readme.md) | Demonstrates the integrated vectorization capabilities currently in preview, but also includes steps for indexing and querying vectors on an Azure AI Search service. | GA and preview |

## demo-javascript samples

@@ -5,10 +5,10 @@
<modelVersion>4.0.0</modelVersion>

<groupId>azure.search.sample</groupId>
<artifactId>vectorsearchjavademo</artifactId>
<artifactId>integratedvectors</artifactId>
<version>1.0-SNAPSHOT</version>

<name>vectorsearchjavademo</name>
<name>integratedvectors</name>

<properties>
<project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
@@ -1,24 +1,26 @@
---
page_type: sample
languages:
- javascript
name: Vector search in Java
- java
name: Integrated vectorization (Java)
products:
- azure
- azure-cognitive-search
- azure-openai
description: |
Using azure-search-documents and Java, index and query vectors in a RAG pattern or a traditional search solution.
urlFragment: vector-search-java
Using azure-search-documents and Java, apply data chunking and vectorization in an indexer pipeline.
urlFragment: integrated-vectors-java
---

# Vector search using Java (Azure AI Search)
# Integrated vectorization using Java (Azure AI Search)

The Java demo in this repository creates vectorized data on Azure AI Search. We recommend querying your vector data with Azure OpenAI Studio after your content is indexed.
This Java sample adds [integrated data chunking and vectorization](https://learn.microsoft.com/azure/search/vector-search-integrated-vectorization) to an indexing pipeline on Azure AI Search. We recommend querying your vector data with Azure OpenAI Studio after your content is indexed.

| Samples | Description |
|---------|-------------|
| **integrated-vectorization** | [Integrated Vectorization sample](#run-the-integrated-vectorization-sample-program). It uses **azure-search-documents** in the Azure SDK for Java. It sets up [integrated vectorization](https://learn.microsoft.com/azure/search/vector-search-integrated-vectorization) on a [blob container](https://learn.microsoft.com/azure/search/search-blob-storage-integration). |
+ Create an index schema, data source, skillset, and indexer
+ Load the sample data from Blob storage
+ Chunk the documents using the TextSplit skill
+ Embed the chunks using the AzureOpenAIEmbedding skill
+ Index the vector and nonvector fields
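
As a rough illustration of the chunking step listed above, the following sketch splits text into fixed-size pieces with a small overlap between neighbors. It is not the TextSplit skill's actual algorithm, which runs inside the indexer pipeline on the service; the class `ChunkSketch`, the method `chunk`, and the character-based lengths are assumptions made for this example.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative fixed-size chunker (hypothetical; not the TextSplit skill itself). */
class ChunkSketch {

    /**
     * Splits text into chunks of at most maxLength characters. Each chunk after
     * the first starts `overlap` characters before the previous chunk ended.
     */
    static List<String> chunk(String text, int maxLength, int overlap) {
        if (maxLength <= 0 || overlap < 0 || overlap >= maxLength) {
            throw new IllegalArgumentException("require 0 <= overlap < maxLength");
        }
        List<String> chunks = new ArrayList<>();
        int step = maxLength - overlap; // distance between consecutive chunk starts
        for (int start = 0; start < text.length(); start += step) {
            int end = Math.min(start + maxLength, text.length());
            chunks.add(text.substring(start, end));
            if (end == text.length()) {
                break; // this chunk already reached the end of the text
            }
        }
        return chunks;
    }

    public static void main(String[] args) {
        // Tiny input so the overlap is easy to see in the printed chunks.
        for (String piece : chunk("abcdefghij", 4, 1)) {
            System.out.println(piece);
        }
    }
}
```

The overlap keeps a little shared context between neighboring chunks, so a sentence that straddles a boundary is still partly present in both embeddings.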

## Prerequisites

@@ -40,7 +42,7 @@ The Java demo in this repository creates vectorized data on Azure AI Search. We

1. Clone this repository or download the folder.

1. Rename the `.env-sample` file in the /demo-java/integrated-vectorization/src/resources directory to `.env`.
1. Rename the `.env-sample` file in the /demo-java/demo-integrated-vectorization/src/resources directory to `.env`.

1. Update the environment variables to point to your deployments.

@@ -50,14 +52,12 @@ The Java demo in this repository creates vectorized data on Azure AI Search. We

1. Start a new terminal session, and type `java -version` and `mvn -version` to confirm program availability.

![Screenshot of console showing test output.](../docs/media/java-sample-test-versions.png)

1. Run the following command: `mvn compile exec:java`

A successful execution should have output similar to this example:

```bash
PS C:\test\azure-search-vector-samples\demo-java\integrated-vectorization> mvn compile exec:java
PS C:\test\azure-search-vector-samples\demo-java\demo-integrated-vectorization> mvn compile exec:java
[INFO] --- exec:3.1.1:java (default-cli) @ vectorsearchjavademo ---
Created index
Created datasource
@@ -78,13 +78,14 @@ The Java demo in this repository creates vectorized data on Azure AI Search. We
1. [Sign in to Azure OpenAI Studio](https://oai.azure.com/portal/).
1. On the left nav pane, under **Playground**, select **Chat**.
1. In the chat playground, select **Add your data**, and then select **Add a data source**.
1. Choose Azure AI Search.
1. Choose **Azure AI Search**.
1. On the wizard's first page, select your search service and the Java demo index you just created.
1. Select **Add vector search to this search resource** and acknowledge the billing effect of using Azure AI Search.
1. Select an embedding model on your Azure OpenAI resource and acknowledge the billing effect.
1. Skip the vector field mapping step. The sample index only has one vector field. The playground detects and uses it automatically.
1. On the wizard's next page, choose the query type and, if you use semantic ranking, acknowledge the billing effect. If you aren't sure, confirm that [semantic ranker is enabled](https://learn.microsoft.com/azure/search/semantic-how-to-enable-disable).
1. On the wizard's last page, review and create.
1. On the **Configuration** tab to the right, choose **gpt-35-turbo** for the deployment.
1. Start your first chat with "what are natural sources of light at night?" and continue from there. You can modify settings on the chat playground's data source or in the **Configuration** tab to change the query behavior.
:::image type="content" source="media/playground-chat.png" alt-text="Screenshot of the chat playground.":::
![Screenshot of the chat playground](https://github.com/Azure/azure-search-vector-samples/blob/main/demo-java/media/playground-chat.png?raw=true)
@@ -1,8 +1,8 @@
AZURE_SEARCH_ENDPOINT=https://your-search-service.search.windows.net
AZURE_SEARCH_INDEX=java-sample
AZURE_SEARCH_DATASOURCE=java-sample-ds
AZURE_SEARCH_SKILLSET=java-sample-skillset
AZURE_SEARCH_INDEXER=java-sample-indexer
AZURE_SEARCH_INDEX=java-integrate-vector
AZURE_SEARCH_DATASOURCE=java-integrate-vector-ds
AZURE_SEARCH_SKILLSET=java-integrate-vector-skillset
AZURE_SEARCH_INDEXER=java-integrate-vector-indexer
# Optional, not required if using RBAC authentication
AZURE_SEARCH_ADMIN_KEY=

4 changes: 2 additions & 2 deletions demo-java/sample/pom.xml → demo-java/demo-vectors/pom.xml
@@ -5,10 +5,10 @@
<modelVersion>4.0.0</modelVersion>

<groupId>azure.search.sample</groupId>
<artifactId>azuresearchquickstart</artifactId>
<artifactId>vectordemo</artifactId>
<version>1.0-SNAPSHOT</version>

<name>azuresearchquickstart</name>
<name>vectordemo</name>

<properties>
<project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
95 changes: 95 additions & 0 deletions demo-java/demo-vectors/readme.md
@@ -0,0 +1,95 @@
---
page_type: sample
languages:
- java
name: Vector search in Java
products:
- azure
- azure-cognitive-search
- azure-openai
description: |
Using azure-search-documents and Java, index and query vectors in a RAG pattern or a traditional search solution.
urlFragment: vector-search-java
---

# Basic vector demo using Java (Azure AI Search)

The Java demo in this repository creates vectorized data on Azure AI Search and runs a series of queries, sending output to the terminal window.

+ Create an index schema
+ Load the sample data
+ Embed the documents in-memory
+ Index the vector and nonvector fields
+ Run a series of vector and hybrid queries

The sample data is a JSON file of 108 descriptions of various Azure services. The descriptions are short, which makes data chunking unnecessary.

The queries are written as strings. An Azure OpenAI embedding model converts the strings to vectors at run time. Queries include a pure vector query, a vector query with filters, a hybrid query, a hybrid query with semantic ranking, and a multivector query.
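
To make the idea of a pure vector query concrete, here is a minimal in-memory sketch that ranks documents by similarity to a query vector. It assumes cosine similarity as the metric, which this readme does not state, and it is not how the sample or the search service executes queries. The class `VectorRankSketch`, the document names, and the two-dimensional vectors are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Illustrative brute-force vector ranking (hypothetical; assumes cosine similarity). */
class VectorRankSketch {

    /** Cosine similarity of two vectors with the same dimension. */
    static double cosine(double[] a, double[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("dimension mismatch");
        }
        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        if (normA == 0 || normB == 0) {
            return 0.0; // treat a zero vector as unrelated to everything
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    /** Returns the keys of the k documents most similar to the query, best first. */
    static List<String> topK(double[] query, Map<String, double[]> docs, int k) {
        List<Map.Entry<String, double[]>> entries = new ArrayList<>(docs.entrySet());
        // Sort in descending order of similarity to the query vector.
        entries.sort((x, y) -> Double.compare(
                cosine(query, y.getValue()), cosine(query, x.getValue())));
        List<String> result = new ArrayList<>();
        for (int i = 0; i < Math.min(k, entries.size()); i++) {
            result.add(entries.get(i).getKey());
        }
        return result;
    }

    public static void main(String[] args) {
        // Made-up document names and vectors; real embeddings have far more dimensions.
        Map<String, double[]> docs = Map.of(
                "doc-a", new double[] {1.0, 0.0},
                "doc-b", new double[] {0.0, 1.0},
                "doc-c", new double[] {0.9, 0.1});
        System.out.println(topK(new double[] {1.0, 0.0}, docs, 2));
    }
}
```

A search service typically uses an approximate index rather than a full scan like this one; the sketch only shows what "nearest vectors" means.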

For further exploration and chat interaction, connect to your index using Azure OpenAI Studio.

## Prerequisites

+ An Azure subscription, with [access to Azure OpenAI](https://aka.ms/oai/access). You must have the Azure OpenAI service name and an API key.

+ A deployment of the **text-embedding-ada-002** embedding model.

+ Azure AI Search, any tier, but choose a service that has sufficient capacity for your vector index. We recommend Basic or higher.

+ A Java IDE. We used [Visual Studio Code](https://code.visualstudio.com/download) with the [Java Extension for Visual Studio Code](https://marketplace.visualstudio.com/items?itemName=vscjava.vscode-java-pack) to test this sample.

+ A Java JDK. We used [openjdk version 17.0.7](https://learn.microsoft.com/java/openjdk/download) to test this sample.

+ Maven (installed locally, with %MAVEN_PATH% system variable assigned to the path). We used [apache-maven-3.9.6](https://maven.apache.org/download.cgi) to test this sample.

## Set up your environment

1. Clone this repository or download the folder.

1. Rename the `.env-sample` file in the /demo-java/demo-vectors/src/resources directory to `.env`.

1. Update the environment variables to point to your deployments.
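
If it helps to check the `.env` file before running the sample, the `KEY=VALUE` format can be read with plain Java, as in the sketch below. The sample's own loading code is not shown in this readme, so the class `EnvFileSketch` and its methods are hypothetical. Only the variable names come from `.env-sample`, and treating the endpoint and index name as required is an assumption.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Illustrative .env reader (hypothetical; not the loader the sample uses). */
class EnvFileSketch {

    /** Parses KEY=VALUE lines, skipping blank lines and lines that start with '#'. */
    static Map<String, String> parse(List<String> lines) {
        Map<String, String> values = new LinkedHashMap<>();
        for (String raw : lines) {
            String line = raw.trim();
            if (line.isEmpty() || line.startsWith("#")) {
                continue;
            }
            int eq = line.indexOf('=');
            if (eq <= 0) {
                continue; // no key before '=' or no '=' at all
            }
            values.put(line.substring(0, eq).trim(), line.substring(eq + 1).trim());
        }
        return values;
    }

    /** Returns the required keys that are absent or have an empty value. */
    static List<String> missing(Map<String, String> values, List<String> required) {
        List<String> result = new ArrayList<>();
        for (String key : required) {
            String value = values.get(key);
            if (value == null || value.isEmpty()) {
                result.add(key);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Inline lines stand in for the file; in practice they could come from
        // java.nio.file.Files.readAllLines(...) on the renamed .env file.
        Map<String, String> env = parse(List.of(
                "AZURE_SEARCH_ENDPOINT=https://your-search-service.search.windows.net",
                "AZURE_SEARCH_INDEX=java-vector-idx",
                "# Optional, not required if using RBAC authentication",
                "AZURE_SEARCH_ADMIN_KEY="));
        System.out.println("Missing required settings: "
                + missing(env, List.of("AZURE_SEARCH_ENDPOINT", "AZURE_SEARCH_INDEX")));
    }
}
```

The admin key is left out of the required list here because `.env-sample` marks it as optional when RBAC authentication is used.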

## Run the vector demo sample program

1. In Visual Studio Code, open the **demo-vectors** folder.

1. Start a new terminal session, and type `java -version` and `mvn -version` to confirm program availability.

1. Run the following command: `mvn compile exec:java`

A successful execution should have output similar to this example:

```bash
PS C:\test\azure-search-vector-samples\demo-java\demo-vectors> mvn compile exec:java
[INFO] --- exec:3.1.1:java (default-cli) @ vectordemo ---
Created index
Embedding documents...
Pausing after uploading documents...
===================================
Single Vector Search from Embedding Results:
===================================
Score: 0.829682, Title: Azure DevOps: Content: Azure DevOps is a suite of services that help you plan, build, and deploy applications. It includes Azure Boards for work item tracking, Azure Repos for source code management, Azure Pipelines for continuous integration and continuous deployment, Azure Test Plans for manual and automated testing, and Azure Artifacts for package management. DevOps supports a wide range of programming languages, frameworks, and platforms, making it easy to integrate with your existing development tools and processes. It also integrates with other Azure services, such as Azure App Service and Azure Functions.

. . .
```

1. [Sign in to the Azure portal](https://portal.azure.com) to confirm you have an index on Azure AI Search.

## Query your index in Azure OpenAI Studio

1. [Sign in to Azure OpenAI Studio](https://oai.azure.com/portal/).
1. On the left nav pane, under **Playground**, select **Chat**.
1. In the chat playground, select **Add your data**, and then select **Add a data source**.
1. Choose **Azure AI Search**.
1. On the wizard's first page, select your search service and the Java demo index you just created.
1. Select **Add vector search to this search resource** and acknowledge the billing effect of using Azure AI Search.
1. Select an embedding model on your Azure OpenAI resource and acknowledge the billing effect.
1. **Important**. Select the **Use custom field mapping** checkbox. It adds a page for data field mapping. This step is important if you have multiple vector fields in the index.
1. Verify that both vector fields are listed in the data field mapping page.
1. On the wizard's next page, choose the query type and, if you use semantic ranking, acknowledge the billing effect. If you aren't sure, confirm that [semantic ranker is enabled](https://learn.microsoft.com/azure/search/semantic-how-to-enable-disable).
1. On the wizard's last page, review and create.
1. On the **Configuration** tab to the right, choose **gpt-35-turbo** for the deployment.
1. Start your first chat with "How many Azure services store data" and continue from there. You can modify settings on the chat playground's data source or in the **Configuration** tab to change the query behavior.
![Screenshot of the chat playground](https://github.com/Azure/azure-search-vector-samples/blob/main/demo-java/media/playground-chat-azure-services.png?raw=true)
@@ -1,5 +1,5 @@
AZURE_SEARCH_ENDPOINT=https://your-search-service.search.windows.net
AZURE_SEARCH_INDEX=java-sample-openai
AZURE_SEARCH_INDEX=java-vector-idx
# Optional, not required if using RBAC authentication
AZURE_SEARCH_ADMIN_KEY=
