BREAKING CHANGE - Change vector store initialize-schema to false
* Change default schema initialization of vector stores from `true` to `false`.
  Users need to explicitly opt in to schema initialization by setting the
  `initialize-schema` property on the corresponding vector store.
* Update integration tests
* Update docs

Fixes spring-projects#907
sobychacko authored and markpollack committed Jul 18, 2024
1 parent edf943e commit 50d34b8
Showing 57 changed files with 329 additions and 167 deletions.
@@ -78,6 +78,24 @@ Add these dependencies to your project:

TIP: Refer to the xref:getting-started.adoc#dependency-management[Dependency Management] section to add the Spring AI BOM to your build file.

== Configuration Properties

You can use the following properties in your Spring Boot configuration to customize the Apache Cassandra vector store.

|===
|Property|Default value

|`spring.ai.vectorstore.cassandra.keyspace`|springframework
|`spring.ai.vectorstore.cassandra.table`|ai_vector_store
|`spring.ai.vectorstore.cassandra.initialize-schema`|false
|`spring.ai.vectorstore.cassandra.index-name`|
|`spring.ai.vectorstore.cassandra.content-column-name`|content
|`spring.ai.vectorstore.cassandra.embedding-column-name`|embedding
|`spring.ai.vectorstore.cassandra.return-embeddings`|false
|`spring.ai.vectorstore.cassandra.fixed-thread-pool-executor-size`|16
|===



== Usage

@@ -12,7 +12,7 @@ link:https://azure.microsoft.com/en-us/products/ai-services/ai-search/[Azure AI

== Configuration

On startup, the `AzureVectorStore` can attempt to create a new index within your AI Search service instance if you've opted in by setting the relevant `initializeSchema` `boolean` property to `true` in the constructor or, if using Spring Boot, setting `...initialize-schema=true` in your `application.properties` file.
On startup, the `AzureVectorStore` can attempt to create a new index within your AI Search service instance if you've opted in by setting the relevant `initialize-schema` `boolean` property to `true` in the constructor or, if using Spring Boot, setting `...initialize-schema=true` in your `application.properties` file.


NOTE: this is a breaking change! In earlier versions of Spring AI, this schema initialization happened by default.
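
The opt-in described above, a boolean handed to the store when it is constructed, can be sketched in plain Java. Every class name below (`OptInSchemaSketch`, `CommonProps`, `DemoStoreProps`, `DemoVectorStore`) is a hypothetical stand-in, not a Spring AI type, and the "index creation" is only a flag flip. It is a minimal sketch of the pattern, assuming the store performs its schema work in a single startup hook.

```java
// Illustrative sketch with hypothetical names; these are not Spring AI classes.
public class OptInSchemaSketch {

    // Stand-in for a shared properties base class whose flag defaults to false.
    static class CommonProps {
        private boolean initializeSchema = false;

        public boolean isInitializeSchema() {
            return initializeSchema;
        }

        public void setInitializeSchema(boolean initializeSchema) {
            this.initializeSchema = initializeSchema;
        }
    }

    // Stand-in for a store-specific properties class that inherits the flag.
    static class DemoStoreProps extends CommonProps {
        private final String indexName = "demo-index";

        public String getIndexName() {
            return indexName;
        }
    }

    // Stand-in for a vector store that only creates its index when asked to.
    static class DemoVectorStore {
        private final String indexName;
        private final boolean initializeSchema;
        private boolean indexCreated = false;

        DemoVectorStore(String indexName, boolean initializeSchema) {
            this.indexName = indexName;
            this.initializeSchema = initializeSchema;
        }

        // Assumed startup hook: schema work happens here, and only on opt-in.
        void start() {
            if (initializeSchema) {
                indexCreated = true; // a real store would call its backend here
            }
        }

        boolean isIndexCreated() {
            return indexCreated;
        }

        String getIndexName() {
            return indexName;
        }
    }

    // Mirrors the shape of an auto-configuration factory method.
    static DemoVectorStore create(DemoStoreProps props) {
        DemoVectorStore store = new DemoVectorStore(props.getIndexName(), props.isInitializeSchema());
        store.start();
        return store;
    }

    public static void main(String[] args) {
        DemoStoreProps defaults = new DemoStoreProps();
        System.out.println(create(defaults).isIndexCreated()); // prints "false"

        DemoStoreProps optedIn = new DemoStoreProps();
        optedIn.setInitializeSchema(true);
        System.out.println(create(optedIn).isIndexCreated()); // prints "true"
    }
}
```

With the default left untouched, the sketch never creates the index, which is the behaviour an application relying on the old default would observe after upgrading.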
@@ -88,6 +88,24 @@ Add these dependencies to your project:

TIP: Refer to the xref:getting-started.adoc#dependency-management[Dependency Management] section to add the Spring AI BOM to your build file.

== Configuration Properties

You can use the following properties in your Spring Boot configuration to customize the Azure vector store.

|===
|Property|Default value

|`spring.ai.vectorstore.azure.url`|
|`spring.ai.vectorstore.azure.api-key`|
|`spring.ai.vectorstore.azure.initialize-schema`|false
|`spring.ai.vectorstore.azure.index-name`|spring_ai_azure_vector_store
|`spring.ai.vectorstore.azure.default-top-k`|4
|`spring.ai.vectorstore.azure.default-similarity-threshold`|0.0
|`spring.ai.vectorstore.azure.embedding-property`|embedding
|`spring.ai.vectorstore.azure.index-name`|spring-ai-document-index
|===


== Sample Code

To configure an Azure `SearchIndexClient` in your application, you can use the following code:
@@ -66,14 +66,14 @@ A simple configuration can either be provided via Spring Boot's _application.pro
[source,properties]
----
# Chroma Vector Store connection properties
spring.ai.vectorstore.chroma.client.initialize-schema=<true or false>
spring.ai.vectorstore.chroma.client.host=<your Chroma instance host>
spring.ai.vectorstore.chroma.client.port=<your Chroma instance port>
spring.ai.vectorstore.chroma.client.key-token=<your access token (if configured)>
spring.ai.vectorstore.chroma.client.username=<your username (if configured)>
spring.ai.vectorstore.chroma.client.password=<your password (if configured)>
# Chroma Vector Store collection properties
spring.ai.vectorstore.chroma.initialize-schema=<true or false>
spring.ai.vectorstore.chroma.collection-name=<your collection name>
# Chroma Vector Store configuration properties
@@ -117,6 +117,7 @@ You can use the following properties in your Spring Boot configuration to custom
|`spring.ai.vectorstore.chroma.client.username`| Access username (if configured) | -
|`spring.ai.vectorstore.chroma.client.password`| Access password (if configured) | -
|`spring.ai.vectorstore.chroma.collection-name`| Collection name | `SpringAiCollection`
|`spring.ai.vectorstore.chroma.initialize-schema`| Whether to initialize the required schema | `false`
|===

[NOTE]
@@ -140,6 +140,7 @@ Properties starting with the `spring.ai.vectorstore.elasticsearch.*` prefix are
|===
|Property | Description | Default Value

|`spring.ai.vectorstore.elasticsearch.initialize-schema`| Whether to initialize the required schema | `false`
|`spring.ai.vectorstore.elasticsearch.index-name` | The name of the index to store the vectors. | spring-ai-document-index
|`spring.ai.vectorstore.elasticsearch.dimensions` | The number of dimensions in the vector. | 1536
|`spring.ai.vectorstore.elasticsearch.similarity` | The similarity function to use. | `cosine`
@@ -44,6 +44,7 @@ You can use the following properties in your Spring Boot configuration to furthe

|`spring.ai.vectorstore.gemfire.host`|localhost
|`spring.ai.vectorstore.gemfire.port`|8080
|`spring.ai.vectorstore.gemfire.initialize-schema`| `false`
|`spring.ai.vectorstore.gemfire.index-name`|spring-ai-gemfire-store
|`spring.ai.vectorstore.gemfire.beam-width`|100
|`spring.ai.vectorstore.gemfire.max-connections`|16
@@ -120,11 +120,11 @@ You can use the following properties in your Spring Boot configuration to custom
|===
|Property| Description | Default value

|`spring.opensearch.uris`| URIs of the OpenSearch cluster endpoints. | -
|`spring.opensearch.username`| Username for accessing the OpenSearch cluster. | -
|`spring.opensearch.password`| Password for the specified username. | -
|`spring.opensearch.indexName`| Name of the default index to be used within the OpenSearch cluster. | `spring-ai-document-index`
|`spring.opensearch.mappingJson`| JSON string defining the mapping for the index; specifies how documents and their
|`spring.ai.vectorstore.opensearch.uris`| URIs of the OpenSearch cluster endpoints. | -
|`spring.ai.vectorstore.opensearch.username`| Username for accessing the OpenSearch cluster. | -
|`spring.ai.vectorstore.opensearch.password`| Password for the specified username. | -
|`spring.ai.vectorstore.opensearch.indexName`| Name of the default index to be used within the OpenSearch cluster. | `spring-ai-document-index`
|`spring.ai.vectorstore.opensearch.mappingJson`| JSON string defining the mapping for the index; specifies how documents and their
fields are stored and indexed. |
{
"properties":{
@@ -108,7 +108,7 @@ You can use the following properties in your Spring Boot configuration to custom

|`spring.ai.vectorstore.redis.uri`| Server connection URI | `redis://localhost:6379`
|`spring.ai.vectorstore.redis.index`| Index name | `default-index`
|`spring.ai.vectorstore.redis.initialize-schema`| whether to initialize the required schema | `false`
|`spring.ai.vectorstore.redis.initialize-schema`| Whether to initialize the required schema | `false`
|`spring.ai.vectorstore.redis.prefix`| Prefix | `default:`

|===
@@ -103,8 +103,9 @@ You can use the following properties in your Spring Boot configuration to custom
|`spring.ai.vectorstore.typesense.client.host`| Hostname | `localhost`
|`spring.ai.vectorstore.typesense.client.port`| Port | `8108`
|`spring.ai.vectorstore.typesense.client.apiKey`| ApiKey | `xyz`
|`spring.ai.vectorstore.typesense.collectionName`| Collection Name | `vector_store`
|`spring.ai.vectorstore.typesense.embeddingDimension`| Embedding Dimension | `1536`
|`spring.ai.vectorstore.typesense.initialize-schema`| Whether to initialize the required schema | `false`
|`spring.ai.vectorstore.typesense.collection-name`| Collection Name | `vector_store`
|`spring.ai.vectorstore.typesense.embedding-dimension`| Embedding Dimension | `1536`

|===
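
The hunk above renames the documented Typesense keys from camelCase to kebab-case. Spring Boot's relaxed binding generally accepts both spellings in a properties file, with kebab-case being the recommended form. The mapping between the two can be sketched as below. `KebabCaseSketch` and `toKebabCase` are hypothetical names for illustration and are not part of Spring Boot or Spring AI.

```java
// Illustrative sketch: converts a camelCase property segment to kebab-case.
// This is not Spring Boot's binder, only the naming convention it recommends.
public class KebabCaseSketch {

    static String toKebabCase(String camel) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < camel.length(); i++) {
            char c = camel.charAt(i);
            if (Character.isUpperCase(c)) {
                if (i > 0) {
                    sb.append('-'); // word boundary before each upper-case letter
                }
                sb.append(Character.toLowerCase(c));
            } else {
                sb.append(c);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(toKebabCase("collectionName"));     // prints "collection-name"
        System.out.println(toKebabCase("embeddingDimension")); // prints "embedding-dimension"
        System.out.println(toKebabCase("initializeSchema"));   // prints "initialize-schema"
    }
}
```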

@@ -5,6 +5,16 @@

* The configuration prefix for the Chroma Vector Store has been changed from `spring.ai.vectorstore.chroma.store` to `spring.ai.vectorstore.chroma` in order to align with the naming conventions of other vector stores.

* The default value of the `initialize-schema` property on vector stores capable of initializing a schema is now `false`.
Applications must now explicitly opt in to schema initialization on supported vector stores if the schema is expected to be created at application startup.
Not all vector stores support this property.
See the corresponding vector store documentation for more details.
The following vector stores currently do not support the `initialize-schema` property:

1. Hana
2. Pinecone
3. Weaviate
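
The effect of the new default on a properties file can be sketched with `java.util.Properties`: a store whose `initialize-schema` key is absent resolves to `false`, and only an explicit `true` opts in. This is a minimal sketch, not how Spring Boot binds configuration. `InitializeSchemaDefaultSketch` and its two helper methods are hypothetical names introduced here for illustration.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.Properties;

// Illustrative only: mimics the new default with java.util.Properties.
// Spring Boot's own property binding works differently.
public class InitializeSchemaDefaultSketch {

    // Hypothetical helper: resolves "<prefix>.initialize-schema", defaulting to false.
    static boolean initializeSchema(Properties props, String prefix) {
        return Boolean.parseBoolean(props.getProperty(prefix + ".initialize-schema", "false"));
    }

    // Hypothetical helper: parses properties-file text held in a string.
    static Properties parse(String text) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(text));
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
        return props;
    }

    public static void main(String[] args) {
        Properties props = parse(
                "spring.ai.vectorstore.redis.initialize-schema=true\n"
                + "spring.ai.vectorstore.typesense.collection-name=vector_store\n");

        // Redis opted in explicitly; Typesense did not set the key at all.
        System.out.println(initializeSchema(props, "spring.ai.vectorstore.redis"));     // prints "true"
        System.out.println(initializeSchema(props, "spring.ai.vectorstore.typesense")); // prints "false"
    }
}
```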

== Upgrading to 1.0.0.M1

On our march to release 1.0.0 M1 we have made several breaking changes. Apologies, it is for the best!
@@ -1,11 +1,33 @@
/*
* Copyright 2023-2024 the original author or authors.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* https://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
package org.springframework.ai.autoconfigure.vectorstore;

/**
* @author Josh Long
* @author Soby Chacko
*/
public class CommonVectorStoreProperties {

private boolean initializeSchema = true;
/**
* Vector stores do not initialize their schema by default on application startup.
* Applications must explicitly opt in to schema initialization on startup. The
* recommended way to do so is to set the initialize-schema
* property on the vector store. See {@link #setInitializeSchema(boolean)}.
*/
private boolean initializeSchema = false;

public boolean isInitializeSchema() {
return initializeSchema;
@@ -57,7 +57,7 @@ public GemFireVectorStore gemfireVectorStore(EmbeddingModel embeddingModel, GemF
.setVectorSimilarityFunction(properties.getVectorSimilarityFunction())
.setFields(properties.getFields())
.setSslEnabled(properties.isSslEnabled());
return new GemFireVectorStore(config, embeddingModel);
return new GemFireVectorStore(config, embeddingModel, properties.isInitializeSchema());
}

private static class PropertiesGemFireConnectionDetails implements GemFireConnectionDetails {
@@ -16,14 +16,15 @@

package org.springframework.ai.autoconfigure.vectorstore.gemfire;

import org.springframework.ai.autoconfigure.vectorstore.CommonVectorStoreProperties;
import org.springframework.ai.vectorstore.GemFireVectorStoreConfig;
import org.springframework.boot.context.properties.ConfigurationProperties;

/**
* @author Geet Rawat
*/
@ConfigurationProperties(GemFireVectorStoreProperties.CONFIG_PREFIX)
public class GemFireVectorStoreProperties {
public class GemFireVectorStoreProperties extends CommonVectorStoreProperties {

/**
* Configuration prefix for Spring AI VectorStore GemFire.
@@ -61,7 +61,8 @@ OpenSearchVectorStore vectorStore(OpenSearchVectorStoreProperties properties, Op
var indexName = Optional.ofNullable(properties.getIndexName()).orElse(OpenSearchVectorStore.DEFAULT_INDEX_NAME);
var mappingJson = Optional.ofNullable(properties.getMappingJson())
.orElse(OpenSearchVectorStore.DEFAULT_MAPPING_EMBEDDING_TYPE_KNN_VECTOR_DIMENSION_1536);
return new OpenSearchVectorStore(indexName, openSearchClient, embeddingModel, mappingJson);
return new OpenSearchVectorStore(indexName, openSearchClient, embeddingModel, mappingJson,
properties.isInitializeSchema());
}

@Configuration(proxyBeanMethods = false)
@@ -15,12 +15,13 @@
*/
package org.springframework.ai.autoconfigure.vectorstore.opensearch;

import org.springframework.ai.autoconfigure.vectorstore.CommonVectorStoreProperties;
import org.springframework.boot.context.properties.ConfigurationProperties;

import java.util.List;

@ConfigurationProperties(prefix = OpenSearchVectorStoreProperties.CONFIG_PREFIX)
public class OpenSearchVectorStoreProperties {
public class OpenSearchVectorStoreProperties extends CommonVectorStoreProperties {

public static final String CONFIG_PREFIX = "spring.ai.vectorstore.opensearch";

@@ -1,3 +1,19 @@
/*
* Copyright 2023-2024 the original author or authors.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* https://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/

package org.springframework.ai.autoconfigure.vectorstore.typesense;

import org.springframework.boot.context.properties.ConfigurationProperties;
@@ -57,7 +57,7 @@ public TypesenseVectorStore vectorStore(Client typesenseClient, EmbeddingModel e
.withEmbeddingDimension(properties.getEmbeddingDimension())
.build();

return new TypesenseVectorStore(typesenseClient, embeddingModel, config);
return new TypesenseVectorStore(typesenseClient, embeddingModel, config, properties.isInitializeSchema());
}

@Bean
@@ -1,13 +1,31 @@
/*
* Copyright 2023-2024 the original author or authors.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* https://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/

package org.springframework.ai.autoconfigure.vectorstore.typesense;

import org.springframework.ai.autoconfigure.vectorstore.CommonVectorStoreProperties;
import org.springframework.ai.vectorstore.TypesenseVectorStore;
import org.springframework.boot.context.properties.ConfigurationProperties;

/**
* @author Pablo Sanchidrian Herrera
* @author Soby Chacko
*/
@ConfigurationProperties(TypesenseVectorStoreProperties.CONFIG_PREFIX)
public class TypesenseVectorStoreProperties {
public class TypesenseVectorStoreProperties extends CommonVectorStoreProperties {

public static final String CONFIG_PREFIX = "spring.ai.vectorstore.typesense";

@@ -1,5 +1,5 @@
/*
* Copyright 2023 - 2024 the original author or authors.
* Copyright 2023-2024 the original author or authors.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
@@ -32,6 +32,7 @@
/**
* @author Christian Tzolov
* @author Eddú Meléndez
* @author Soby Chacko
*/
@AutoConfiguration
@ConditionalOnClass({ EmbeddingModel.class, WeaviateVectorStore.class })
@@ -72,8 +73,7 @@ public WeaviateVectorStore vectorStore(EmbeddingModel embeddingModel, WeaviateCl
.toList())
.withConsistencyLevel(properties.getConsistencyLevel());

return new WeaviateVectorStore(configBuilder.build(), embeddingModel, weaviateClient,
properties.isInitializeSchema());
return new WeaviateVectorStore(configBuilder.build(), embeddingModel, weaviateClient);
}

static class PropertiesWeaviateConnectionDetails implements WeaviateConnectionDetails {
@@ -27,7 +27,7 @@
* @author Christian Tzolov
*/
@ConfigurationProperties(WeaviateVectorStoreProperties.CONFIG_PREFIX)
public class WeaviateVectorStoreProperties extends CommonVectorStoreProperties {
public class WeaviateVectorStoreProperties {

public static final String CONFIG_PREFIX = "spring.ai.vectorstore.weaviate";

@@ -1,5 +1,5 @@
/*
* Copyright 2023 - 2024 the original author or authors.
* Copyright 2023-2024 the original author or authors.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
@@ -44,6 +44,7 @@

/**
* @author Christian Tzolov
* @author Soby Chacko
*/
@EnabledIfEnvironmentVariable(named = "AZURE_AI_SEARCH_API_KEY", matches = ".+")
@EnabledIfEnvironmentVariable(named = "AZURE_AI_SEARCH_ENDPOINT", matches = ".+")
@@ -68,7 +69,8 @@ public static String getText(String uri) {
.withConfiguration(AutoConfigurations.of(AzureVectorStoreAutoConfiguration.class))
.withUserConfiguration(Config.class)
.withPropertyValues("spring.ai.vectorstore.azure.apiKey=" + System.getenv("AZURE_AI_SEARCH_API_KEY"),
"spring.ai.vectorstore.azure.url=" + System.getenv("AZURE_AI_SEARCH_ENDPOINT"));
"spring.ai.vectorstore.azure.url=" + System.getenv("AZURE_AI_SEARCH_ENDPOINT"))
.withPropertyValues("spring.ai.vectorstore.azure.initialize-schema=true");

@BeforeAll
public static void beforeAll() {
@@ -81,8 +83,8 @@ public static void beforeAll() {
public void addAndSearchTest() {

contextRunner
.withPropertyValues("spring.ai.vectorstore.azure.indexName=my_test_index",
"spring.ai.vectorstore.azure.defaultTopK=6",
.withPropertyValues("spring.ai.vectorstore.azure.initializeSchema=true",
"spring.ai.vectorstore.azure.indexName=my_test_index", "spring.ai.vectorstore.azure.defaultTopK=6",
"spring.ai.vectorstore.azure.defaultSimilarityThreshold=0.75")
.run(context -> {

@@ -59,6 +59,7 @@ class CassandraVectorStoreAutoConfigurationIT {
.withConfiguration(
AutoConfigurations.of(CassandraVectorStoreAutoConfiguration.class, CassandraAutoConfiguration.class))
.withUserConfiguration(Config.class)
.withPropertyValues("spring.ai.vectorstore.cassandra.initialize-schema=true")
.withPropertyValues("spring.ai.vectorstore.cassandra.keyspace=test_autoconfigure")
.withPropertyValues("spring.ai.vectorstore.cassandra.contentColumnName=doc_chunk");

@@ -35,7 +35,7 @@ void defaultValues() {
assertThat(props.getContentColumnName()).isEqualTo(CassandraVectorStoreConfig.DEFAULT_CONTENT_COLUMN_NAME);
assertThat(props.getEmbeddingColumnName()).isEqualTo(CassandraVectorStoreConfig.DEFAULT_EMBEDDING_COLUMN_NAME);
assertThat(props.getIndexName()).isNull();
assertThat(props.getDisallowSchemaCreation()).isFalse();
assertThat(props.getDisallowSchemaCreation()).isTrue();
assertThat(props.getFixedThreadPoolExecutorSize())
.isEqualTo(CassandraVectorStoreConfig.DEFAULT_ADD_CONCURRENCY);
}