Azure Cosmos [DB] (brianfrankcooper#1264)
* Adding Azure Cosmos Driver.
Still some improvements to make, like automatically creating the YCSB database and usertable collection. But this does bring in all the latest SDKs, etc.
voellm authored and stfeng2 committed Nov 27, 2018
1 parent e6bd739 commit 2f03a2f
Showing 10 changed files with 779 additions and 0 deletions.
120 changes: 120 additions & 0 deletions azurecosmos/README.md
@@ -0,0 +1,120 @@
<!--
Copyright (c) 2018 YCSB contributors.
All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you
may not use this file except in compliance with the License. You
may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
implied. See the License for the specific language governing
permissions and limitations under the License. See accompanying
LICENSE file.
-->

## Azure Cosmos Quick Start

This section describes how to run YCSB on Azure Cosmos.

For more information on Azure Cosmos see
https://azure.microsoft.com/services/cosmos-db/

### 1. Setup
This benchmark expects you to have pre-created the database "ycsb" and
collection "usertable" before running the benchmark commands. You can
override the default database name with the azurecosmos.databaseName
configuration value.

Set the uri and primaryKey in the azurecosmos.properties file before running the commands below:

    $YCSB_HOME/bin/ycsb load azurecosmos -P workloads/workloada -P azurecosmos/conf/azurecosmos.properties
    $YCSB_HOME/bin/ycsb run azurecosmos -P workloads/workloada -P azurecosmos/conf/azurecosmos.properties

Alternatively, you can pass the uri and primaryKey on the command line:

    $YCSB_HOME/bin/ycsb load azurecosmos -P workloads/workloada -p azurecosmos.primaryKey=<key from the portal> -p azurecosmos.uri=<uri from the portal>
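
The command-line form above can be wrapped in a small helper that takes the two required values from the environment. The following is a minimal sketch, not part of the binding: the variable names `COSMOS_URI` and `COSMOS_KEY` and the function name `ycsb_cosmos_cmd` are assumptions made for illustration. It only prints the command line so that it can be reviewed before anything is executed.

```shell
#!/bin/sh
# Sketch: build the YCSB command line for the azurecosmos binding from
# environment variables. COSMOS_URI, COSMOS_KEY and ycsb_cosmos_cmd are
# hypothetical names chosen for this example.

ycsb_cosmos_cmd() {
  phase="$1"      # "load" or "run"
  workload="$2"   # e.g. workloads/workloada

  # Fail early if either required parameter is missing.
  if [ -z "${COSMOS_URI:-}" ] || [ -z "${COSMOS_KEY:-}" ]; then
    echo "COSMOS_URI and COSMOS_KEY must be set" >&2
    return 1
  fi

  # Print the command instead of running it, so it can be inspected first.
  printf '%s\n' "${YCSB_HOME:-.}/bin/ycsb $phase azurecosmos -P $workload -p azurecosmos.uri=$COSMOS_URI -p azurecosmos.primaryKey=$COSMOS_KEY"
}
```

With both variables exported, `ycsb_cosmos_cmd load workloads/workloada` prints the load command and `ycsb_cosmos_cmd run workloads/workloada` prints the matching run command.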

### 2. DocumentDB Configuration Parameters

#### Required parameters

- azurecosmos.uri < uri string > :
- Obtained from the portal and gives a path to your azurecosmos database
account. It will look like the following:
https://<your account name>.documents.azure.com:443/

- azurecosmos.primaryKey < key string > :
- Obtained from the portal and is the key to use for benchmarking. The
primary key is used to allow both read & write operations. If you are
doing read only workloads you can substitute the readonly key from the
portal.

#### Optional parameters

- azurecosmos.databaseName < name string > :
- Name of the database to use.
- Default: ycsb

- azurecosmos.useSinglePartitionCollection (true | false):
- It should be true if you created the collection with a single partition. If
you created the collection with a partitioning key, this value should be set
to false.
- Default: true

- azurecosmos.useUpsert (true | false):
- Set to true to allow inserts to update existing documents. If this is
false and a document already exists the insert will fail.
- Default: false

- azurecosmos.connectionMode (DirectHttps | Gateway):
- Some Java operations only work when connecting via the gateway. However,
the best performance for basic operations like those used by YCSB is
obtained by using direct mode, where the client connects directly to the
master server that is managing the database and collection.
- Default: DirectHttps

- azurecosmos.consistencyLevel (Strong | BoundedStaleness | Session | Eventual):
- This setting defines the level of consistency you want for reads/scans
following inserts/updates.
- Default: Session

- azurecosmos.maxRetryAttemptsOnThrottledRequests < integer >
- Sets the maximum number of retry attempts for throttled requests
- Default: uses default value of azurecosmos Java SDK

- azurecosmos.maxRetryWaitTimeInSeconds < integer >
- Sets the maximum wait time for retries, in seconds
- Default: uses default value of azurecosmos Java SDK

- azurecosmos.useHashQueryForScan (true | false):
- This setting indicates whether the SCAN operation should use a hash query instead of a range query.
Hash query: SELECT * FROM root r WHERE r.id = @startkey
Range query: SELECT TOP @recordcount * FROM root r WHERE r.id >= @startkey
- Default: false

- azurecosmos.maxDegreeOfParallelismForQuery < integer >
- Sets the maximum degree of parallelism for the FeedOptions used in Query operation
- Default: 0

- azurecosmos.includeExceptionStackInLog (true | false):
- Determines whether the full stack trace is included in the log when an error occurs.
- Default: false, to reduce log noise.

- azurecosmos.maxConnectionPoolSize < integer >
- This is the number of connections maintained for operations.
- See the Java SDK documentation for ConnectionPolicy.getMaxPoolSize.

- azurecosmos.idleConnectionTimeout < integer >
- This value is in seconds and determines how quickly a connection is recycled.
- See the Java SDK documentation for ConnectionPolicy.setIdleConnectionTimeout.

These parameters are also defined in a template configuration file in the
following location:

    $YCSB_HOME/azurecosmos/conf/azurecosmos.properties
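
As a rough illustration of how such a file might be filled in, the sketch below writes a small properties file that sets the two required parameters and spells out three optional ones at their documented defaults. The function name `write_cosmos_props` and its arguments are assumptions made for this example and are not part of the binding.

```shell
#!/bin/sh
# Sketch: generate a properties file for the azurecosmos binding.
# write_cosmos_props and its three arguments are hypothetical.

write_cosmos_props() {
  out_file="$1"   # path of the properties file to create
  uri="$2"        # account uri from the portal
  key="$3"        # primary (or read-only) key from the portal

  # The heredoc body is written verbatim, with $uri and $key expanded.
  cat > "$out_file" <<EOF
# Required parameters
azurecosmos.uri = $uri
azurecosmos.primaryKey = $key

# Optional parameters (values shown are the documented defaults)
azurecosmos.databaseName = ycsb
azurecosmos.consistencyLevel = Session
azurecosmos.useUpsert = false
EOF
}
```

The resulting file can then be passed to YCSB with `-P`, in the same way as the template file.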

### 3. FAQs

### 4. Example command

    ./bin/ycsb run azurecosmos -s -P workloads/workloadb -p azurecosmos.primaryKey=<your key eg:45fgt...==> -p azurecosmos.uri=https://<your account>.documents.azure.com:443/ -p recordcount=100 -p operationcount=100
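
A variation on the example command is to compare the documented consistency levels against the same workload. The sketch below prints one run command per level. The helper name `print_consistency_sweep` is hypothetical, and the sketch assumes that the uri and primaryKey live in a properties file and that a `-p` value given on the command line takes precedence over the same key in a `-P` file. Nothing is executed.

```shell
#!/bin/sh
# Sketch: print one YCSB run command per documented consistency level.
# print_consistency_sweep is a hypothetical helper; it only prints commands.

print_consistency_sweep() {
  props_file="$1"   # properties file holding azurecosmos.uri and azurecosmos.primaryKey

  for level in Strong BoundedStaleness Session Eventual; do
    printf '%s\n' "./bin/ycsb run azurecosmos -s -P workloads/workloadb -P $props_file -p azurecosmos.consistencyLevel=$level"
  done
}
```

For example, `print_consistency_sweep azurecosmos/conf/azurecosmos.properties` lists four commands that differ only in the `azurecosmos.consistencyLevel` value.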
56 changes: 56 additions & 0 deletions azurecosmos/conf/azurecosmos.properties
@@ -0,0 +1,56 @@
# Copyright (c) 2018 YCSB contributors. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License"); you
# may not use this file except in compliance with the License. You
# may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
# implied. See the License for the specific language governing
# permissions and limitations under the License. See accompanying
# LICENSE file.

# Azure Cosmos host uri (ex: https://p3rf.documents.azure.com:443/) and primary key
#azurecosmos.primaryKey =
#azurecosmos.uri =

# Database to be used, if not specified 'ycsb' will be used
#azurecosmos.databaseName = ycsb

# Enable/disable the use of a single-partition collection. If not specified, a single-partition collection is assumed by default
# "true" or "false"
#azurecosmos.useSinglePartitionCollection = true

# Specify if upsert should be used instead of createDocument
# If not specified, createDocument will be used by default
# "true" or "false"
#azurecosmos.useUpsert = false

# Specify if connection policy should use gateway or not
# If not specified, direct connectivity with better performance will be used by default
# Value can be DirectHttps or Gateway.
#azurecosmos.connectionMode = DirectHttps

# Specify consistency level, values can be Strong, BoundedStaleness, Session or Eventual
# If not specified, Session will be used by default
azurecosmos.consistencyLevel = Session

# Specify retry options to use in case of throttled request.
# If not specified, default values will be used
#azurecosmos.maxRetryAttemptsOnThrottledRequests = 9
#azurecosmos.maxRetryWaitTimeInSeconds = 30

# Specify if hash query should be used in SCAN operation instead of range query.
# If not specified, range query will be used by default.
#azurecosmos.useHashQueryForScan = true

# Specify if the 'id' property should be used in SCAN operation.
# If not specified, the 'docid' property will be used by default.
#azurecosmos.useIdPropertyForScan = true

# Specify the maximum degree of parallelism for the FeedOptions used in Query operation.
# If not specified it will take 0 as the default value.
#azurecosmos.maxDegreeOfParallelismForQuery = 0
88 changes: 88 additions & 0 deletions azurecosmos/pom.xml
@@ -0,0 +1,88 @@
<?xml version="1.0" encoding="UTF-8"?>
<!--
Copyright (c) 2018 YCSB contributors. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you
may not use this file except in compliance with the License. You
may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
implied. See the License for the specific language governing
permissions and limitations under the License. See accompanying
LICENSE file.
-->

<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/maven-v4_0_0.xsd">
<modelVersion>4.0.0</modelVersion>
<parent>
<groupId>com.yahoo.ycsb</groupId>
<artifactId>binding-parent</artifactId>
<version>0.16.0-SNAPSHOT</version>
<relativePath>../binding-parent</relativePath>
</parent>

<artifactId>azurecosmos-binding</artifactId>
<name>Azure Cosmos Binding</name>
<packaging>jar</packaging>

<properties>
<checkstyle.failOnViolation>false</checkstyle.failOnViolation>
</properties>

<dependencies>
<dependency>
<groupId>com.microsoft.azure</groupId>
<artifactId>azure-documentdb</artifactId>
<version>${azurecosmos.version}</version>
</dependency>
<dependency>
<groupId>org.slf4j</groupId>
<artifactId>slf4j-api</artifactId>
<version>1.7.5</version>
</dependency>
<dependency>
<groupId>org.slf4j</groupId>
<artifactId>slf4j-log4j12</artifactId>
<version>1.7.5</version>
</dependency>
<dependency>
<groupId>log4j</groupId>
<artifactId>log4j</artifactId>
<version>1.2.17</version>
</dependency>
<dependency>
<groupId>com.yahoo.ycsb</groupId>
<artifactId>core</artifactId>
<version>${project.version}</version>
<scope>provided</scope>
</dependency>
</dependencies>
<build>
<plugins>
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-checkstyle-plugin</artifactId>
<version>2.15</version>
<configuration>
<consoleOutput>true</consoleOutput>
<configLocation>../checkstyle.xml</configLocation>
<failOnViolation>true</failOnViolation>
<failsOnError>true</failsOnError>
</configuration>
<executions>
<execution>
<id>validate</id>
<phase>validate</phase>
<goals>
<goal>checkstyle</goal>
</goals>
</execution>
</executions>
</plugin>
</plugins>
</build>
</project>