Commit bd36874: Initial commit

danielmitterdorfer committed Dec 1, 2015 (0 parents)
Showing 29 changed files with 1,929 additions and 0 deletions.
97 changes: 97 additions & 0 deletions .gitignore
@@ -0,0 +1,97 @@
## https://github.com/github/gitignore/blob/master/Global/OSX.gitignore

.DS_Store
.AppleDouble
.LSOverride

# Icon must end with two \r
Icon


# Thumbnails
._*

# Files that might appear in the root of a volume
.DocumentRevisions-V100
.fseventsd
.Spotlight-V100
.TemporaryItems
.Trashes
.VolumeIcon.icns

# Directories potentially created on remote AFP share
.AppleDB
.AppleDesktop
Network Trash Folder
Temporary Items
.apdisk

## kinda based on https://github.com/github/gitignore/blob/master/Global/JetBrains.gitignore

*.iml

## Directory-based project format:
.idea/

## https://github.com/github/gitignore/blob/master/Python.gitignore

# Byte-compiled / optimized / DLL files
__pycache__/
*.py[cod]
*$py.class

# C extensions
*.so

# Distribution / packaging
.Python
env/
build/
develop-eggs/
dist/
downloads/
eggs/
.eggs/
lib/
lib64/
parts/
sdist/
var/
*.egg-info/
.installed.cfg
*.egg

# PyInstaller
# Usually these files are written by a python script from a template
# before PyInstaller builds the exe, so as to inject date/other infos into it.
*.manifest
*.spec

# Installer logs
pip-log.txt
pip-delete-this-directory.txt

# Unit test / coverage reports
htmlcov/
.tox/
.coverage
.coverage.*
.cache
nosetests.xml
coverage.xml
*,cover
.hypothesis/

# Translations
*.mo
*.pot

# Django stuff:
*.log

# Sphinx documentation
docs/_build/

# PyBuilder
target/

40 changes: 40 additions & 0 deletions README.md
@@ -0,0 +1,40 @@
Rally is the macrobenchmarking framework for Elasticsearch.

### Prerequisites

* Python 3.4+ available as `python3` on the path
* Python Elasticsearch client. Install via `pip3 install elasticsearch`
* (Optional) Python psutil library. Install via `pip3 install psutil`
* JDK 8+
* Gradle 2.8+
* git

Rally is only tested on Mac OS X and Linux.

### Getting started

* Clone this repo: `git clone [email protected]:elastic/rally.git`
* Open `rally/config.py` in the editor of your choice and change the configuration. This is not really convenient yet, but we're getting
there, promise. :) The idea is to have a setup script that will ask for those values and put them in `~/.rally/rally.cfg`.
* Run rally from the root project directory: `python3 rally/rally.py`.

### Command Line Options

Rally provides a list of supported command line options when it is invoked with `--help`.
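
For example, to list them (run from the root project directory, as described above):

    python3 rally/rally.py --help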

### Key Components of Rally

Note: This is only important if you want to hack on Rally and, to some extent, if you want to add new benchmarks. It is not that interesting if you are just using it.

* `Series`: Represents a class of benchmarking scenarios, e.g. a logging benchmark. It defines the data set to use.
* `Track`: A track is a concrete benchmark configuration, e.g. the logging benchmark with Elasticsearch default settings.
* `Mechanic`: A mechanic can build and prepare a benchmark candidate for the race. It checks out the source, builds Elasticsearch, provisions and starts the cluster.
* `Race Control`: Race control is responsible for proper setup of the race.
* `Telemetry`: Telemetry allows us to gather metrics during the race.
* `Driver`: Drives the race, i.e. it executes the benchmark according to the track specification.
* `Reporter`: A reporter tells us how the race went (currently only after the fact).

When implementing a new benchmark, create a new file in `track` and subclass `Series` and `Track`. See `track/logging_track.py` for an example.
Currently, race control does not pick up the new benchmark automatically, but adding support for that is coming soon.
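
A minimal sketch of what a new benchmark might look like (names are hypothetical; the import assumes `Series` and `Track` live in `track.track`, as the imports in `rally/driver/driver.py` suggest):

    # track/my_track.py (hypothetical example)
    from track.track import Series, Track


    class MySeries(Series):
        """A class of benchmarking scenarios; defines the data set to use."""


    class MyTrack(Track):
        """One concrete benchmark configuration within the series."""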

TODO dm: Add a nice diagram for a graphical overview of the key components and their dependencies
55 changes: 55 additions & 0 deletions TODO.md
@@ -0,0 +1,55 @@
### Immediate

* Fix report generation (needs to write everything properly to one file per track)
* Port latest changes from dev repo

---

### After that

* Support for running locally on dev machines:
* Externalize configuration -> hardcoded values must get out of config.py
* Easier setup / configuration and clear docs on how to get started (use setup.py for installing dependencies!)
* Support for downloading the benchmark file directly from rally (without boto!)
* Nice to have: Command line reporter showing a metrics summary (in addition to or instead of graphs)

* Support for running nightly benchmarks:
* Verification of rally against the original version (compare log files (single-threaded test)?)
* EC2 benchmarks
* Various minor bits like uploading the report to S3

* Backtesting support:
* Maven build
* Choose Java and build tool based on timestamp, commit id, ... -> already fleshed out in gear/gear.py
* Iteration loop around race control

* Triggering via Jenkins (at random times, how?)

* Rally auto-update (run some kind of "pre-script"? Maybe also in Jenkins?)

* Support for multiple benchmarks (not much missing for this one, structure already in place):
* Pick up benchmarks automatically
* Add an iteration loop in race control so it iterates over multiple benchmarks
* Proper reporting for multiple benchmarks -> spice up reporting by allowing multiple benchmarks with a menu structure
like in http://getbootstrap.com/examples/navbar/

* Open up the repo
* How can we split benchmark development from rally? (-> logging benchmark shouldn't be directly in rally)


### Further Ideas

* Support additional JVM options for the benchmark candidate by setting "ES_GC_OPTS" (e.g. for benchmarking with G1)
* Warn if there is not enough disk space on your ROOT_DIR (-> for data files)
* Introduce a tournament mode (candidate vs. baseline)
* Conceptual topics:
* Test thoroughly for bottlenecks in every place (I/O, CPU, benchmark driver, metrics gathering, etc. etc.)
* Account for warmup, multiple benchmark iterations (-> also check JIT compiler logs of benchmark candidate)
* Randomization of order in which benchmarks are run
* Account for coordinated omission
* Metrics reporting (latency distribution, not mean)
* Physically isolate benchmark driver from benchmark candidate
* Add ability to dig deeper (flamegraphs etc.)
* Add scalability benchmarks

TODO dm: Create GitHub issues
Empty file added rally/__init__.py
Empty file.
Empty file added rally/cluster/__init__.py
Empty file.
50 changes: 50 additions & 0 deletions rally/cluster/cluster.py
@@ -0,0 +1,50 @@
import time
import socket
import elasticsearch

from enum import Enum


class ClusterStatus(Enum):
    red = 1
    yellow = 2
    green = 3


# Represents the test candidate (i.e. Elasticsearch)
class Cluster:
    def __init__(self, servers):
        self._es = elasticsearch.Elasticsearch()
        self._servers = servers

    def servers(self):
        return self._servers

    # Just expose the client API directly (for now)
    def client(self):
        return self._es

    def wait_for_status(self, cluster_status):
        cluster_status_name = cluster_status.name
        print('\nTEST: wait for %s cluster...' % cluster_status_name)
        es = self._es
        t0 = time.time()
        while True:
            try:
                result = es.cluster.health(wait_for_status=cluster_status_name, wait_for_relocating_shards=0, timeout='1s')
            except (socket.timeout, elasticsearch.exceptions.ConnectionError, elasticsearch.exceptions.TransportError):
                pass
            else:
                print('GOT: %s' % str(result))
                print('ALLOC:\n%s' % es.cat.allocation(v=True))
                print('RECOVERY:\n%s' % es.cat.recovery(v=True))
                print('SHARDS:\n%s' % es.cat.shards(v=True))
                if result['status'] == cluster_status_name and result['relocating_shards'] == 0:
                    break
                else:
                    time.sleep(0.5)

        print('\nTEST: %s cluster done (%.1f sec)' % (cluster_status_name, time.time() - t0))
        print('\nTEST: cluster health: %s' % str(es.cluster.health()))
        print('SHARDS:\n%s' % es.cat.shards(v=True))
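
# A minimal usage sketch (hypothetical). Note that the servers argument is
# stored but not yet passed to the client, so the client talks to the
# elasticsearch-py default of localhost:9200:
#
#   c = Cluster(["localhost:9200"])
#   c.wait_for_status(ClusterStatus.green)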

80 changes: 80 additions & 0 deletions rally/config.py
@@ -0,0 +1,80 @@
from enum import Enum


class ConfigError(Exception):
    pass


class Scope(Enum):
    # Valid for all benchmarks
    globalScope = 1
    # A sole benchmark
    benchmarkScope = 2
    # Single benchmark track
    trackScope = 3
    # Property for every invocation, i.e. for backtesting
    invocationScope = 4


# TODO dm: Explicitly clean all values after they've lost their scope to avoid leaving behind outdated entries by accident
# Abstracts the configuration format.
class Config:
    # TODO dm: Later we'll use ConfigParser, for now it's just a map. ConfigParser uses sections and keys; we separate the section
    # from the key with a double colon ("::").
    _opts = {
        "source::local.src.dir": "/Users/dm/Downloads/scratch/rally/elasticsearch",
        "source::remote.repo.url": "[email protected]:elastic/elasticsearch.git",
        # TODO dm: Add support for Maven (-> backtesting)
        "build::gradle.bin": "/usr/local/bin/gradle",
        "build::gradle.tasks.clean": "clean",
        # "build::gradle.tasks.package": "check -Dtests.seed=0 -Dtests.jvms=12",
        # We just build the ZIP distribution directly for now (instead of the 'check' target)
        "build::gradle.tasks.package": "assemble",
        "build::log.dir": "/Users/dm/Downloads/scratch/rally/build_logs",
        # Where to install the benchmark candidate, i.e. Elasticsearch
        "provisioning::local.install.dir": "/Users/dm/Downloads/scratch/rally/install",

        "runtime::java8.home": "/Library/Java/JavaVirtualMachines/jdk1.8.0_60.jdk/Contents/Home",
        # TODO dm: Also add java7.home (and maybe we need to be more fine-grained, such as "java7update25.home", but we'll see...)

        # Where to download raw benchmark datasets?
        "benchmarks::local.dataset.cache": "/Users/dm/Projects/data/benchmarks",
        "benchmarks::metrics.stats.disk.device": "/dev/disk1",
        # separate directory from output file name for now; we may want to gather multiple reports and they should not overwrite each other
        "reporting::report.base.dir": "/Users/dm/Downloads/scratch/rally/reports",
        # We may want to consider output formats (console (summary), html, ES, ...)
        "reporting::output.html.report.filename": "index.html"
    }

    def __init__(self):
        pass

    def add(self, scope, section, key, value):
        self._opts[self._k(scope, section, key)] = value

    def opts(self, section, key, default_value=None, mandatory=True):
        try:
            scope = self._resolve_scope(section, key)
            return self._opts[self._k(scope, section, key)]
        except KeyError:
            if not mandatory:
                return default_value
            else:
                raise ConfigError("No value for mandatory configuration: section='%s', key='%s'" % (section, key))

    # recursively find the most narrow scope for a key
    def _resolve_scope(self, section, key, start_from=Scope.invocationScope):
        if self._k(start_from, section, key) in self._opts:
            return start_from
        elif start_from == Scope.globalScope:
            return None
        else:
            # continue search in the enclosing scope
            return self._resolve_scope(section, key, Scope(start_from.value - 1))

    def _k(self, scope, section, key):
        # keep global config keys a bit shorter / nicer for now
        if scope is None or scope == Scope.globalScope:
            return "%s::%s" % (section, key)
        else:
            return "%s::%s::%s" % (scope.name, section, key)
Empty file added rally/driver/__init__.py
Empty file.
21 changes: 21 additions & 0 deletions rally/driver/driver.py
@@ -0,0 +1,21 @@
import track.track as track
import cluster.cluster as cluster
import telemetry.metrics as m


# Benchmark runner
class Driver:
    def setup(self, cluster, track):
        track.setup_benchmark(cluster)
        cluster.wait_for_status(track.required_cluster_status())

    def go(self, cluster, track):
        metrics = m.MetricsCollector()
        # TODO dm: This is just here to ease the migration, consider gathering metrics for *all* tracks later
        if track.requires_metrics():
            metrics.startCollection(cluster)
        # TODO dm: I sense this is too concrete for a driver -> abstract this a bit later
        track.benchmark_indexing(cluster, metrics)
        metrics.stopCollection()
        # TODO dm: *Might* be interesting to gather metrics also for searching (esp. memory consumption) -> later
        track.benchmark_searching(cluster)
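
# A minimal usage sketch (hypothetical; concrete cluster and track instances
# would come from the mechanic and a track implementation respectively):
#
#   driver = Driver()
#   driver.setup(es_cluster, logging_track)
#   driver.go(es_cluster, logging_track)
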
Empty file added rally/mechanic/__init__.py
Empty file.
49 changes: 49 additions & 0 deletions rally/mechanic/builder.py
@@ -0,0 +1,49 @@
import os
import glob
import config

import utils.io as io


# can build an actual source tree
#
# Idea: Think about a "skip-build" flag for local use (or a pre-build check whether there is already a binary) (prio: low)
class Builder:
    def __init__(self, config, logger):
        self._config = config
        self._logger = logger

    def build(self):
        # just Gradle is supported for now
        self._clean()
        self._package()

    def _clean(self):
        self._exec("gradle.tasks.clean")

    def _package(self):
        self._exec("gradle.tasks.package")

    def _exec(self, task_key):
        src_dir = self._config.opts("source", "local.src.dir")
        gradle = self._config.opts("build", "gradle.bin")
        task = self._config.opts("build", task_key)
        dry_run = self._config.opts("system", "dryrun")
        # store logs for each invocation in a dedicated directory
        s = self._config.opts("meta", "time.start")
        timestamp = '%04d-%02d-%02d-%02d-%02d-%02d' % (s.year, s.month, s.day, s.hour, s.minute, s.second)
        log_dir = "%s/%s" % (self._config.opts("build", "log.dir"), timestamp)

        self._logger.info("Executing %s %s..." % (gradle, task))
        if not dry_run:
            io.ensure_dir(log_dir)
            # TODO dm: How should this be called?
            log_file = "%s/build.%s.log" % (log_dir, task_key)

            # FIXME dm: This is just disabled to skip the build for now. Reenable me again later
            # if not os.system("cd %s; %s %s > %s.tmp 2>&1" % (src_dir, gradle, task, log_file)):
            #     os.rename(("%s.tmp" % log_file), log_file)

            binary = glob.glob("%s/distribution/zip/build/distributions/*.zip" % src_dir)[0]

            # TODO dm: Not entirely sure, but it should be invocation scope as it might change across checkouts?
            self._config.add(config.Scope.invocationScope, "builder", "candidate.bin.path", binary)
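
# A minimal usage sketch (hypothetical; assumes a fully populated config and a
# standard library logger):
#
#   import logging
#   builder = Builder(cfg, logging.getLogger("rally.builder"))
#   builder.build()  # runs the configured Gradle "clean" and "assemble" tasks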
