Running Parquet Benchmarks

The Parquet benchmarks in this module are run using the OpenJDK Java Microbenchmark Harness (JMH).

First, build the parquet-benchmarks module. This creates an uber-jar that includes the Parquet classes, all dependencies, and a main class to launch the JMH tool.

mvn --projects parquet-benchmarks -amd -DskipTests -Denforcer.skip=true clean package

JMH has no notion of "benchmark suites", but some benchmarks make sense to group together or to run in isolation during development. Use the ./parquet-benchmarks/run.sh script to launch all or some benchmarks:

# Show more information about the run script and the available arguments.
./parquet-benchmarks/run.sh

# Show more information on the JMH options available.
./parquet-benchmarks/run.sh all -help

# Run every benchmark once (~20 minutes).
./parquet-benchmarks/run.sh all -wi 0 -i 1 -f 1

# A more rigorous run of all benchmarks, saving a report for comparison.
./parquet-benchmarks/run.sh all -wi 5 -i 5 -f 3 -rff /tmp/benchmark1.json

# Run a benchmark "suite" built into the script, with JMH defaults (about 30 minutes).
./parquet-benchmarks/run.sh checksum

# Run one specific benchmark using a regex.
./parquet-benchmarks/run.sh all org.apache.parquet.benchmarks.NestedNullWritingBenchmarks

# Manually clean up any state left behind from a previous run.
./parquet-benchmarks/run.sh clean
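
Saving a report "for comparison" implies at least two runs, each writing to its own file. The helper below is a minimal sketch of one way to keep those reports apart. It is not part of run.sh. The names run_labelled, RUN_SH, REPORT_DIR and DRY_RUN are hypothetical and exist only in this example, and it assumes run.sh forwards a trailing -rff option to JMH the same way it does in the commands above.

```shell
#!/usr/bin/env bash
# Sketch only: run_labelled, RUN_SH, REPORT_DIR and DRY_RUN are hypothetical
# names introduced for this example; they are not provided by run.sh.
RUN_SH="${RUN_SH:-./parquet-benchmarks/run.sh}"
REPORT_DIR="${REPORT_DIR:-/tmp}"

# run_labelled LABEL [run.sh arguments...]
# Appends "-rff REPORT_DIR/benchmark-LABEL.json" to the given arguments so
# that each run writes its report to a separate, recognizable file.
run_labelled() {
  local label="$1"
  shift
  local report="${REPORT_DIR}/benchmark-${label}.json"
  if [ -n "${DRY_RUN:-}" ]; then
    # Dry run: print the command instead of executing it.
    echo "${RUN_SH} $* -rff ${report}"
  else
    "${RUN_SH}" "$@" -rff "${report}"
  fi
}

# Example usage (commented out so that sourcing this file runs nothing):
#   run_labelled before all -wi 5 -i 5 -f 3
#   ... apply the change under test and rebuild the uber-jar ...
#   run_labelled after all -wi 5 -i 5 -f 3
```

Setting DRY_RUN=1 prints the command that would be executed, which is a cheap way to check the arguments before starting a long benchmark run.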