The goal of this project is to measure the performance for querying a PostgreSQL database from a Java program. The Java Micro/Milli/Macro Harness (aka JMH) is used to perform the benchmarks. At the same time, the pg_stat_statements extension is used to measure the query times at the Postgres level.
With these tools, some basic queries are performed on a table and the performance -both at the Java and database side- is measured. This allow us to compute the overhead incurred by the Java driver and all the network traffic. To minimize the application overhead, the data is read and then discarded (using JMH's BlackHole, to avoid potential JVM optimizations).
This code has been used to support the content of some technical talks delivered by @ahachete on some meetups and conferences like JUG.RU or PGday.RU. You may check the uploaded slides and watch a recorded video of these talks.
-
Install
pg_stat_statements
to PostgreSQL. Editpostgresql.conf
to addpg_stat_statements
toshared_preload_libraries
. If in doubt, check the documentation. -
Connect to a PostgreSQL database and run there the
generate_data.sql
script. That will require approximately 1GB of disk space. -
Copy
src/main/resources/db.properties.sample
file tosrc/main/resources/db.properties
and edit at your convenience. -
Compile the code with maven:
mvn package
. The compiled package will betarget/benchmarks.jar
.
If you want to simply run the benchmarks, run the helper script:
run_benchmark.sh
This script generates output in the results/
folder. You may of course run individual tests manually:
java -jar target/benchmarks.jar <test_name>
Output will be written to stdout.
You may also use a Java sampling profiler to perform both normal and profile runs of the benchmark by specifying the -profile
flag to run_benchmark.sh
and setting up some environment variables:
PROFILE_AGENT_PATH=/path/to/liblagent.so PROFILE_OUTPUT_FILE=traces.txt ./run_benchmark.sh -profile
where the above command will be the required command to run the benchmarks while profiling with lightweight-java-profiler. All the output will be found in the results/
folder.
It follows the results obtained on one benchmark. The benchmark was run on a dedicated m3.large instance on AWS (with no other VMs on the same physical host), without profiling enabled:
Test | Postgres (ms) | Java (ms) | Overhead (%) |
---|---|---|---|
_1_Int | 3716.57 | 4032.06 | 8.49% |
_2_String | 3691.12 | 6698.53 | 81.48% |
_3_IntString | 5468.97 | 8842.70 | 61.69% |
_4_IntStringJson | 8935.48 | 19751.98 | 121.05% |
_5_IntStringColumnNumber | 6018.63 | 9143.73 | 51.92% |
_6_String_NoAutocommit | 3076.55 | 4293.38 | 39.55% |
Please note that test execution times vary significantly from run to run, and thus some results may look like inconsistent. But the main takeaway is that there is an overhead, and this overhead is consistently present.
It is also important to note that by specifying an explicitd fetch size and setting noAutoCommit (see test 6) the execution time improves significantly (and overhead is less) due to a much reduced GC.