One repo to finally have a clear, objective gRPC benchmark with code for everyone to verify and improve.
Contributions are most welcome!
The goal of this benchmark is to compare the performance and resource usage of various gRPC libraries across different programming languages and technologies. To achieve that, a minimal protobuf contract is used so as not to pollute the results with other concepts (e.g. the performance of hash maps) and to keep the implementations simple.
That being said, the service implementations should NOT take advantage of that, and should keep the code generic and maintainable. No inline assembly or other language-specific tricks or hacks. What does generic mean? One should be able to easily adapt the existing code to some fundamental use cases (e.g. having a thread-safe hash map on the server side to provide values to the client given some key).
Although in the end the results are sorted according to the number of requests served, one should go beyond that and look at the resource usage - perhaps one implementation is slightly better in terms of raw speed but uses three times more CPU to achieve that. Maybe it's better to take the first one if you're running on a Raspberry Pi and want to get the most out of it. Maybe it's better to use the latter in a big server with 32 CPUs because it scales. It all depends on your use case. This benchmark is created to help people make an informed decision (and get ecstatic when their favourite technology seems really good, without doubts).
We try to provide some metrics to make this decision easier:
- req/s - the number of requests per second the service was able to serve successfully
- average latency and 90th/95th/99th percentiles - time from sending a request to receiving the response
- average CPU, memory - average resource usage during the benchmark, as reported by `docker stats`
Things this benchmark does not take into account:
- Completeness of the gRPC library. We test only basic unary RPC at the moment. This is the most common service method, which may be enough for some business use cases but not for others. When you're happy with the results of some technology, you should check out its documentation (if it exists) and decide for yourself whether it is production-ready.
- Taste. Some may find beauty in Ruby, some may feel like Java is the only real deal. Others treat languages as tools and don't care at all. We don't judge (officially 😉 ). Unless it's a huge state machine with raw `void` pointers. Oops!
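To make the latency metrics listed above more concrete, here is a minimal sketch of how an average and the 90th/95th/99th percentiles could be derived from raw latency samples. It is not taken from this repository: the function name `latency_summary` is hypothetical, the input is assumed to be one latency value in milliseconds per line, and the nearest-rank percentile method is an assumption (the benchmark tooling may compute percentiles differently).

```shell
#!/bin/sh
# latency_summary (hypothetical helper): reads one latency sample per line
# on stdin and prints the average plus nearest-rank p90/p95/p99.
latency_summary() {
  sort -n | awk '
    # Nearest-rank percentile: index = ceil(p * N / 100), clamped to >= 1.
    function pct(p,   i) {
      i = int((p * NR + 99) / 100)
      if (i < 1) i = 1
      return v[i]
    }
    { v[NR] = $1; sum += $1 }
    END {
      if (NR == 0) exit 1   # no samples, nothing to report
      printf "avg=%.2f p90=%s p95=%s p99=%s\n", sum / NR, pct(90), pct(95), pct(99)
    }'
}
```

For example, `printf '%s\n' 12 8 15 | latency_summary` would summarise three samples. With few samples the upper percentiles collapse onto the maximum, which is one reason to prefer longer benchmark runs.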
You need Linux or macOS with Docker. Keep in mind that the results on macOS may be less reliable, because Docker for Mac runs in a VM.
To build the benchmark images, use `./build.sh [BENCH1] [BENCH2] ...`. You need them to run the benchmarks.
To run the benchmarks, use `./bench.sh [BENCH1] [BENCH2] ...`. They will be run sequentially.
To clean up the benchmark images, use `./clean.sh [BENCH1] [BENCH2] ...`.
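The build and run steps above can be chained in a small wrapper. The following is only a sketch: `run_suite` and the `DRY_RUN` switch are hypothetical and not part of this repository, and it assumes it is executed from the repository root, where `build.sh` and `bench.sh` live.

```shell
#!/bin/sh
# run_suite (hypothetical helper): build the selected benchmark images,
# then run the same benchmarks. Stops at the first failing step.
# With DRY_RUN=1 the commands are only printed, not executed.
run_suite() {
  for step in build bench; do
    if [ "${DRY_RUN:-0}" = "1" ]; then
      echo "./$step.sh $*"
    else
      "./$step.sh" "$@" || return 1
    fi
  done
}
```

Calling `DRY_RUN=1 run_suite bench_a bench_b` (placeholder benchmark names) prints the two commands that would be executed. Cleaning up is deliberately left out, so that the images can be reused for further runs.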
The benchmark can be configured through the following environment variables:
Name | Description | Default value |
---|---|---|
GRPC_BENCHMARK_DURATION | Duration of the benchmark. | 30s |
GRPC_SERVER_CPUS | Maximum number of CPUs used by the server. | 1 |
GRPC_SERVER_RAM | Maximum memory used by the server. | 512m |
GRPC_CLIENT_CONNECTIONS | Number of connections to use. | 5 |
GRPC_CLIENT_CONCURRENCY | Number of requests to run concurrently. It can't be smaller than the number of connections. | 50 |
GRPC_CLIENT_QPS | Rate limit, in queries per second (QPS). | 0 (unlimited) |
GRPC_CLIENT_CPUS | Maximum number of CPUs used by the client. | 1 |
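
As an illustration of the table above, the sketch below resolves the two client settings with their documented defaults and checks the stated constraint that the concurrency cannot be smaller than the number of connections. The function `check_client_config` is hypothetical; how the benchmark scripts themselves validate these values is not shown here.

```shell
#!/bin/sh
# check_client_config (hypothetical helper): apply the documented defaults
# and verify that concurrency >= connections before starting a run.
check_client_config() {
  connections="${GRPC_CLIENT_CONNECTIONS:-5}"    # documented default: 5
  concurrency="${GRPC_CLIENT_CONCURRENCY:-50}"   # documented default: 50
  if [ "$concurrency" -lt "$connections" ]; then
    echo "GRPC_CLIENT_CONCURRENCY ($concurrency) must be >= GRPC_CLIENT_CONNECTIONS ($connections)" >&2
    return 1
  fi
  echo "connections=$connections concurrency=$concurrency"
}
```

The variables themselves are set in the environment of the run, for example `GRPC_BENCHMARK_DURATION=60s GRPC_SERVER_CPUS=2 ./bench.sh bench_a`, where `bench_a` is a placeholder benchmark name.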
You can find our sample results in the Wiki. Be sure to run the benchmarks yourself if you have sufficient hardware, especially for multi-core scenarios.