Kuvasz-streamer is an open source change data capture (CDC) project that focuses exclusively on Postgres. It is tightly integrated with Postgres Logical Replication to provide high performance, low latency replication.
Kuvasz-streamer is a lightweight service written in Go that has no dependencies and no queuing. Run it as a system service or in a Docker container. It can run in a fully declarative mode where the configuration map is stored in a read-only YAML file and no files are written to disk. This mode is suitable for configuration driven by a CI/CD pipeline and for Kubernetes deployments. An interactive, database-backed mode is also supported, where the web interface can be used to modify the mapping configuration at runtime.
Kuvasz-streamer uses the Postgres COPY protocol to perform the initial sync and the logical replication protocol later.
It opens multiple connections to the destination database and load-shares among them.
It batches updates into separate transactions to significantly increase performance.
To avoid overloading a production database server, it also supports global rate limiting.
Kuvasz-streamer was benchmarked at 10K tps with less than 1 second latency.
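The batching and load-sharing ideas described above can be illustrated with a minimal Go sketch. This is not the project's actual code: the `Op` type and the `workerFor` and `splitBatches` helpers are hypothetical names, and routing by a hash of the table name is an assumption made here to keep changes for one table in order.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// Op is a hypothetical replicated change destined for one table.
type Op struct {
	Table string
	SQL   string
}

// workerFor maps a table name to one of n destination connections.
// Hashing on the table name keeps all changes for a table on the same
// connection, so their relative order is preserved (an assumption of
// this sketch, not a documented behavior). n must be greater than zero.
func workerFor(table string, n int) int {
	h := fnv.New32a()
	h.Write([]byte(table))
	return int(h.Sum32() % uint32(n))
}

// splitBatches groups ops into slices of at most size elements. Each
// slice would be applied inside a single destination transaction.
func splitBatches(ops []Op, size int) [][]Op {
	if size < 1 {
		size = 1
	}
	var batches [][]Op
	for len(ops) > 0 {
		n := size
		if len(ops) < n {
			n = len(ops)
		}
		batches = append(batches, ops[:n])
		ops = ops[n:]
	}
	return batches
}

func main() {
	ops := []Op{
		{"orders", "INSERT ..."},
		{"orders", "UPDATE ..."},
		{"customers", "INSERT ..."},
		{"orders", "DELETE ..."},
		{"customers", "UPDATE ..."},
	}
	for i, b := range splitBatches(ops, 2) {
		fmt.Printf("transaction %d: %d statements\n", i, len(b))
	}
	fmt.Println("orders goes to connection", workerFor("orders", 4))
}
```

Committing one transaction per batch rather than per statement reduces round trips and commit overhead, which is where most of the throughput gain from batching typically comes from.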
Kuvasz-streamer takes the pain out of managing publications and replication slots:
- It creates missing publications and replication slots on startup
- It adds and removes configured tables from publications automatically
- It performs a full sync whenever a new table is added
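To make the publication bookkeeping concrete, here is a small Go sketch of how the difference between the configured tables and the tables currently in a publication could be turned into `ALTER PUBLICATION` statements. The function name `publicationChanges` and its inputs are hypothetical and do not come from the project. Only the SQL syntax itself is standard PostgreSQL.

```go
package main

import (
	"fmt"
	"sort"
)

// publicationChanges returns the statements needed to make the publication
// contain exactly the configured tables. Identifiers are interpolated as-is
// to keep the sketch short; real code should quote them properly.
func publicationChanges(pub string, configured, published []string) []string {
	want := make(map[string]bool, len(configured))
	for _, t := range configured {
		want[t] = true
	}
	have := make(map[string]bool, len(published))
	for _, t := range published {
		have[t] = true
	}

	var stmts []string
	for t := range want {
		if !have[t] {
			stmts = append(stmts, fmt.Sprintf("ALTER PUBLICATION %s ADD TABLE %s", pub, t))
		}
	}
	for t := range have {
		if !want[t] {
			stmts = append(stmts, fmt.Sprintf("ALTER PUBLICATION %s DROP TABLE %s", pub, t))
		}
	}
	// Map iteration order is random, so sort for a stable result.
	sort.Strings(stmts)
	return stmts
}

func main() {
	configured := []string{"public.orders", "public.customers"}
	published := []string{"public.customers", "public.legacy"}
	for _, s := range publicationChanges("example_pub", configured, published) {
		fmt.Println(s)
	}
}
```

In this sketch, every table that appears in an `ADD TABLE` statement would also be a candidate for the full sync mentioned in the last bullet.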
It is also fully observable, providing Prometheus metrics and extensive logging.
Multiple table streaming modes are supported:
- Clone: replicate the source table as-is
- Append-only: replicate the source table but don't delete any records
- History: keep a full history of all changes with a timestamp
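The difference between the three modes is easiest to see by replaying the same change stream under each one. The following Go sketch uses an in-memory model. The `Change`, `Row`, and `replay` names are hypothetical, and the exact columns Kuvasz-streamer writes in history mode are not shown here. Treat this as an illustration of the semantics described in the list, not as the implementation.

```go
package main

import (
	"fmt"
	"sort"
	"time"
)

type Mode int

const (
	Clone Mode = iota
	AppendOnly
	History
)

// Change is a simplified source event. Kind is "insert", "update" or "delete".
type Change struct {
	Kind  string
	ID    int
	Value string
}

// Row is a simplified destination row. Deleted and At are only set in History mode.
type Row struct {
	ID      int
	Value   string
	Deleted bool
	At      time.Time
}

// replay applies changes to an empty destination table under the given mode.
func replay(mode Mode, changes []Change) []Row {
	if mode == History {
		// Every change becomes a new timestamped row. Nothing is overwritten.
		var rows []Row
		for _, c := range changes {
			rows = append(rows, Row{ID: c.ID, Value: c.Value, Deleted: c.Kind == "delete", At: time.Now()})
		}
		return rows
	}
	current := map[int]string{}
	for _, c := range changes {
		switch c.Kind {
		case "insert", "update":
			current[c.ID] = c.Value
		case "delete":
			if mode == Clone {
				delete(current, c.ID) // Append-only ignores deletes.
			}
		}
	}
	ids := make([]int, 0, len(current))
	for id := range current {
		ids = append(ids, id)
	}
	sort.Ints(ids)
	rows := make([]Row, 0, len(ids))
	for _, id := range ids {
		rows = append(rows, Row{ID: id, Value: current[id]})
	}
	return rows
}

func main() {
	changes := []Change{
		{"insert", 1, "a"}, {"update", 1, "b"}, {"insert", 2, "c"}, {"delete", 2, ""},
	}
	fmt.Println("clone rows:", len(replay(Clone, changes)))
	fmt.Println("append-only rows:", len(replay(AppendOnly, changes)))
	fmt.Println("history rows:", len(replay(History, changes)))
}
```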
Full PostgreSQL support is guaranteed with an extensive test suite:
- All recent PostgreSQL versions, from 12 to 17
- All data types
- Partitions
- Schemas
  - Source tables can be in any database and in any schema
  - Destination tables are in a single database and a single schema
The service provides an optional API and a web interface to easily manage publications and mapping.
Kuvasz-streamer can be used for data consolidation, major version upgrades and other cases.
In a microservices architecture, each service has its own database. Kuvasz-streamer consolidates the databases of all services into a single data warehouse. The schema in the data warehouse does not have to match the schemas of the original services.
In a sensitive multi-tenant environment, each tenant may be assigned a separate database to ensure that no cross-pollination of data occurs. Kuvasz-streamer can then be used to consolidate all the data in a single table with a tenant identifier to ease reporting.
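As an illustration of the multi-tenant case, the Go sketch below merges rows from per-tenant sources into one slice and tags each row with a tenant identifier. The `Invoice` and `TenantInvoice` types and the `consolidate` function are hypothetical. In a real deployment the tenant identifier would come from the mapping configuration, which is not shown here.

```go
package main

import (
	"fmt"
	"sort"
)

// Invoice is a row as it exists in each tenant's own database.
type Invoice struct {
	ID     int
	Amount float64
}

// TenantInvoice is the consolidated row: the source row plus a tenant identifier.
type TenantInvoice struct {
	Tenant string
	Invoice
}

// consolidate merges per-tenant rows into one table. The same ID may appear
// for several tenants, so (Tenant, ID) acts as the key in the destination.
func consolidate(perTenant map[string][]Invoice) []TenantInvoice {
	tenants := make([]string, 0, len(perTenant))
	for t := range perTenant {
		tenants = append(tenants, t)
	}
	sort.Strings(tenants) // stable output order

	var out []TenantInvoice
	for _, t := range tenants {
		for _, inv := range perTenant[t] {
			out = append(out, TenantInvoice{Tenant: t, Invoice: inv})
		}
	}
	return out
}

func main() {
	rows := consolidate(map[string][]Invoice{
		"acme":   {{ID: 1, Amount: 10}},
		"globex": {{ID: 1, Amount: 25}},
	})
	for _, r := range rows {
		fmt.Printf("%s/%d: %.2f\n", r.Tenant, r.ID, r.Amount)
	}
}
```

Because primary keys usually collide across tenant databases, the consolidated table needs a key that includes the tenant identifier.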
In a typical microservice architecture, historical data is kept to a minimum in order to provide quick query times and low latency to end users. However, historical data is important for AI/ML and reporting. Kuvasz-streamer implements a no-delete strategy for selected tables that does not propagate DELETE operations. Example uses include transaction tables and audit history tables.
Upgrading major versions of Postgres is a time-consuming task that requires substantial downtime. Kuvasz-streamer can be used to synchronize databases between different versions of Postgres and perform a quick switchover.
The documentation is available at https://streamer.kuvasz.io/
Check the Installation Guide in the documentation.
Detailed instructions are available in the Getting started section of the documentation.
All ideas and discussions are welcome. We use GitHub Discussions and Mattermost for that.
Add tests for your changes. Ensure the project builds and passes all tests.