Kuvasz-Streamer is a Postgres-to-Postgres data consolidation and change data capture project.

Kuvasz-Streamer

Kuvasz-streamer is an open-source change data capture (CDC) project that focuses exclusively on Postgres. It is tightly integrated with Postgres logical replication to provide high-performance, low-latency replication.

Features

Lightweight

Kuvasz-streamer is a lightweight service written in Go that has no dependencies and no queuing. Run it as a system service or in a Docker container. It can run in a fully declarative mode, where the configuration map is stored in a read-only YAML file and no files are written to disk. This mode suits CI/CD pipeline-based configuration and Kubernetes deployments. An interactive, database-backed mode is also supported, in which the web interface can be used to modify the mapping configuration at runtime.

High-performance

Kuvasz-streamer uses the Postgres COPY protocol to perform the initial sync, then switches to the logical replication protocol for ongoing changes.

It opens multiple connections to the destination database and load-shares among them.

It batches updates into separate transactions to significantly increase performance.

To avoid overloading a production database server, it also supports global rate limiting.

Kuvasz-streamer was benchmarked at 10K tps with less than 1 second latency.

Batteries included

Kuvasz-streamer takes the pain out of managing publications and replication slots:

  • It creates missing publications and replication slots on startup
  • It adds and removes configured tables from publications automatically
  • It performs a full sync whenever a new table is added
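
The table bookkeeping in the list above amounts to a set difference between the configured tables and the tables currently in the publication. The sketch below shows one way to derive the corresponding `ALTER PUBLICATION ... ADD TABLE` and `DROP TABLE` statements. The function name `publicationDiff` and the publication name used in `main` are assumptions made for this example, not the project's actual code.

```go
package main

import (
	"fmt"
	"sort"
)

// publicationDiff returns the statements needed to make a publication's table
// list match the configured list. Hypothetical helper; identifiers are not
// quoted here, which real code would need to do.
func publicationDiff(pub string, configured, published []string) []string {
	want := make(map[string]bool, len(configured))
	for _, t := range configured {
		want[t] = true
	}
	have := make(map[string]bool, len(published))
	for _, t := range published {
		have[t] = true
	}

	var stmts []string
	for t := range want {
		if !have[t] { // configured but not yet published
			stmts = append(stmts, fmt.Sprintf("ALTER PUBLICATION %s ADD TABLE %s", pub, t))
		}
	}
	for t := range have {
		if !want[t] { // published but no longer configured
			stmts = append(stmts, fmt.Sprintf("ALTER PUBLICATION %s DROP TABLE %s", pub, t))
		}
	}
	sort.Strings(stmts) // map iteration order is random; sort for stable output
	return stmts
}

func main() {
	configured := []string{"public.orders", "public.customers"}
	published := []string{"public.customers", "public.legacy_audit"}
	for _, s := range publicationDiff("example_pub", configured, published) {
		fmt.Println(s)
	}
}
```

Every table that ends up in an `ADD TABLE` statement would also be a candidate for the full sync described in the last bullet.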

It is also fully observable, providing Prometheus metrics and extensive logging.

Flexible

Multiple table streaming modes are supported:

  • Clone: replicate the source table as-is
  • Append-only: replicate the source table but don't delete any records
  • History: Keep a full history of all changes with a timestamp
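
To make the difference between the three modes concrete, the following sketch maps each source operation to a destination action. It reflects one reading of the descriptions above and is not the project's implementation. The `Mode` and `Op` types and the action strings are hypothetical, and treating History as "insert a new timestamped row for every change" is an assumption.

```go
package main

import "fmt"

// Mode is a hypothetical enumeration of the three streaming modes.
type Mode int

const (
	Clone Mode = iota
	AppendOnly
	History
)

// Op is the kind of change received from the source table.
type Op string

const (
	Insert Op = "INSERT"
	Update Op = "UPDATE"
	Delete Op = "DELETE"
)

// destinationAction describes what an applier could do on the destination
// for a given mode and source operation, under the assumptions stated above.
func destinationAction(m Mode, op Op) string {
	switch m {
	case Clone:
		// The destination mirrors the source, so every operation is replayed.
		return "apply " + string(op)
	case AppendOnly:
		// Inserts and updates are replayed, deletes are not propagated.
		if op == Delete {
			return "skip"
		}
		return "apply " + string(op)
	case History:
		// Assumption: every change is recorded as a new timestamped row.
		return "insert timestamped version for " + string(op)
	}
	return "unknown mode"
}

func main() {
	names := map[Mode]string{Clone: "clone", AppendOnly: "append-only", History: "history"}
	for _, m := range []Mode{Clone, AppendOnly, History} {
		for _, op := range []Op{Insert, Update, Delete} {
			fmt.Printf("%-11s %-6s -> %s\n", names[m], op, destinationAction(m, op))
		}
	}
}
```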

Full Postgres support

Full PostgreSQL support is guaranteed with an extensive test suite:

  • All recent PostgreSQL versions
    • from 12 to 17
  • All data types
  • Partitions
  • Schemas
    • Source tables can be in any database and in any schema
    • Destination tables are in a single database and a single schema

API and web interface

The service provides an optional API and a web interface to easily manage publications and mapping.

Use cases

Kuvasz-streamer can be used for data consolidation, major version upgrades and other cases.

Microservice database consolidation

In a microservices architecture, each service has its own database. Kuvasz-streamer consolidates the databases of all services into a single data warehouse. The schema in the data warehouse does not have to match the schemas of the original services.

Multitenant database consolidation

In a sensitive multi-tenant environment, each tenant may be assigned a separate database to ensure that no cross-pollination of data occurs. Kuvasz-streamer can then be used to consolidate all the data in a single table with a tenant identifier to ease reporting.
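
A small sketch of the idea: rows from per-tenant databases are merged into one destination table, each tagged with a tenant identifier so that rows with the same primary key in different tenants stay distinct. The types `SourceRow` and `DestRow`, the function `consolidate` and the tenant names are hypothetical and only illustrate the shape of the result.

```go
package main

import (
	"fmt"
	"sort"
)

// SourceRow is a hypothetical row as it exists in one tenant's database.
type SourceRow struct {
	ID   int
	Name string
}

// DestRow is the same row in the consolidated table, tagged with its tenant.
type DestRow struct {
	Tenant string
	ID     int
	Name   string
}

// consolidate merges per-tenant rows into one slice ordered by (tenant, id).
// The pair (Tenant, ID) acts as the key of the consolidated table.
func consolidate(perTenant map[string][]SourceRow) []DestRow {
	var out []DestRow
	for tenant, rows := range perTenant {
		for _, r := range rows {
			out = append(out, DestRow{Tenant: tenant, ID: r.ID, Name: r.Name})
		}
	}
	sort.Slice(out, func(i, j int) bool {
		if out[i].Tenant != out[j].Tenant {
			return out[i].Tenant < out[j].Tenant
		}
		return out[i].ID < out[j].ID
	})
	return out
}

func main() {
	rows := consolidate(map[string][]SourceRow{
		"acme":   {{ID: 1, Name: "invoice A"}},
		"globex": {{ID: 1, Name: "invoice B"}, {ID: 2, Name: "invoice C"}},
	})
	for _, r := range rows {
		fmt.Printf("%s/%d: %s\n", r.Tenant, r.ID, r.Name)
	}
}
```

Both tenants above have a row with ID 1; the tenant column is what keeps them apart in the consolidated table.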

Database performance optimization

In a typical microservice architecture, historical data is kept to a minimum in order to provide quick query times and low latency to end users. However, historical data is important for AI/ML and reporting. Kuvasz-streamer can apply a no-delete strategy to selected tables, so that DELETE operations are not propagated to the destination. Example uses include transaction tables and audit history tables.

Postgres major version upgrade

Upgrading major versions of Postgres is a time-consuming task that requires substantial downtime. Kuvasz-streamer can be used to synchronize databases between different versions of Postgres and then perform a quick switchover.

Documentation

The documentation is available at https://streamer.kuvasz.io/

Installation

Check the Installation Guide in the documentation.

Getting started

Detailed instructions are available in the Getting started section of the documentation.

Discuss

All ideas and discussions are welcome. We use GitHub Discussions and Mattermost for that.

Pull Request Process

Add tests for your changes. Ensure the project builds and passes all tests.