README: Update the Postgres cluster schema (vitabaks#593)

zsolution-vn · Mar 4, 2024 · 8da0480 · 8da0480
1 parent 2cc4d6a
commit 8da0480
Show file tree

Hide file tree

Showing 3 changed files with 39 additions and 80 deletions.
diff --git a/README.md b/README.md
@@ -9,103 +9,65 @@
 
 ### Production-ready PostgreSQL High-Availability Cluster (based on "Patroni" and DCS "etcd" or "consul"). Automating with Ansible.
 
-This Ansible playbook is designed for deploying a PostgreSQL high availability cluster on dedicated physical servers for a production environment. The cluster can also be deployed on virtual machines and in the Cloud.
+The **postgresql_cluster** project is designed to deploy and manage high-availability PostgreSQL clusters in production environments. This solution is tailored for use on dedicated physical servers, virtual machines, and within both on-premises and cloud-based infrastructures.
 
-In addition to deploying new clusters, this playbook also support the deployment of cluster over already existing and running PostgreSQL. You can convert your basic PostgreSQL installation to a high availability cluster. Just specify the variable `postgresql_exists='true'` in the inventory file.
-**Attention!** Your PostgreSQL will be stopped before running in cluster mode (please plan for a short downtime of databases).
+This project not only facilitates the creation of new clusters but also offers support for integrating with pre-existing PostgreSQL instances. If you intend to upgrade your conventional PostgreSQL setup to a high-availability configuration, then just set `postgresql_exists=true` in the inventory file. Be aware that initiating cluster mode requires temporarily stopping your existing PostgreSQL service, which will lead to a brief period of database downtime. Please plan this transition accordingly.
+
+:trophy: **Use the [sponsoring](https://github.com/vitabaks/postgresql_cluster#sponsor-this-project) program to get personalized support, or just to contribute to this project.**
 
 ---
 
-:trophy: **Use the [sponsoring](https://github.com/vitabaks/postgresql_cluster#sponsor-this-project) program to get personalized support, or just to contribute to this project.**
+### Supported setups of Postgres Cluster
 
-## Index
-- [Cluster types](#cluster-types)
-    - [[Type A] PostgreSQL High-Availability with HAProxy Load Balancing](#type-a-postgresql-high-availability-with-haproxy-load-balancing)
-    - [[Type B] PostgreSQL High-Availability only](#type-b-postgresql-high-availability-only)
-    - [[Type C] PostgreSQL High-Availability with Consul Service Discovery (DNS)](#type-c-postgresql-high-availability-with-consul-service-discovery-dns)
-- [Compatibility](#compatibility)
-    - [Supported Linux Distributions:](#supported-linux-distributions)
-    - [PostgreSQL versions:](#postgresql-versions)
-    - [Ansible version](#ansible-version)
-- [Requirements](#requirements)
-- [Port requirements](#port-requirements)
-- [Recommendations](#recommenations)
-- [Deployment: quick start](#deployment-quick-start)
-- [Variables](#variables)
-- [Cluster Scaling](#cluster-scaling)
-    - [Steps to add a new postgres node](#steps-to-add-a-new-postgres-node)
-    - [Steps to add a new balancer node](#steps-to-add-a-new-balancer-node)
-- [Restore and Cloning](#restore-and-cloning)
-    - [Create cluster with pgBackRest:](#create-cluster-with-pgbackrest)
-    - [Create cluster with WAL-G:](#create-cluster-with-wal-g)
-    - [Point-In-Time-Recovery:](#point-in-time-recovery)
-- [Maintenance](#maintenance)
-    - [Changing PostgreSQL configuration parameters](#changing-postgresql-configuration-parameters) 
-    - [Update the PostgreSQL HA Cluster](#update-the-postgresql-ha-cluster)
-    - [PostgreSQL major upgrade](#postgresql-major-upgrade)
-- [Disaster Recovery](#disaster-recovery)
-    - [etcd](#etcd)
-    - [PostgreSQL (databases)](#postgresql-databases)
-- [How to start from scratch](#how-to-start-from-scratch)
-- [License](#license)
-- [Author](#author)
-- [Sponsor this project](#sponsor-this-project)
-- [Feedback, bug-reports, requests, ...](#feedback-bug-reports-requests-)
-
-## Cluster types
+![postgresql_cluster](images/postgresql_cluster.png)
 
 You have three schemes available for deployment:
 
-### [Type A] PostgreSQL High-Availability with HAProxy Load Balancing
-![TypeA](images/TypeA.png)
+#### 1. PostgreSQL High-Availability only
 
-> To use this scheme, specify `with_haproxy_load_balancing: true` in variable file vars/main.yml
+This is simple scheme without load balancing (used by default).
 
-This scheme provides the ability to distribute the load on reading. This also allows us to scale out the cluster (with read-only replicas).
+##### Components of high availability:
 
-- port 5000 (read / write) master
-- port 5001 (read only) all replicas
+- [**Patroni**](https://github.com/zalando/patroni) is a template for you to create your own customized, high-availability solution using Python and - for maximum accessibility - a distributed configuration store like ZooKeeper, etcd, Consul or Kubernetes. Used for automate the management of PostgreSQL instances and auto failover.
 
-###### if variable "synchronous_mode" is 'true' (vars/main.yml):
-- port 5002 (read only) synchronous replica only
-- port 5003 (read only) asynchronous replicas only
+- [**etcd**](https://github.com/etcd-io/etcd) is a distributed reliable key-value store for the most critical data of a distributed system. etcd is written in Go and uses the [Raft](https://raft.github.io/) consensus algorithm to manage a highly-available replicated log. It is used by Patroni to store information about the status of the cluster and PostgreSQL configuration parameters.
 
-> :heavy_exclamation_mark: Your application must have support sending read requests to a custom port (ex 5001), and write requests (ex 5000).
+[What is Distributed Consensus?](http://thesecretlivesofdata.com/raft/)
 
-##### Components of high availability:
-[**Patroni**](https://github.com/zalando/patroni) is a template for you to create your own customized, high-availability solution using Python and - for maximum accessibility - a distributed configuration store like ZooKeeper, etcd, Consul or Kubernetes. Used for automate the management of PostgreSQL instances and auto failover.
+To provide a single entry point (VIP) for database access is used "vip-manager".
 
-[**etcd**](https://github.com/etcd-io/etcd) is a distributed reliable key-value store for the most critical data of a distributed system. etcd is written in Go and uses the [Raft](https://raft.github.io/) consensus algorithm to manage a highly-available replicated log. It is used by Patroni to store information about the status of the cluster and PostgreSQL configuration parameters.
+- [**vip-manager**](https://github.com/cybertec-postgresql/vip-manager) (_optional, if the `cluster_vip` variable is specified_) is a service that gets started on all cluster nodes and connects to the DCS. If the local node owns the leader-key, vip-manager starts the configured VIP. In case of a failover, vip-manager removes the VIP on the old leader and the corresponding service on the new leader starts it there.
 
-[What is Distributed Consensus?](http://thesecretlivesofdata.com/raft/)
+- [**PgBouncer**](https://pgbouncer.github.io/features.html) (optional, if the `pgbouncer_install` variable is `true`) is a connection pooler for PostgreSQL.
 
-##### Components of load balancing:
-[**HAProxy**](http://www.haproxy.org/) is a free, very fast and reliable solution offering high availability, load balancing, and proxying for TCP and HTTP-based applications. 
+#### 2. PostgreSQL High-Availability with HAProxy Load Balancing
 
-[**confd**](https://github.com/kelseyhightower/confd) manage local application configuration files using templates and data from etcd or consul. Used to automate HAProxy configuration file management.
+To use this scheme, specify `with_haproxy_load_balancing: true` in variable file vars/main.yml
 
-[**Keepalived**](https://github.com/acassen/keepalived) provides a virtual high-available IP address (VIP) and single entry point for databases access.
-Implementing VRRP (Virtual Router Redundancy Protocol) for Linux.
-In our configuration keepalived checks the status of the HAProxy service and in case of a failure delegates the VIP to another server in the cluster.
+This scheme provides the ability to distribute the load on reading. This also allows us to scale out the cluster (with read-only replicas).
 
-[**PgBouncer**](https://pgbouncer.github.io/features.html) is a connection pooler for PostgreSQL.
+- port 5000 (read / write) master
+- port 5001 (read only) all replicas
 
+###### if variable "synchronous_mode" is 'true' (vars/main.yml):
+- port 5002 (read only) synchronous replica only
+- port 5003 (read only) asynchronous replicas only
 
-### [Type B] PostgreSQL High-Availability only
-![TypeB](images/TypeB.png)
+:heavy_exclamation_mark: Note: Your application must have support sending read requests to a custom port 5001, and write requests to port 5000.
 
-This is simple scheme without load balancing `Used by default`
+##### Components of load balancing:
 
-To provide a single entry point (VIP) for database access is used "vip-manager". If the variable `cluster_vip` is specified (optional).
+- [**HAProxy**](http://www.haproxy.org/) is a free, very fast and reliable solution offering high availability, load balancing, and proxying for TCP and HTTP-based applications. 
 
-[**vip-manager**](https://github.com/cybertec-postgresql/vip-manager) is a service that gets started on all cluster nodes and connects to the DCS. If the local node owns the leader-key, vip-manager starts the configured VIP. In case of a failover, vip-manager removes the VIP on the old leader and the corresponding service on the new leader starts it there. \
-Written in Go. Cybertec Schönig & Schönig GmbH https://www.cybertec-postgresql.com
+- [**confd**](https://github.com/kelseyhightower/confd) manage local application configuration files using templates and data from etcd or consul. Used to automate HAProxy configuration file management.
 
+- [**Keepalived**](https://github.com/acassen/keepalived)  (_optional, if the `cluster_vip` variable is specified_) provides a virtual high-available IP address (VIP) and single entry point for databases access.
+Implementing VRRP (Virtual Router Redundancy Protocol) for Linux. In our configuration keepalived checks the status of the HAProxy service and in case of a failure delegates the VIP to another server in the cluster.
 
-### [Type C] PostgreSQL High-Availability with Consul Service Discovery (DNS)
-![TypeC](images/TypeC.png)
+#### 3. PostgreSQL High-Availability with Consul Service Discovery (DNS)
 
-> To use this scheme, specify `dcs_type: consul` in variable file vars/main.yml
+To use this scheme, specify `dcs_type: consul` in variable file vars/main.yml
 
 This scheme is suitable for master-only access and for load balancing (using DNS) for reading across replicas. Consul [Service Discovery](https://developer.hashicorp.com/consul/docs/concepts/service-discovery) with [DNS resolving ](https://developer.hashicorp.com/consul/docs/discovery/dns) is used as a client access point to the database.
 
@@ -120,7 +82,7 @@ Example: `replica.postgres-cluster.service.dc1.consul`, `replica.postgres-cluste
 
 It requires the installation of a consul in client mode on each application server for service DNS resolution (or use [forward DNS](https://developer.hashicorp.com/consul/tutorials/networking/dns-forwarding?utm_source=docs) to the remote consul server instead of installing a local consul client).
 
----
+
 ## Compatibility
 RedHat and Debian based distros (x86_64)
 
@@ -218,15 +180,14 @@ If you’d prefer a cross-data center setup, where the replicating databases are
 There are quite a lot of things to consider if you want to create a really robust etcd cluster, but there is one rule: *do not placing all etcd members in your primary data center*. See some [examples](https://www.cybertec-postgresql.com/en/introduction-and-how-to-etcd-clusters-for-patroni/).
 
 
-- **How to prevent data loss in case of autofailover (synchronous_modes and pg_rewind)**:
+- **How to prevent data loss in case of autofailover (synchronous_modes)**:
 
 Due to performance reasons, a synchronous replication is disabled by default.
 
 To minimize the risk of losing data on autofailover, you can configure settings in the following way:
 - synchronous_mode: 'true'
 - synchronous_mode_strict: 'true'
 - synchronous_commit: 'on' (or 'remote_apply')
-- use_pg_rewind: 'false' (enabled by default)
 
 ---
 
@@ -591,13 +552,6 @@ A new installation can now be made from scratch.
 
 ---
 
-## License
-Licensed under the MIT License. See the [LICENSE](./LICENSE) file for details.
-
-## Author
-Vitaliy Kukharik (PostgreSQL DBA) \
-[email protected]
-
 ## Sponsor this project
 
 Join our sponsorship program to directly contribute to our project's growth and gain exclusive access to personalized support. Your sponsorship is crucial for innovation and progress. Become a sponsor today!
@@ -614,7 +568,12 @@ Support our work through a crypto wallet:
 
 USDT (TRC20): `TSTSXZzqDCUDHDjZwCpuBkdukjuDZspwjj`
 
----
+## License
+Licensed under the MIT License. See the [LICENSE](./LICENSE) file for details.
+
+## Author
+Vitaliy Kukharik (PostgreSQL DBA) \
+[email protected]
 
 ## Feedback, bug-reports, requests, ...
 Are [welcome](https://github.com/vitabaks/postgresql_cluster/issues)!
diff --git a/images/postgresql_cluster.dark_mode.png b/images/postgresql_cluster.dark_mode.png
diff --git a/images/postgresql_cluster.png b/images/postgresql_cluster.png