thingylabs/postgres

By thingylabs

Updated over 7 years ago

postgres cluster solution

Image
0

726

PostDock - Postgres + Docker

Postgres streaming replication cluster for docker swarm (Docker Swarm)

Schema of the example cluster

...

Start cluster with docker-compose

To start cluster run it as normal docker-compose application docker-compose up -d

Please check comments for each ENV variable in docker-compose.yml file to understand parameter for each cluster node

Configuring the cluster

You can configure any node of the cluster(postgres.conf) or pgpool(pgpool.conf) with ENV variable CONFIGS (format: variable1:value1[,variable2:value2[,...]]). Also see the Dockerfiles and docker-compose.yml files in the root of the repository to understand all available and used configurations!

Postgres

For the rest - you better follow the advise and look into the Postgres-latest.Dockerfile file - it full of comments :)

Pgpool

The most important part to configure in Pgpool (apart of general CONFIGS) is backends and users which could access these backends. You can configure backends with ENV variable. You can find good example of setting up pgpool in docker-compose.yml file:

DB_USERS: monkey_user:<monkey_pass|monkey_pass_file> # in format user:password[,user:password[...]]
BACKENDS: "0:pgmaster:5432:1:/var/lib/postgresql/data:ALLOW_TO_FAILOVER,1:pgslave1::::,3:pgslave3::::,2:pgslave2::::" #,4:pgslaveDOES_NOT_EXIST::::
            # in format num:host:port:weight:data_directory:flag[,...]
            # defaults:
            #   port: 5432
            #   weight: 1
            #   data_directory: /var/lib/postgresql/data
            #   flag: ALLOW_TO_FAILOVER
REQUIRE_MIN_BACKENDS: 3 # minimal number of backends to start pgpool (some might be unreachable)
Other configurations

See the Dockerfiles and docker-compose.yml files in the root of the repository to understand all available and used configurations!

Adaptive mode

'Adaptive mode' means that node will be able to decide if instead of acting as a master on it's start or switch to standby role. That possible if you pass PARTNER_NODES (comma separated list of nodes in the cluster on the same level). So every time container starts it will check if it was master before and if there is no new master around (from the list PARTNER_NODES), otherwise it will start as a new standby node with upstream = new master in the cluster.

Keep in mind: this feature does not work for cascade replication and you should not pass PARTNER_NODES to nodes on second level of the cluster. Instead of it just make sure that all nodes on the first level are running, so after restart any node from second level will be able to follow initial upstream from the first level. That also can mean - replication from second level potentially can connect to root master... Well not a big deal if you've decided to go with adaptive mode. But nevertheless you are able to play with NODE_PRIORITY environment variable and make sure entry point for second level of replication will never be elected as a new root master

Replication slots

If you want to disable the feature of Postgres>=9.4 - replication slots simply set ENV variable USE_REPLICATION_SLOTS=0 (enabled by default). So cluster will rely only on Postgres configuration wal_keep_segments (500 by default). You also should remember that default number for configuration max_replication_slots is 5. You can change it (as any other configuration) with ENV variable CONFIGS.

Health-checks

To make sure you cluster works as expected without 'split-brain' or other issues, you have to setup health-checks and stop container if any health-check returns non-zero result.

  • Postgres containers:
    • /usr/local/bin/cluster/healthcheck/is_major_master.sh - detect if node acts as a 'false'-master and there is another master - with more standbys
  • Pgpool
    • /usr/local/bin/pgpool/has_enough_backends.sh [REQUIRED_NUM_OF_BACKENDS, default=$REQUIRE_MIN_BACKENDS] - check if there are enough backend behind pgpool
    • /usr/local/bin/pgpool/has_write_node.sh - check if one of the backend can be used as a master with write access

Abnormal but possible situation in cluster:

Role      | Name  | Upstream | Connection String
----------+-------|----------|---------------------------------------------------------------------------------------------------------------------
* master  | node1 |          | user=replica_user password=replica_pass host=mysystem-db-node1-service dbname=replica_db port=5432 connect_timeout=2
  standby | node4 | node2    | user=replica_user password=replica_pass host=mysystem-db-node4-service dbname=replica_db port=5432 connect_timeout=2
* master  | node2 |          | user=replica_user password=replica_pass host=mysystem-db-node2-service dbname=replica_db port=5432 connect_timeout=2
  standby | node3 | node2    | user=replica_user password=replica_pass host=mysystem-db-node3-service dbname=replica_db port=5432 connect_timeout=2
  standby | node5 | node4    | user=replica_user password=replica_pass host=mysystem-db-node5-service dbname=replica_db port=5432 connect_timeout=2

Useful commands

  • Get map of current cluster(on any postgres node):
    • gosu postgres repmgr cluster show - tries to connect to all nodes on request ignore status of node in repmgr.nodes
    • gosu postgres psql $REPLICATION_DB -c "SELECT * FROM repmgr.nodes" - just select data from tables
  • Get pgpool status (on any pgpool node): PGPASSWORD=$POSTGRES_PASSWORD psql -U $POSTGRES_USER -h localhost template1 -c "show pool_nodes"
  • In pgpool container check if primary node exists: /usr/local/bin/pgpool/has_write_node.sh

Any command might be wrapped with docker-compose or kubectl - docker-compose exec {NODE} bash -c '{COMMAND}' or kubectl exec {POD_NAME} -- bash -c '{COMMAND}'

FAQ

Documentation and manuals

Docker Pull Command

docker pull thingylabs/postgres