

"Plucking chrysanthemums by the eastern fence,
at ease I catch sight of the southern mountains."
— Tao Yuanming, Jin dynasty




"作者: 悠然 | 首发公众号: dba悠然(orasky666)"
<温馨提示:以下内容仅代表个人观点>






PART1: Foreword
Database high availability (HA) is one of the key safeguards of business continuity. Most domestic "Xinchuang" databases ship with fairly mature HA solutions, whereas the HA options for vanilla PostgreSQL vary widely in maturity, which makes choosing a stable, reliable HA solution all the more important.


Consider the following failure scenarios:
1) The primary node goes down and cannot be restarted;
2) The cluster state becomes abnormal because a standby hits a network-jitter timeout, a WAL gap, recovery after an instance restart, an accidental switchover, and so on.
Any seasoned DBA has likely stepped into one of these pits. To borrow a line from the industry: "Operations is like walking a tightrope over a cliff; one slip and there is no way back." Let's explore the correct "posture" for avoiding these pits, using PAF (pg_auto_failover) as the example.
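Throughout the walkthrough below, cluster health is read with pg_autoctl from the monitor side. A minimal sketch of the two read-only commands used for that (the --pgdata path is this article's lab layout, and the --count option for show events is assumed available in your pg_auto_failover version; adjust to your environment or set PGDATA instead):

# Current vs. assigned FSM state of every node in the formation
pg_autoctl show state --pgdata /data/pg_data

# Recent monitor-driven state transitions, newest last
pg_autoctl show events --pgdata /data/pg_data --count 20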

PART2: The Main Topic
Initial cluster state, as seen from the monitor:

[postgres@monitor ~]$ pg_autoctl show state
  Name |  Node |     Host:Port      |    LSN    | Reachable |  Current State  | Assigned State
-------+-------+--------------------+-----------+-----------+-----------------+----------------
node_1 |     1 | 172.20.70.204:5432 | 0/706AA30 |       yes |         primary |        primary
node_8 |     8 | 172.20.70.205:5432 | 0/706AA30 |       yes |       secondary |      secondary

-- node1 (node_1) is the primary
-- node2 (node_8) is the secondary
After the primary host goes down, the monitor demotes node_1 and moves node_8 to wait_primary:

[postgres@monitor ~]$ pg_autoctl show state
  Name |  Node |     Host:Port      |    LSN    | Reachable |  Current State  | Assigned State
-------+-------+--------------------+-----------+-----------+-----------------+----------------
node_1 |     1 | 172.20.70.204:5432 | 0/706AA68 |        no |         primary |        demoted
node_8 |     8 | 172.20.70.205:5432 | 0/706C4F0 |       yes |    wait_primary |   wait_primary

-- the old primary is assigned the demoted state
-- the standby is in wait_primary
The monitor log shows the transitions it drives; with no healthy standby candidate left, node_8 stays in wait_primary while the monitor keeps trying to pull the unreachable node_1 through catchingup:

[postgres@monitor pg_data]$
09:18:21 51157 INFO  Node 1 (172.20.70.204:5432) is marked as unhealthy by the monitor
09:18:21 51157 INFO  Setting goal state of node 1 "node_1" (172.20.70.204:5432) to catchingup after it became unhealthy.
09:18:21 51157 INFO  New state for node 1 "node_1" (172.20.70.204:5432): secondary ➜ catchingup
09:18:21 51157 INFO  Setting goal state of node 8 "node_8" (172.20.70.205:5432) to wait_primary because none of the 1 standby candidate nodes are healthy at the moment.
09:18:21 51157 INFO  New state for node 8 "node_8" (172.20.70.205:5432): primary ➜ wait_primary
09:18:21 51157 INFO  New state is reported by node 8 "node_8" (172.20.70.205:5432): "wait_primary"
09:18:21 51157 INFO  New state for node 8 "node_8" (172.20.70.205:5432): wait_primary ➜ wait_primary

pg_autoctl show state
  Name |  Node |     Host:Port      |    LSN    | Reachable |  Current State  | Assigned State
-------+-------+--------------------+-----------+-----------+-----------------+----------------
node_1 |     1 | 172.20.70.204:5432 | 0/B000148 |        no |       secondary |     catchingup
node_8 |     8 | 172.20.70.205:5432 | 0/B000148 |       yes |    wait_primary |   wait_primary
On node_8, Postgres is out of recovery and accepting writes:

postgres=# select pg_is_in_recovery();
 pg_is_in_recovery
-------------------
 f
(1 row)

postgres=# select name,setting from pg_settings where name='transaction_read_only';
         name          | setting
-----------------------+---------
 transaction_read_only | off
(1 row)
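With the survivor writable, it is also worth checking what it still knows about replication; a quick sketch using standard PostgreSQL catalog views, run on node_8:

-- No rows here means no standby is attached any more
select application_name, state, sync_state from pg_stat_replication;

-- Replication slots left behind by the dead peer retain WAL on disk;
-- review them before deciding what to clean up
select slot_name, active, restart_lsn from pg_replication_slots;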
However, a manual `pg_autoctl perform failover` fails, because the monitor can no longer find a primary in the group:

[postgres@db2 log]$ pg_autoctl perform failover
09:20:07 27761 INFO  Targetting group 0 in formation "default"
09:20:07 27761 ERROR Monitor ERROR:  couldn't find the primary node in formation "default", group 0
09:20:07 27761 ERROR SQL query: SELECT pgautofailover.perform_failover($1, $2)
09:20:07 27761 ERROR SQL params: 'default', '0'
09:20:07 27761 ERROR Failed to perform failover for formation default and group 0
09:20:07 27761 FATAL Failed to perform failover/switchover, see above for details
`pg_autoctl perform promotion` is rejected as well, since node_8 is in wait_primary rather than secondary:

[postgres@db2 log]$ pg_autoctl perform promotion
09:21:07 27887 ERROR Monitor ERROR:  cannot perform promotion: node node_8 in formation default has reported state "wait_primary" and is assigned state "wait_primary", promotion can only be performed when in state "secondary".
09:21:07 27887 ERROR SQL query: SELECT pgautofailover.perform_promotion($1, $2)
09:21:07 27887 ERROR SQL params: 'default', 'node_8'
09:21:07 27887 ERROR Failed to perform failover for node node_8 in formation default
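Both errors follow from pg_auto_failover's state machine: a failover needs a primary plus a healthy secondary, and a promotion only applies to a node in state "secondary"; node_8 is stuck in wait_primary because its only peer is gone. What the monitor believes can also be inspected directly on the monitor node; a sketch, assuming access to the monitor database (the pgautofailover schema belongs to the monitor extension, and its exact columns vary across versions):

-- On the monitor host: one row per registered node,
-- including reported and goal states
psql -d pg_auto_failover -c 'select * from pgautofailover.node;'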
Since the dead node_1 cannot be brought back, drop it from the formation on the monitor; with no peer left, node_8 converges to the standalone single state:

[postgres@monitor pg_data]$ pg_autoctl drop node --hostname=172.20.70.204 --pgdata=/data/pg_data/ --pgport=5432
09:24:31 51157 INFO  Removing node 1 "node_1" (172.20.70.204:5432) from formation "default" and group 0
09:24:31 51157 INFO  Setting number_sync_standbys to 0 for formation "default" now that we have -1 standby nodes set with replication-quorum.
09:24:31 51157 INFO  Setting goal state of node 8 "node_8" (172.20.70.205:5432) to single as there is no other node.
09:24:31 51157 INFO  New state for node 8 "node_8" (172.20.70.205:5432): wait_primary ➜ single
09:24:31 51157 INFO  New state is reported by node 8 "node_8" (172.20.70.205:5432): "single"
09:24:31 51157 INFO  New state for node 8 "node_8" (172.20.70.205:5432): single ➜ single
[postgres@monitor pg_data]$ pg_autoctl show state
  Name |  Node |     Host:Port      |    LSN    | Reachable |  Current State  | Assigned State
-------+-------+--------------------+-----------+-----------+-----------------+----------------
node_8 |     8 | 172.20.70.205:5432 | 0/B000148 |       yes |          single |         single
At this point a promotion is a no-op, since node_8 is already the primary of a one-node formation:

[postgres@monitor pg_data]$ pg_autoctl perform promotion --name node_8
09:28:03 63653 WARN  NOTICE:  cannot perform promotion: node node_8 in formation default is already a primary.
Now suppose the old primary host (db1) comes back, and Postgres is started there by hand, outside of pg_autoctl:

[postgres@db1 ~]$ pg_ctl start -D /data/pg_data/
pg_ctl: another server might be running; trying to start server anyway
waiting for server to start....
2024-05-15 09:32:51.902 CST [1466] LOG:  starting PostgreSQL 12.16 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-44), 64-bit
2024-05-15 09:32:51.903 CST [1466] LOG:  listening on IPv4 address "0.0.0.0", port 5432
2024-05-15 09:32:51.903 CST [1466] LOG:  listening on IPv6 address "::", port 5432
2024-05-15 09:32:51.905 CST [1466] LOG:  listening on Unix socket "/var/run/postgresql/.s.PGSQL.5432"
2024-05-15 09:32:51.906 CST [1466] LOG:  listening on Unix socket "/tmp/.s.PGSQL.5432"
2024-05-15 09:32:51.966 CST [1466] LOG:  redirecting log output to logging collector process
2024-05-15 09:32:51.966 CST [1466] HINT:  Future log output will appear in directory "log".
 done
server started

[postgres@db1 ~]$ psql
psql (12.16)
Type "help" for help.
It comes up in recovery, read-only:

postgres=# select pg_is_in_recovery();
 pg_is_in_recovery
-------------------
 t
(1 row)

postgres=# select name,setting from pg_settings where name='transaction_read_only';
         name          | setting
-----------------------+---------
 transaction_read_only | on
(1 row)
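Whether a restarted node comes up in recovery is decided by the signal files left in its data directory, so checking them before any manual start avoids surprises. A small sketch (paths as in this article's lab):

# standby.signal (or recovery.signal) present => the node starts in recovery
ls -l /data/pg_data/standby.signal /data/pg_data/recovery.signal 2>/dev/null

# pg_controldata also records the state the cluster was last in
/usr/pgsql-12/bin/pg_controldata /data/pg_data | grep -i 'cluster state'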
-- Manually promoting the old primary node_1 now creates a split-brain
[postgres@db1 ~]$ pg_ctl promote
waiting for server to promote.... done
server promoted

-- Check whether it is still in recovery (it is not)
postgres=# select pg_is_in_recovery();
 pg_is_in_recovery
-------------------
 f
(1 row)
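Both 204 and 205 now answer pg_is_in_recovery() = f: a textbook split-brain, with two nodes accepting writes independently. A quick way to see the divergence is to compare timelines and write positions on each host; a sketch:

# Each promotion bumps the timeline, so two independently writable
# nodes usually end up on different TimeLineIDs
/usr/pgsql-12/bin/pg_controldata /data/pg_data | grep -i timeline

# Current write position on a writable node
psql -Atc 'select pg_current_wal_insert_lsn();'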
To reuse node_1, first detach it cleanly from its pg_auto_failover leftovers:

a) Comment out the auto-generated include directives in postgresql.conf:

#include 'postgresql-auto-failover-standby.conf' # Auto-generated by pg_auto_failover, do not remove
#include 'postgresql-auto-failover.conf' # Auto-generated by pg_auto_failover, do not remove

b) Rename the auto-generated configuration files out of the way:

mv postgresql-auto-failover.conf postgresql-auto-failover.conf.bak
mv postgresql-auto-failover-standby.conf postgresql-auto-failover-standby.conf.bak

c) Comment out the standby-related parameters:

#hot_standby=off
#synchronous_standby_names=''

d) Restart for the changes to take effect:

pg_ctl restart -m fast -D /data/pg_data
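For repeatability, steps a), b) and d) can be scripted; a minimal sketch, assuming the layout above (the parameter edits from step c) are left manual here, and postgresql.conf should be backed up before running anything like this):

# Detach a node from pg_auto_failover management (illustrative)
PGDATA=/data/pg_data
cd "$PGDATA"

# a) comment out the auto-generated include directives (keeps an .orig copy)
sed -i.orig "s/^include 'postgresql-auto-failover/#&/" postgresql.conf

# b) park the generated files so nothing re-reads them
mv postgresql-auto-failover.conf         postgresql-auto-failover.conf.bak
mv postgresql-auto-failover-standby.conf postgresql-auto-failover-standby.conf.bak

# d) restart and confirm the node is standalone and writable
/usr/pgsql-12/bin/pg_ctl restart -m fast -D "$PGDATA"
psql -Atc 'select pg_is_in_recovery();'   # expect: f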
After the restart, node_1 is writable, and test data can be inserted:

postgres=# select pg_is_in_recovery();
 pg_is_in_recovery
-------------------
 f
(1 row)

postgres=# select name,setting from pg_settings where name='transaction_read_only';
         name          | setting
-----------------------+---------
 transaction_read_only | off
(1 row)

postgres=# create table t(id int);
CREATE TABLE
postgres=# insert into t select generate_series(1,2);
INSERT 0 2
Meanwhile, the monitor still knows only about node_8:

[postgres@monitor pg_data]$ pg_autoctl show state
  Name |  Node |     Host:Port      |    LSN    | Reachable |  Current State  | Assigned State
-------+-------+--------------------+-----------+-----------+-----------------+----------------
node_8 |     8 | 172.20.70.205:5432 | 0/B000148 |       yes |          single |         single
-- Remove the old pg_autoctl state files first; otherwise rejoining the cluster fails with:
-- "10:01:16 47610 FATAL The state file "/home/postgres/.local/share/pg_autoctl/data/pg_data/pg_autoctl.state" exists and there's no init in progress"
rm -Rf /home/postgres/.local

-- Register node_1 as a new standby of node_8 (the new cluster's primary)
sudo -u postgres /usr/pgsql-12/bin/pg_autoctl create postgres --pgdata /data/pg_data/ --auth trust --ssl-self-signed \
  --username postgres --dbname postgres --hostname 172.20.70.204 --pgctl /usr/pgsql-12/bin/pg_ctl \
  --monitor 'postgres://autoctl_node@172.20.70.203:5432/pg_auto_failover?sslmode=require' --replication-quorum false

-- Keeper log on node_1:
10:04:46 47615 INFO  Using default --ssl-mode "require"
10:04:46 47615 INFO  Using --ssl-self-signed: pg_autoctl will create self-signed certificates, allowing for encrypted network traffic
10:04:46 47615 WARN  Self-signed certificates provide protection against eavesdropping; this setup does NOT protect against Man-In-The-Middle attacks nor Impersonation attacks.
10:04:46 47615 WARN  See https://www.postgresql.org/docs/current/libpq-ssl.html for details
10:04:46 47615 INFO  Started pg_autoctl postgres service with pid 47618
10:04:46 47618 INFO  /usr/pgsql-12/bin/pg_autoctl do service postgres --pgdata /data/pg_data/ -v
10:04:46 47615 INFO  Started pg_autoctl node-init service with pid 47619
10:04:46 47619 INFO  Registered node 9 (172.20.70.204:5432) with name "node_1" in formation "default", group 0, state "wait_standby"
10:04:46 47619 INFO  Writing keeper state file at "/home/postgres/.local/share/pg_autoctl/data/pg_data/pg_autoctl.state"
10:04:46 47619 INFO  Writing keeper init state file at "/home/postgres/.local/share/pg_autoctl/data/pg_data/pg_autoctl.init"
10:04:46 47619 INFO  Successfully registered as "wait_standby" to the monitor.
10:04:46 47619 INFO  FSM transition from "init" to "wait_standby": Start following a primary
10:04:46 47619 INFO  Transition complete: current state is now "wait_standby"
10:04:46 47619 INFO  New state for node 8 "node_8" (172.20.70.205:5432): single ➜ wait_primary
10:04:46 47619 INFO  New state for node 8 "node_8" (172.20.70.205:5432): wait_primary ➜ wait_primary
10:04:46 47619 INFO  Still waiting for the monitor to drive us to state "catchingup"
10:04:46 47619 WARN  Please make sure that the primary node is currently running `pg_autoctl run` and contacting the monitor.
10:04:46 47619 INFO  FSM transition from "wait_standby" to "catchingup": The primary is now ready to accept a standby
10:04:46 47619 INFO  Initialising PostgreSQL as a hot standby
10:04:46 47619 INFO  /usr/pgsql-12/bin/pg_basebackup -w -d application_name=pgautofailover_standby_9 host=172.20.70.205 port=5432 user=pgautofailover_replicator sslmode=require --pgdata /data/backup/node_1 -U pgautofailover_replicator --verbose --progress --max-rate 100M --wal-method=stream --slot pgautofailover_standby_9
10:04:46 47619 INFO  pg_basebackup: initiating base backup, waiting for checkpoint to complete
10:04:46 47619 INFO  pg_basebackup: checkpoint completed
10:04:46 47619 INFO  pg_basebackup: write-ahead log start point: 0/C000028 on timeline 12
10:04:46 47619 INFO  pg_basebackup: starting background WAL receiver
10:04:46 47619 INFO      0/33603 kB (0%), 0/1 tablespace (/data/backup/node_1/backup_label )
10:04:46 47619 INFO  33612/33612 kB (100%), 0/1 tablespace (.../backup/node_1/global/pg_control)
10:04:46 47619 INFO  33612/33612 kB (100%), 1/1 tablespace
10:04:46 47619 INFO  pg_basebackup: write-ahead log end point: 0/C000100
10:04:46 47619 INFO  pg_basebackup: waiting for background process to finish streaming ...
10:04:46 47619 INFO  pg_basebackup: syncing data to disk ...
10:04:46 47619 INFO  pg_basebackup: base backup completed
10:04:46 47619 INFO  Creating the standby signal file at "/data/pg_data/standby.signal", and replication setup at "/data/pg_data/postgresql-auto-failover-standby.conf"
10:04:46 47619 INFO  Contents of "/data/pg_data/postgresql-auto-failover-standby.conf" have changed, overwriting
10:04:46 47619 INFO  /bin/openssl req -new -x509 -days 365 -nodes -text -out /data/pg_data/server.crt -keyout /data/pg_data/server.key -subj "/CN=172.20.70.204"
10:04:47 47627 INFO  /usr/pgsql-12/bin/postgres -D /data/pg_data -p 5432 -h *
10:04:47 47618 INFO  Postgres is now serving PGDATA "/data/pg_data" on port 5432 with pid 47627
10:04:47 47619 INFO  PostgreSQL started on port 5432
10:04:47 47619 INFO  Fetched current list of 1 other nodes from the monitor to update HBA rules, including 1 changes.
10:04:47 47619 INFO  Ensuring HBA rules for node 8 "node_8" (172.20.70.205:5432)
10:04:47 47619 INFO  Transition complete: current state is now "catchingup"
10:04:47 47619 INFO  keeper has been successfully initialized.
10:04:47 47615 WARN  pg_autoctl service node-init exited with exit status 0
10:04:47 47618 INFO  Postgres controller service received signal SIGTERM, terminating
10:04:47 47618 INFO  Stopping pg_autoctl postgres service
10:04:47 47618 INFO  /usr/pgsql-12/bin/pg_ctl --pgdata /data/pg_data --wait stop --mode fast
10:04:47 47615 INFO  Stop pg_autoctl

-- The monitor records the new node joining the cluster:
[postgres@monitor pg_data]$
10:04:46 51157 INFO  Registering node 9 "node_1" (172.20.70.204:5432) to formation "default" with replication quorum false and candidate priority 50 [50]
10:04:46 51157 INFO  New state is reported by node 9 "node_1" (172.20.70.204:5432): "wait_standby"
10:04:46 51157 INFO  New state for node 9 "node_1" (172.20.70.204:5432): wait_standby ➜ wait_standby
10:04:46 51157 INFO  Setting goal state of node 8 "node_8" (172.20.70.205:5432) to wait_primary after node 9 "node_1" (172.20.70.204:5432) joined.
10:04:46 51157 INFO  New state for node 8 "node_8" (172.20.70.205:5432): single ➜ wait_primary
10:04:46 51157 INFO  New state is reported by node 8 "node_8" (172.20.70.205:5432): "wait_primary"
10:04:46 51157 INFO  New state for node 8 "node_8" (172.20.70.205:5432): wait_primary ➜ wait_primary
10:04:46 51157 INFO  Setting goal state of node 9 "node_1" (172.20.70.204:5432) to catchingup after node 8 "node_8" (172.20.70.205:5432) converged to wait_primary.
10:04:46 51157 INFO  New state for node 9 "node_1" (172.20.70.204:5432): wait_standby ➜ catchingup
10:04:47 51157 INFO  New state is reported by node 9 "node_1" (172.20.70.204:5432): "catchingup"
10:04:47 51157 INFO  New state for node 9 "node_1" (172.20.70.204:5432): catchingup ➜ catchingup
Finally, start the keeper on node_1:

/usr/pgsql-12/bin/pg_autoctl run &

-- The monitor shows the new standby converging: once node_1 catches up it becomes
-- secondary, and node_8 moves from wait_primary to primary:
[postgres@monitor pg_data]$
10:06:23 51157 INFO  Node 9 (172.20.70.204:5432) is marked as unhealthy by the monitor
10:06:23 51157 INFO  Node 9 (172.20.70.204:5432) is marked as healthy by the monitor
10:06:23 51157 INFO  Setting goal state of node 9 "node_1" (172.20.70.204:5432) to secondary after it caught up.
10:06:23 51157 INFO  New state for node 9 "node_1" (172.20.70.204:5432): catchingup ➜ secondary
10:06:23 51157 INFO  New state is reported by node 9 "node_1" (172.20.70.204:5432): "secondary"
10:06:23 51157 INFO  New state for node 9 "node_1" (172.20.70.204:5432): secondary ➜ secondary
10:06:23 51157 INFO  Setting goal state of node 8 "node_8" (172.20.70.205:5432) to primary now that at least one secondary candidate node is healthy.
10:06:23 51157 INFO  New state for node 8 "node_8" (172.20.70.205:5432): wait_primary ➜ primary
10:06:23 51157 INFO  New state is reported by node 8 "node_8" (172.20.70.205:5432): "primary"
10:06:23 51157 INFO  New state for node 8 "node_8" (172.20.70.205:5432): primary ➜ primary
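Starting the keeper with a trailing & is fine for a demo; for anything long-lived, pg_auto_failover can emit a systemd unit so the keeper restarts with the host. A sketch following the pattern from the pg_auto_failover documentation (unit name and paths are illustrative):

# Generate the unit file from pg_autoctl itself, then install and enable it
pg_autoctl -q show systemd --pgdata /data/pg_data > /tmp/pgautofailover.service
sudo mv /tmp/pgautofailover.service /etc/systemd/system/pgautofailover.service
sudo systemctl daemon-reload
sudo systemctl enable --now pgautofailover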
[postgres@monitor pg_data]$ pg_autoctl show state
  Name |  Node |     Host:Port      |    LSN    | Reachable |  Current State  | Assigned State
-------+-------+--------------------+-----------+-----------+-----------------+----------------
node_8 |     8 | 172.20.70.205:5432 | 0/D000060 |       yes |         primary |        primary
node_1 |     9 | 172.20.70.204:5432 | 0/D000060 |       yes |       secondary |      secondary

-- the old standby node8 is now the primary of the new cluster
-- the old primary node1 (with a re-initialized data directory) has rejoined as the secondary
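As a last sanity check on the rebuilt node_1: it should be in recovery and streaming from node_8, and the table t created during the split-brain detour should be gone, since the data directory was re-cloned from node_8, which never had it. A sketch using standard views, run on node_1:

select pg_is_in_recovery();                 -- expect: t
select status, sender_host, sender_port
  from pg_stat_wal_receiver;                -- expect: streaming from 172.20.70.205
select to_regclass('public.t');             -- expect: empty (table no longer exists)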

PART3: Closing Words
Database high availability is key to ensuring business continuity, and just as indispensable are a stable, reliable HA solution and mastery of the correct maintenance "posture". If this article helped or inspired you, feel free to add me as a friend, share, and bookmark; here's to traveling the road together in 2024!









