Postgresql高可用之repmgr+keepalived+流复制架构搭建

IT那活儿 2022-10-29

3176

点击上方“IT那活儿”公众号，关注后了解更多内容，不管IT什么活儿，干就完了！！！

repmgr是一个对postgresql流复制进行管理以及自动故障转移的开源软件，大大的简化了PG流复制架构的管理。但是repmgr不具备提供VIP功能，只能借助keepalived实现VIP，并确保VIP运行在primary节点上。其不具备连接池功能，所以只是一个轻量级开源软件。下面详细介绍该架构的详细搭建过程。

repmgr软件的安装

1. 先安装repmgr依赖的软件包，官方推荐使用yum安装如下组件和rpm包.

yumcheck-update
yumgroupinstall "Development Tools"
yuminstall yum-utils openjade docbook-dtds docbook-style-dsssldocbook-style-xsl
yuminstall flex libselinux-devel libxml2-devel libxslt-developenssl-devel pam-devel readline-devel

2. 下载repmgr并安装，当前的版本为5.2.1.

./configure&& make install

注意，确保pg_config在安装用户的环境变量PATH中，repmgr会默认安装到postgres的软件目录下。

PG数据库及repmgr配置

首先完成流复制相关的参数修改和用户创建，并修改pg_hba.conf文件.

createuser repluser with usperuser password '****';
createdatabase repmgr with owner='repluser';

--修改主备节点的pg_hba.conf文件.

local   replication repluser trust
host replication repluser 127.0.0.1/32            trust
host replication repluser 10.26.60.0/24          scram-sha-256

local repmgr repluser trust
host repmgr repluser 127.0.0.1/32            trust
host repmgr repluser 10.26.60.0/24          scram-sha-256

--修改.pgpass：

spcl-pg250:5432:repmgr:repluser:Repl#2021
spcl-pg249:5432:repmgr:repluser:Repl#2021

--修改/etc/repmgr.conf：

node_id=1
node_name='host01'
conninfo='host=host01user=repluser dbname=repmgr connect_timeout=2'
data_directory='/pgdata’

--注册主节点：

repmgr-f /etc/repmgr.conf primary register

--验证集群状态：

repmgr-f etc/repmgr.conf cluster show

--创建从库的/etc/repmgr.conf文件：

node_id=2
node_name='host02'
conninfo='host=host02user=repluser dbname=repmgr connect_timeout=2'
data_directory='/pgdata'

--进行从库搭建前的检测：

[postgres@host02pgdata]$ repmgr -h host01 -U repluser -d repmgr -f/pgdata/repmgr.conf standby clone --dry-run
NOTICE:destination directory "/pgdata" provided
INFO:connecting to source node
DETAIL:connection string is: host=host01 user=repluser dbname=repmgr
DETAIL:current installation size is 31 MB
INFO:"repmgr" extension is installed in database "repmgr"
INFO:replication slot usage not requested; no replication slot will beset up for this standby
INFO:parameter "max_wal_senders" set to 32
NOTICE:checking for available walsenders on the source node (2 required)
INFO:sufficient walsenders available on the source node
DETAIL:2 required, 31 available
NOTICE:checking replication connections can be made to the source server (2required)
INFO:required number of replication connections could be made to thesource server
DETAIL:2 replication connections required
WARNING:data checksums are not enabled and "wal_log_hints" is "off"
DETAIL:pg_rewind requires "wal_log_hints" to be enabled
NOTICE:standby will attach to upstream node 1
HINT:consider using the -c/--fast-checkpoint option
INFO:all prerequisites for "standby clone" are met

--执行以下命令完成从库搭建：

repmgr-h host01 -U repluser -d repmgr -f /etc/repmgr.conf standby clone

注意：如果数据量很大，则在搭建从库时需要加上--fast-checkpoint参数，否则调用pg_basebackup备份的过程会非常慢。

---如果不是使用repmgr完成的流复制搭建，则配置连接串时，需要指定application_name

repmgr-f /etc/repmgr.conf standby register
repmgr-f /etc/repmgr.conf cluster show

至此，repmgr就搭建及配置完成。

利用repmgr进行主从切换

--直接进行主从切换：

repmgrstandby switchover -f /etc/repmgr.conf --siblings-follow --dry-run --切换检查
repmgr-f  /etc/repmgr.conf standby switchover

--主库down掉后，从库手动切换成主库：

repmgr-f etc/repmgr.conf standby promote ---备节点
repmgr-f /etc/repmgr.conf standby follow --如果有多个备节点，在其他备节点执行

配置自动failover

--修改/etc/repmgr.conf：

shared_preload_libraries = 'repmgr'

monitor_interval_secs=2
connection_check_type=connection
reconnect_attempts==6
reconnect_interval=4
failover=automatic
promote_command='/usr/local/postgresql/bin/repmgr standby promote -f /etc/repmgr.conf --log-to-file'
follow_command='/usr/local/postgresql/bin/repmgr standby follow -f /etc/repmgr.conf --log-to-file --upstream-node-id=%n'
log_level='INFO'
log_facility='STDERR'
log_file='/data/pgdata/log/repmgr.log'
log_status_interval=300
repmgrd_service_start_command = 'repmgrd --daemonize=true'
repmgrd_service_stop_command = 'kill `cat /data/pgdata/repmgrd.pid`'
repmgrd_pid_file=' /pgdata/repmgrd.pid'

--启动守护进程：

repmgrdaemon start

--自动failover之后需手工将原主节点恢复成备节点，然后执行repmgr-f /etc/repmgr.conf standby register --force ，否则下次不会自动failover。

keepalived配置

keepalived在此架构中，只作为提供VIP的工具，不进行故障转移操作，所以配置相对简单，其配置如下：

/etc/keepalived/keepalived.conf

global_defs {
    router_id pg_ha 
# enable_script_security
}


vrrp_script checkpg {
    script "/etc/keepalived/scripts/checkpg.sh"
    interval 15
    fall 3
    rise 1
}
vrrp_instance VI_pgusdp {
    state BACKUP 
    interface ens160                 
    virtual_router_id 152 
    priority 80 
    advert_int 1
    nopreempt                       
    authentication { 
        auth_type PASS
        auth_pass 234235
    }
    track_script {
        checkpg
    }
    notify_master "/etc/keepalived/scripts/master.sh"
    notify_backup "/etc/keepalived/scripts/slave.sh"
    virtual_ipaddress {
        10.**.**.**/24
    }
}

/etc/keepalived/scripts/checkpg.sh

#!/bin/bash
export PGDATABASE=postgres
export PGPORT=5432
export PGUSER=postgres
export PGBIN=/usr/local/postgresql/bin
export PGDATA=/data/pgdata
LOGFILE=/etc/keepalived/log/keepalived.log
nc -w 3 localhost 5432 </dev/null
a=`echo $?`
if [ $a -eq 1 ] ;then
exit 1
else
        SQL1='SELECT pg_is_in_recovery from pg_is_in_recovery();'
        db_role=`echo $SQL1  | ${PGBIN}/psql -d $PGDATABASE -U $PGUSER -h $PGDATA -At -w`
        if [ $db_role == 't' ];then
                exit 1
        fi
fi

/etc/keepalived/scripts/master.sh

LOGFILE=/etc/keepalived/log/keepalived.log
export PGDATABASE=postgres
export PGPORT=5432
export PGUSER=postgres
export PGBIN=/usr/local/postgresql/bin
export PGDATA=/data/pgdata
LOGFILE=/etc/keepalived/log/keepalived.log
SQL1='SELECT pg_is_in_recovery from pg_is_in_recovery();'
db_role=`echo $SQL1  | ${PGBIN}/psql -d $PGDATABASE -U $PGUSER -h $PGDATA -At -w`
if [ $db_role == 't' ];then
   echo -e `date +"%F %T"` "the current database is standby DB! " >> $LOGFILE
  exit 1
else
   echo -e `date +"%F %T"` "the current database is master DB!" >> $LOGFILE
fi

/etc/keepalived/scripts/slave.sh

LOGFILE=/etc/keepalived/log/keepalived.log
export PGDATABASE=postgres
export PGPORT=5432
export PGUSER=postgres
export PGBIN=/usr/local/postgresql/bin
export PGDATA=/data/pgdata
LOGFILE=/etc/keepalived/log/keepalived.log
SQL1='SELECT pg_is_in_recovery from pg_is_in_recovery();'
 db_role=`echo $SQL1  | ${PGBIN}/psql -d $PGDATABASE -U $PGUSER -h $PGDATA -At -w`
 if [ $db_role == 't' ];then
   echo -e `date +"%F %T"` "the current database is standby DB! " >> $LOGFILE
else
   echo -e `date +"%F %T"` "the current database is master DB!" >> $LOGFILE
fi

本文作者：刘运彬（上海新炬王翦团队）

本文来源：“IT那活儿”公众号

文章转载自IT那活儿，如果涉嫌侵权，请发送邮件至：contact@modb.pro进行举报，并提供相关证据，一经查实，墨天轮将立刻删除相关内容。