Hello everyone! Today I'd like to share how to build a remote standby database instance with pgBackRest.
A quick bit of background: the primary data center currently runs a three-node PostgreSQL cluster, one primary plus two standbys, and each instance is around 10TB in size.
We now need to build a standby instance in a remote DC (roughly half of China away, so network speed is a huge bottleneck).
In our existing Ansible playbooks for database installation, when the database is small we create standbys with repmgr's standby clone:
- name: "build DR standby database -- create dr standby databases"
  shell: "{{PG_HOME}}/bin/repmgr -h {{PG_VIP}} -U repmgr -p {{PORT}} -d {{REPMGR_DB}} -f {{PG_BASE_DIR}}/{{PG_REPMGR_DIR}}/repmgr.conf standby clone -W"
  environment:
    PGPASSWORD: "{{REPMGR_PASSWORD}}"
  when: location == 'dr'
  tags: install_dr
This time, however, the database is mid-to-large at roughly 10TB, and the cross-DC dedicated line is fairly weak (a few MB per second). repmgr's cloning is just a wrapper around PostgreSQL's built-in pg_basebackup command, which does not support parallelism.
As it happens, our backups are already based on pgBackRest, which does support parallel processing, so we decided to try building the remote standby with pgBackRest.
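Before diving in, here is the overall flow in a nutshell (a sketch using the stanza name from the test environment below; each step is shown with its full context later in the post):
# On the primary and on the repository host: register and verify the stanza
pgbackrest --stanza=prod6600 stanza-create
pgbackrest --stanza=prod6600 check
# On the repository host: take a parallel, compressed full backup over the wire
pgbackrest --stanza=prod6600 backup --type=full
# On the future standby host: restore the backup in standby mode
pgbackrest --stanza=prod6600 --type=standby restore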
The architecture is shown below:

Let's look at the pgBackRest official documentation for standby support: https://pgbackrest.org/user-guide-rhel.html#replication/hot-standby
We can see that pgBackRest provides a --type=standby restore mode.

With the method confirmed, let's actually walk through it (since the 10TB database holds production data, a test environment is used here).
Primary: 10.25.15.85
postgres=# select * from pg_stat_replication;
pid | usesysid | usename | application_name | client_addr | client_hostname | client_port | backend_start | backend_xmin | state | sent_lsn | write_lsn | flush_lsn | replay_lsn | write_lag | flush_lag | replay_lag | sync_priority | sync_state | reply_time
-------+----------+---------+------------------+--------------+-----------------+-------------+-------------------------------+--------------+-----------+-----------+-----------+-----------+------------+-----------------+-----------------+-----------------+---------------+------------+-------------------------------
2931 | 16384 | repmgr | pgdr84 | 10.25.15.84 | | 46660 | 2024-04-26 16:45:43.505828+08 | | streaming | 0/E70F770 | 0/E70F770 | 0/E70F770 | 0/E70F770 | 00:00:00.0002 | 00:00:00.000955 | 00:00:00.001 | 1 | quorum | 2024-04-29 15:30:07.199944+08
8426 | 16384 | repmgr | pgdr83 | 10.25.15.83 | | 51030 | 2024-04-26 16:54:11.405751+08 | | streaming | 0/E70F770 | 0/E70F770 | 0/E70F770 | 0/E70F770 | 00:00:00.000105 | 00:00:00.000754 | 00:00:00.000817 | 1 | quorum | 2024-04-29 15:30:07.199763+08
(3 rows)
Now let's take a remote backup, storing it on 171.
We need to add the stanza settings to the pgbackrest config file on 171 (essentially registering the remote database host: pg1-host=whdrcsrv404.cn.nonprod; this line has to be removed again at restore time).
whdrcsrv404.cn.nonprod is our pgBackRest repository machine, and also the remote standby machine.
[prod6600]
pg1-host=whdrcsrv404.cn.nonprod
pg1-path=/data/prod6600/data
pg1-port=6600
pg1-socket-path=/tmp
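The repository host's [global] section also needs to point at the local repository path. Ours can be reconstructed from the option list that pgBackRest echoes in the archive-get and restore logs later in this post, so treat the following as a sketch rather than a verbatim copy of the file:
[global]
repo1-path=/data/backup
process-max=4
compress-type=lz4
archive-async=y
spool-path=/var/spool/pgbackrest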
Configure the pgbackrest.conf file on the primary, 85.
The original local repository path (repo1-path=/backup) must be commented out, and repo1-host set to ljzdccapp007.cn.infra, our remote backup repository server, which is also the server for the future standby database node:
repo1-host=ljzdccapp007.cn.infra
#repo1-path=/backup
[prod6600]
pg1-path=/data/prod6600/data
pg1-port=6600
pg1-socket-path=/tmp
On the primary (85), change the archive_command parameter:
postgres=# alter system set archive_command = 'pgbackrest --stanza=prod6600 archive-push %p';
ALTER SYSTEM
postgres=# select pg_reload_conf();
pg_reload_conf
----------------
t
(1 row)
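Optionally, confirm the reload took effect before moving on (a quick sanity check of my own; the pgbackrest check command in the next step verifies archiving end to end):
psql -p 6600 -c "show archive_command;"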
Run the following commands on both the primary (85) and the standby/backup repository (171):
pgbackrest --stanza=prod6600 --log-level-console=info stanza-create
pgbackrest --stanza=prod6600 --log-level-console=info check
On the standby/backup repository (171), run the remote backup command:
nohup /usr/bin/pgbackrest --stanza=prod6600 --log-level-console=debug backup --type=full > backup_full.log &
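Since the backup runs for many hours, it is worth keeping an eye on it; two simple options (my own habit, nothing pgBackRest-specific beyond the log file redirected above):
tail -f backup_full.log        # follow the backup's debug output
ps -ef | grep pgbackrest       # the parallel worker processes show up here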
During the backup we can see multiple processes working in parallel:

This is because we set the following parameter for the backup:
# parallelize backup
process-max=4
After the backup finished, thanks to the compression setting compress-type=lz4, the full backup set came out at roughly 3.5TB (database size: 10584.7GB => backup size: 3581.3GB).
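In other words, lz4 delivered roughly a 3:1 compression ratio on this dataset: 10584.7GB / 3581.3GB ≈ 2.96.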
With 16 parallel processes, the 10TB of data took about 36 hours: 2024-05-01 21:05:47+08 - 2024-04-30 08:56:19+08 ≈ 36 hours.
PROD [postgres@tjdcpsql003 ~]# pgbackrest --stanza=prod5007 info
stanza: prod5007
    status: ok
    cipher: none

    db (current)
        wal archive min/max (16): 0000000E0000331C00000069/0000000E0000331C0000007D

        full backup: 20240429-104355F
            timestamp start/stop: 2024-04-30 08:56:19+08 / 2024-05-01 21:05:47+08
            wal start/stop: 0000000E0000331C00000070 / 0000000E0000331C00000073
            database size: 10584.7GB, database backup size: 10584.7GB
            repo1: backup set size: 3581.3GB, backup size: 3581.3GB
Once the backup is complete, we can restore the standby database.
With 16 parallel processes the restore took about 1.5 hours: restore command end: completed successfully (5550929ms)
pgbackrest --stanza=prod6600 --type=standby restore
2024-05-03 23:55:11.277 P00 DEBUG: storage/storage::storageMove: => void
2024-05-03 23:55:11.277 P00 DETAIL: sync path '/data/prod5007/data/global'
2024-05-03 23:55:11.277 P00 DEBUG: storage/storage::storagePathSync: (this: {type: posix, path: /data/prod5007/data, write: true}, pathExp: {"global"})
2024-05-03 23:55:11.278 P00 DEBUG: storage/storage::storagePathSync: => void
2024-05-03 23:55:11.278 P00 INFO: restore size = 10584.7GB, file total = 43390
2024-05-03 23:55:11.281 P00 DEBUG: command/restore/restore::cmdRestore: => void
2024-05-03 23:55:11.281 P00 DEBUG: command/exit::exitSafe: (result: 0, error: false, signalType: 0)
2024-05-03 23:55:11.282 P00 INFO: restore command end: completed successfully (5550929ms)
2024-05-03 23:55:11.282 P00 DEBUG: command/exit::exitSafe: => 0
2024-05-03 23:55:11.282 P00 DEBUG: main::main: => 0
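Two notes on the invocation: --type=standby also writes a standby.signal file into the data directory so the instance comes up as a standby, and the parallelism again comes from process-max in the config, which can be overridden per command. If a long restore is interrupted, --delta re-syncs against the existing data directory instead of starting from scratch (both are standard pgBackRest options; the values below are just this exercise's):
pgbackrest --stanza=prod6600 --type=standby --process-max=16 restore
pgbackrest --stanza=prod6600 --type=standby --delta restore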
If the restore fails with: ERROR: [072]: restore command must be run on the PostgreSQL host,
comment out pg1-host=whdrcsrv404.cn.nonprod (the restore can only run locally on the PostgreSQL host):
[prod6600]
#pg1-host=whdrcsrv404.cn.nonprod
pg1-path=/data/prod6600/data
pg1-port=6600
pg1-socket-path=/tmp
Now let's look at the restored database's postgresql.auto.conf.
We can see that pgBackRest automatically appended a restore_command:
INFRA [postgres@ljzdccapp007 data]# view postgresql.auto.conf
# Do not edit this file manually!
# It will be overwritten by the ALTER SYSTEM command.
synchronous_standby_names = 'ANY 1 (pgdr83,pgdr84)'
primary_conninfo = 'user=repmgr connect_timeout=2 host=10.25.15.83 port=6600 application_name=pgdr85'
wal_retrieve_retry_interval = '5000'
archive_command = 'pgbackrest --stanza=prod6600 archive-push %p'
# Recovery settings generated by pgBackRest restore on 2024-04-29 17:00:26
restore_command = 'pgbackrest --stanza=prod6600 archive-get %f "%p"'
We need to make a couple of changes:
1) Delete the synchronous replication setting: synchronous_standby_names = 'ANY 1 (pgdr83,pgdr84)'
2) In primary_conninfo = 'user=repmgr connect_timeout=2 host=10.25.15.83 port=6600 application_name=pgdr85', change application_name=pgdr85 to application_name=pg171
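After those edits, the relevant lines of postgresql.auto.conf should look roughly like this (my reconstruction of the end state, not a capture from the machine):
primary_conninfo = 'user=repmgr connect_timeout=2 host=10.25.15.83 port=6600 application_name=pg171'
wal_retrieve_retry_interval = '5000'
archive_command = 'pgbackrest --stanza=prod6600 archive-push %p'
# Recovery settings generated by pgBackRest restore on 2024-04-29 17:00:26
restore_command = 'pgbackrest --stanza=prod6600 archive-get %f "%p"'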
Now start the database:
/opt/pgsql-16/bin/pg_ctl start -D /data/prod6600/data
Check the database log:
2024-04-29 17:06:15.418 P00 INFO: archive-get command begin 2.50: [000000040000000000000011, pg_wal/RECOVERYXLOG] --archive-async --buffer-size=8MiB --exec-id=86762-42ef2b44 --log-level-console=info --pg1-path=/data/prod6600/data --process-max=2 --repo1-cipher-type=none --repo1-path=/data/backup --spool-path=/var/spool/pgbackrest --stanza=prod6600
2024-04-29 17:06:15.520 P00 INFO: unable to find 000000040000000000000011 in the archive asynchronously
2024-04-29 17:06:15.520 P00 INFO: archive-get command end: completed successfully (103ms)
2024-04-29 17:06:15.524 CST [86754] LOG: consistent recovery state reached at 0/10002428
2024-04-29 17:06:15.524 CST [86746] LOG: database system is ready to accept read-only connections
2024-04-29 17:06:15.532 P00 INFO: archive-get command begin 2.50: [000000040000000000000011, pg_wal/RECOVERYXLOG] --archive-async --buffer-size=8MiB --exec-id=86765-f637e471 --log-level-console=info --pg1-path=/data/prod6600/data --process-max=2 --repo1-cipher-type=none --repo1-path=/data/backup --spool-path=/var/spool/pgbackrest --stanza=prod6600
2024-04-29 17:06:15.634 P00 INFO: unable to find 000000040000000000000011 in the archive asynchronously
2024-04-29 17:06:15.635 P00 INFO: archive-get command end: completed successfully (104ms)
2024-04-29 17:06:15.735 CST [86768] LOG: fetching timeline history file for timeline 4 from primary server
2024-04-29 17:06:15.783 CST [86768] LOG: started streaming WAL from primary at 0/11000000 on timeline 4
To prepare for future switchover between primary and standby, we comment out the restore_command (vi postgresql.auto.conf):
#restore_command = 'pgbackrest --stanza=prod6600 archive-get %f "%p"'
Restart the database:
/opt/pgsql-16/bin/pg_ctl restart -D /data/prod6600/data
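Before checking the primary, it is worth confirming from the standby's own side that it is in recovery and streaming (optional checks of mine; both the function and the view are stock PostgreSQL):
psql -p 6600 -c "select pg_is_in_recovery();"                            # should return t on a standby
psql -p 6600 -c "select status, sender_host from pg_stat_wal_receiver;"  # status should be 'streaming'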
Finally, observe the replication status on the primary:
postgres=# select * from pg_stat_replication;
pid | usesysid | usename | application_name | client_addr | client_hostname | client_port | backend_start | backend_xmin | state | sent_lsn | write_lsn | flush_lsn | replay_lsn | write_lag | flush_lag | replay_lag | sync_priority | sync_state | reply_time
-------+----------+---------+------------------+--------------+-----------------+-------------+-------------------------------+--------------+-----------+------------+------------+------------+------------+-----------------+-----------------+-----------------+---------------+------------+-------------------------------
14915 | 16384 | repmgr | pref170 | 10.67.38.170 | | 44162 | 2024-04-26 17:03:03.832707+08 | | streaming | 0/110C1350 | 0/110C1350 | 0/110C1350 | 0/110C1350 | 00:00:00.040213 | 00:00:00.04075 | 00:00:00.040803 | 0 | async | 2024-04-29 17:12:53.94536+08
16562 | 16384 | repmgr | pg171 | 10.67.38.171 | | 48006 | 2024-04-29 17:12:16.922485+08 | | streaming | 0/110C1350 | 0/110C1350 | 0/110C1350 | 0/110C1350 | 00:00:00.040164 | 00:00:00.040611 | 00:00:00.040667 | 0 | async | 2024-04-29 17:12:53.945272+08
2931 | 16384 | repmgr | pgdr84 | 10.25.15.84 | | 46660 | 2024-04-26 16:45:43.505828+08 | | streaming | 0/110C1350 | 0/110C1350 | 0/110C1350 | 0/110C1350 | 00:00:00.000154 | 00:00:00.001009 | 00:00:00.001129 | 1 | quorum | 2024-04-29 17:12:53.915553+08
8426 | 16384 | repmgr | pgdr83 | 10.25.15.83 | | 51030 | 2024-04-26 16:54:11.405751+08 | | streaming | 0/110C1350 | 0/110C1350 | 0/110C1350 | 0/110C1350 | 00:00:00.000168 | 00:00:00.0007 | 00:00:00.000784 | 1 | quorum | 2024-04-29 17:12:53.915222+08
(4 rows)
To wrap up:
1) There are two reasons to use pgBackRest to build a standby in a remote DC:
a) It supports parallelism and compression.
b) If the network drops during the backup, pgBackRest can resume from where it left off, saving a great deal of redo time.
2) If the network between the DCs is unstable, pgBackRest may fail with: Backup command end: terminated on signal [SIGHUP]
The workaround is described here: https://github.com/pgbackrest/pgbackrest/issues/2342 (I filed this issue myself, and the author responded very quickly. Big thumbs up to the maintainer!)
3) On an inter-DC link with 900MB of total bandwidth, backing up 10TB of data with 16 compression processes took about 36 hours; the compressed backup set came to roughly 3.5TB, and the local restore took about 1.5 hours.
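As a rough cross-check on those numbers (my own arithmetic, assuming the compressed stream is what crossed the wire): 3581.3GB moved in about 36 hours works out to 3581.3 x 1024MB / 129600s ≈ 28MB/s in aggregate, i.e. under 2MB/s per process across 16 processes, which lines up with the few-MB/s link speed mentioned at the start.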
Have fun 😃!




