


问:pg_rewind使用场景是什么?
答:基于流复制搭建的HA环境,若出现网络或硬件故障时,standby节点升级为master节点,但旧的master节点数据库并未损坏,在排除故障后,旧master节点无需通过备份方式重新搭建成standby节点,pg_rewind可以协助我们完成这个操作。
一、pg_rewind原理:
二、实验测试:
环境简介:WINDOWS
| 主节点 | 192.168.56.1 | D:\software\postgres13\data | pg 12.5 |
| 从节点 | 192.168.56.15 | C:\pg13\data\pg_wal | pg 12.5 |
实验模拟:备库升级为主库,旧主库继续写数据,通过pg_rewind将旧master节点切换成standby节点。
1. 从节点切换成主节点(可读写)
pg_ctl promote -D pgdata/data# 查看节点是否可读写postgres=# select pg_is_in_recovery();pg_is_in_recovery-------------------f(1 行记录)
观察pg_wal目录下生成了时间线history文件

2. 旧master节点继续写数据
模拟数据pgbench -i -Upostgrespgbench -M prepared -v -r -P 1 -c 4 -j 4 -T 120 -U postgresPassword:starting vacuum...end.starting vacuum pgbench_accounts...end.progress: 1.0 s, 1363.5 tps, lat 2.122 ms stddev 3.277progress: 2.0 s, 1290.2 tps, lat 3.098 ms stddev 0.676progress: 24.0 s, 945.0 tps, lat 4.236 ms stddev 1.339...
3. 旧master降级为standby节点
# 注意:旧master时间线是从3开始的,而新的master时间线是从4开始的# 此时,需要pg_rewind去拉齐时间线,以完成主从同步。1. wal_log_hints设置为on,重启生效2. 关闭服务器3. 执行pg_rewind拉齐时间线
# 可以用调试模式查看执行过程pg_rewind --target-pgdata=D:\software\postgres13\data --source-server="host=192.168.56.15 port=5432 user=postgres password=oracle" --progress --debug -npg_rewind: 已连接服务器pg_rewind: fetched file "global/pg_control", length 8192pg_rewind: fetched file "pg_wal/00000004.history", length 126pg_rewind: Source timeline history:pg_rewind: Target timeline history:pg_rewind: 1: 0/0 - 0/50001C0pg_rewind: 2: 0/50001C0 - 0/50128E0pg_rewind: 3: 0/50128E0 - 0/0pg_rewind: 服务器在时间线3上的WAL位置0/1AC0F638处发生了分歧pg_rewind: 从时间线3上0/1AC0F588处的最后一个普通检查点倒带pg_rewind: 读取源文件列表pg_rewind: entry "backup_manifest" excluded from source file listpg_rewind: entry "postmaster.opts" excluded from source file listpg_rewind: entry "postmaster.pid" excluded from source file listpg_rewind: entry "global/pg_internal.init" excluded from source file listpg_rewind: entry "pg_stat_tmp/db_0.stat" excluded from source file listpg_rewind: entry "pg_stat_tmp/db_13442.stat" excluded from source file listpg_rewind: entry "pg_stat_tmp/global.stat" excluded from source file listpg_rewind: entry "pg_subtrans/0000" excluded from source file listpg_rewind: entry "base/13442/pg_internal.init" excluded from source file listpg_rewind: 读取目标文件列表pg_rewind: 读取目标中的WALpg_rewind: backup_label.old (COPY)pg_rewind: base/1/1247_fsm (COPY)pg_rewind: base/1/1247_vm (COPY)pg_rewind: base/1/1249_fsm (COPY)...pg_rewind: received chunk for file "base/13442/41018", offset 1064960, size 8192已复制157175/157175 kB (100%)pg_rewind: 正在创建备份标签并且更新控制文件pg_rewind: 正在同步目标数据目录pg_rewind: 完成!# 上边是调试执行,下边是真正执行rewind操作pg_rewind --target-pgdata=D:\software\postgres13\data --source-server="host=192.168.56.15 port=5432 user=postgres password=oracle"pg_rewind: 服务器在时间线3上的WAL位置0/1AC0F638处发生了分歧pg_rewind: 从时间线3上0/1AC0F588处的最后一个普通检查点倒带pg_rewind: 完成!
4. 启动新的从节点
1. 修改postgresql.auto.conf配置文件primary_conninfo = 'user=repl password=oracle channel_binding=prefer host=192.168.56.15 port=5432 sslmode=prefer sslcompression=0 ssl_min_protocol_version=TLSv1.2 gssencmode=disable krbsrvname=postgres target_session_attrs=any'2. 创建standby.signal文件3. 启动主节点
5. 确认新主从关系
# 新主库postgres=# \x扩展显示已打开.postgres=# select * from pg_stat_replication;-[ RECORD 1 ]----+------------------------------pid | 2936usesysid | 16394usename | replapplication_name | walreceiverclient_addr | 192.168.56.1client_hostname |client_port | 6943backend_start | 2022-12-02 17:44:09.989984+08backend_xmin | 47703state | streamingsent_lsn | 0/1D191C08write_lsn | 0/1D191C08flush_lsn | 0/1D191C08replay_lsn | 0/1D191C08write_lag |flush_lag |replay_lag |sync_priority | 0sync_state | asyncreply_time | 2022-12-02 17:57:56.491876+08# 新从库postgres=# \x扩展显示已打开.postgres=# select * from pg_stat_wal_receiver;-[ RECORD 1 ]---------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------pid | 22012status | streamingreceive_start_lsn | 0/1D000000receive_start_tli | 4written_lsn | 0/1D191B20flushed_lsn | 0/1D191B20received_tli | 4last_msg_send_time | 2022-12-02 17:44:10.023871+08last_msg_receipt_time | 2022-12-02 17:44:11.445336+08latest_end_lsn | 0/1D191B20latest_end_time | 2022-12-02 17:44:10.000481+08slot_name |sender_host | 192.168.56.15sender_port | 5432conninfo | user=repl password=******** channel_binding=prefer dbname=replication host=192.168.56.15 port=5432 fallback_application_name=walreceiver sslmode=prefer sslcompression=0 ssl_min_protocol_version=TLSv1.2 gssencmode=disable krbsrvname=postgres target_session_attrs=any
文章转载自小陈的技术博客,如果涉嫌侵权,请发送邮件至:contact@modb.pro进行举报,并提供相关证据,一经查实,墨天轮将立刻删除相关内容。




