概述
PG当前发行版本中,逻辑复制槽不会同步到备库,需要备份复制槽信息,当主库节点主机暂时不可访问时,可在新主库创建复制槽。
备份方案: 使用pg_read_binary_file读取复制槽文件state,并插入到数据库表中,需要使用时再copy导出到文件
测试过程
首先创建两个复制槽
postgres=# select pg_create_logical_replication_slot('test_slot','pgoutput');
pg_create_logical_replication_slot
------------------------------------
(test_slot,0/81019F8)
(1 row)
postgres=# select pg_create_logical_replication_slot('test_slot1','pgoutput');
pg_create_logical_replication_slot
------------------------------------
(test_slot1,0/8101A68)
(1 row)
postgres=# select * from pg_replication_slots ;
slot_name | plugin | slot_type | datoid | database | temporary | active | active_pid | xmin | catalog_xmin | restart_lsn | confirmed_flush_lsn | wal_status
| safe_wal_size | two_phase
------------+----------+-----------+--------+----------+-----------+--------+------------+------+--------------+-------------+---------------------+-----------
-+---------------+-----------
test_slot | pgoutput | logical | 5 | postgres | f | f | | | 35404 | 0/81019C0 | 0/81019F8 | reserved
| | f
test_slot1 | pgoutput | logical | 5 | postgres | f | f | | | 35404 | 0/8101A30 | 0/8101A68 | reserved
| | f
(2 rows)
使用备份函数备份复制槽二进制文件,函数脚本在结尾(可以设置定时任务,定时备份复制槽二进制文件到数据库中)
postgres=# select public.f_backup_pg_replication_slot_logical();
NOTICE: insert into backup_pg_replication_slot_logical (slot_name,state_file) select 'test_slot',* from pg_read_binary_file('pg_replslot/test_slot/state')
NOTICE: insert into backup_pg_replication_slot_logical (slot_name,state_file) select 'test_slot1',* from pg_read_binary_file('pg_replslot/test_slot1/state')
f_backup_pg_replication_slot_logical
--------------------------------------
2
(1 row)
postgres=# select * from backup_pg_replication_slot_logical ;
slot_name | time |
state_file
------------+----------------------------+---------------------------------------------------------------------------------------------------------------------
---------------------------------------------------------------------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------------------------------------
test_slot | 2024-04-16 15:00:10.436187 | \xa11c050140edacd802000000b8000000746573745f736c6f740000000000000000000000000000000000000000000000000000000000000000
00000000000000000000000000000000000000000000000500000000000000000000004c8a0000c0191008000000000000000000000000f81910080000000000000000000000000070676f757470757
4000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000
test_slot1 | 2024-04-16 15:00:10.436187 | \xa11c050150626f9e02000000b8000000746573745f736c6f743100000000000000000000000000000000000000000000000000000000000000
00000000000000000000000000000000000000000000000500000000000000000000004c8a0000301a1008000000000000000000000000681a10080000000000000000000000000070676f757470757
4000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000
(2 rows)
使用导出函数,将复制槽二进制文件导出,函数脚本在结尾(需要使用复制槽二进制文件时,用该函数导出)
postgres=# select public.f_copy_pg_replication_slot_logical();
NOTICE: copy (select state_file from public.backup_pg_replication_slot_logical where slot_name='test_slot') to '/tmp/test_slot.state' (format binary)
NOTICE: copy (select state_file from public.backup_pg_replication_slot_logical where slot_name='test_slot1') to '/tmp/test_slot1.state' (format binary)
f_copy_pg_replication_slot_logical
------------------------------------
t
(1 row)
[postgres@mydb1a tmp]$ ls -ltr *.state
-rw-r--r--. 1 postgres postgres 227 4月 16 15:00 test_slot.state
-rw-r--r--. 1 postgres postgres 227 4月 16 15:00 test_slot1.state
删除复制槽,删除前pg_replslot目录存在两个复制槽同名的目录
[postgres@mydb1a tmp]$ ls -ltr /app/pg15/data/pg_replslot/
总用量 8
drwx------. 2 postgres postgres 4096 4月 16 14:59 test_slot
drwx------. 2 postgres postgres 4096 4月 16 14:59 test_slot1
[postgres@mydb1a tmp]$ md5sum /app/pg15/data/pg_replslot/test_slot/state
8a8cdd6bf5f2a103505cc55b4b320ceb /app/pg15/data/pg_replslot/test_slot/state
[postgres@mydb1a tmp]$ psql
postgres=# select pg_drop_replication_slot('test_slot1');
pg_drop_replication_slot
--------------------------
(1 row)
postgres=# select pg_drop_replication_slot('test_slot');
pg_drop_replication_slot
--------------------------
(1 row)
postgres=# \q
[postgres@mydb1a tmp]$ ls -ltr /app/pg15/data/pg_replslot/
总用量 0
[postgres@mydb1a tmp]
使用dd命令删除文件头(25)和文件尾(2),删除前文件大小227,删除后文件大小200,且修改后文件与原文件MD5值相同
[postgres@mydb1a tmp]$ ls -ltr *.state
-rw-r--r--. 1 postgres postgres 227 4月 16 15:00 test_slot.state
-rw-r--r--. 1 postgres postgres 227 4月 16 15:00 test_slot1.state
[postgres@mydb1a tmp]$ dd if=test_slot.state of=test_slot.state.temp bs=1 skip=25
记录了202+0 的读入
记录了202+0 的写出
202 bytes copied, 0.0001899 s, 1.1 MB/s
[postgres@mydb1a tmp]$ dd if=test_slot.state.temp of=test_slot.state bs=1 count=$(($(wc -c <test_slot.state.temp) -2))
记录了200+0 的读入
记录了200+0 的写出
200 bytes copied, 0.00017754 s, 1.1 MB/s
[postgres@mydb1a tmp]$ dd if=test_slot1.state of=test_slot1.state.temp bs=1 skip=25
记录了202+0 的读入
记录了202+0 的写出
202 bytes copied, 0.000203277 s, 994 kB/s
[postgres@mydb1a tmp]$ dd if=test_slot1.state.temp of=test_slot1.state bs=1 count=$(($(wc -c <test_slot1.state.temp) -2))
记录了200+0 的读入
记录了200+0 的写出
200 bytes copied, 0.00016606 s, 1.2 MB/s
[postgres@mydb1a tmp]$ ls -ltr *.state
-rw-r--r--. 1 postgres postgres 200 4月 16 15:01 test_slot.state
-rw-r--r--. 1 postgres postgres 200 4月 16 15:02 test_slot1.state
[postgres@mydb1a tmp]$ md5sum test_slot.state
8a8cdd6bf5f2a103505cc55b4b320ceb test_slot.state
还原复制槽二进制文件,重启数据库检查复制槽恢复
[postgres@mydb1a tmp]$ mkdir -p /app/pg15/data/pg_replslot/test_slot
[postgres@mydb1a tmp]$ mkdir -p /app/pg15/data/pg_replslot/test_slot1
[postgres@mydb1a tmp]$ ls -ltr /app/pg15/data/pg_replslot/
总用量 8
drwx------. 2 postgres postgres 4096 4月 16 15:03 test_slot
drwx------. 2 postgres postgres 4096 4月 16 15:03 test_slot1
[postgres@mydb1a tmp]$ cp test_slot.state /app/pg15/data/pg_replslot/test_slot/state
[postgres@mydb1a tmp]$ cp test_slot1.state /app/pg15/data/pg_replslot/test_slot1/state
[postgres@mydb1a tmp]$ /app/pg15/bin/pg_ctl -D /app/pg15/data restart
waiting for server to shut down.... done
server stopped
waiting for server to start....2024-04-16 15:03:44.314 CST [1783133] LOG: 00000: redirecting log output to logging collector process
2024-04-16 15:03:44.314 CST [1783133] HINT: Future log output will appear in directory "pg_log".
2024-04-16 15:03:44.314 CST [1783133] LOCATION: SysLogger_Start, syslogger.c:715
done
server started
[postgres@mydb1a tmp]$ psql
psql (16.2, server 15.0)
Type "help" for help.
postgres=# select * from pg_replication_slots ;
slot_name | plugin | slot_type | datoid | database | temporary | active | active_pid | xmin | catalog_xmin | restart_lsn | confirmed_flush_lsn | wal_status
| safe_wal_size | two_phase
------------+----------+-----------+--------+----------+-----------+--------+------------+------+--------------+-------------+---------------------+-----------
-+---------------+-----------
test_slot | pgoutput | logical | 5 | postgres | f | f | | | 35404 | 0/81019C0 | 0/81019F8 | reserved
| | f
test_slot1 | pgoutput | logical | 5 | postgres | f | f | | | 35404 | 0/8101A30 | 0/8101A68 | reserved
| | f
(2 rows)
至此,复制槽还原完成
总结一下:
- 平时备份时,可以定时执行备份脚本,把复制槽信息备份到数据库中
- 本例使用单节点实例测试,一主一备实例主备切换后,在新主节点还原过程理论上是一样的
- 关于使用dd删除copy导出文件的文件头(25)和文件尾(2),这个25和2是怎么来的暂时没深挖,验证了pg12.7、pg15.0、pg16.2,copy二进制导出时,文件头和尾均是该长度
函数脚本
f_backup_pg_replication_slot_logical
CREATE OR REPLACE FUNCTION public.f_backup_pg_replication_slot_logical()
RETURNS int
LANGUAGE plpgsql
AS $function$
DECLARE
slot name;
sql varchar;
BEGIN
begin
truncate backup_pg_replication_slot_logical;
for slot in select slot_name from pg_replication_slots where slot_type='logical'
loop
sql := 'insert into backup_pg_replication_slot_logical (slot_name,state_file) select '''||slot||''',* from pg_read_binary_file(''pg_replslot/'||slot||'/state'')';
raise notice '%',sql;
execute sql;
end loop;
exception when others then
raise exception '%,%',sqlstate,sqlerrm;
return 0;
end;
return (select count(*) from backup_pg_replication_slot_logical);
end;
$function$;
f_copy_pg_replication_slot_logical
CREATE OR REPLACE FUNCTION public.f_copy_pg_replication_slot_logical()
RETURNS bool
LANGUAGE plpgsql
AS $function$
DECLARE
slot name;
sql varchar;
BEGIN
begin
for slot in select slot_name from public.backup_pg_replication_slot_logical
loop
sql := 'copy (select state_file from public.backup_pg_replication_slot_logical where slot_name='''||slot||''') to ''/tmp/'||slot||'.state'' (format binary)';
raise notice '%',sql;
execute sql;
end loop;
exception when others then
raise exception '%,%',sqlstate,sqlerrm;
return false;
end;
return true;
end;
$function$;
「喜欢这篇文章,您的关注和赞赏是给作者最好的鼓励」
关注作者
【版权声明】本文为墨天轮用户原创内容,转载时必须标注文章的来源(墨天轮),文章链接,文章作者等基本信息,否则作者和墨天轮有权追究责任。如果您发现墨天轮中有涉嫌抄袭或者侵权的内容,欢迎发送邮件至:contact@modb.pro进行举报,并提供相关证据,一经查实,墨天轮将立刻删除相关内容。




