作者
digoal
日期
2019-05-03
标签
PostgreSQL , replication , slot , logical , stream
背景
通过物理流复制或逻辑流复制,PG可以构建实时的standby。
物理流复制的standby可以构建级联standby,创建物理slot等。
1、物理standby支持创建物理slot
2、物理standby支持级联物理standby
但是物理流复制的STANDBY:
1、物理standby不支持逻辑复制,例如创建逻辑slot decode
2、物理standby不支持publication
3、物理standby不支持级联逻辑复制
同时不管是逻辑standby还是物理standby,都不支持slot信息的同步。
1、流复制暂时不支持slot同步(physical and logical)(所以也不支持漂移)
重点:了解这些信息后,在使用流复制的HA架构时,就需要考虑一些问题:
1、如果你在流复制HA架构中,使用了逻辑复制(replication slot, publication\subscriber, logical decoding),那么在发生主从切换后,这些信息都会丢失。
如果有这种场景使用的需求,同时有HA的需求,那么可以考虑换一下HA架构,例如共享存储的传统HA架构,数据只有一份,只有一个节点ACTIVE,另一个节点在主节点异常时接管存储,启动数据库。
例子
1、主节点数据目录
total 156K
drwx------ 6 pg12 pg12 4.0K May 3 10:50 base
-rw------- 1 pg12 pg12 29 May 3 17:36 current_logfiles
drwx------ 2 pg12 pg12 4.0K May 3 17:36 global
drwx------ 2 pg12 pg12 4.0K May 3 17:36 log
drwx------ 2 pg12 pg12 4.0K May 3 10:21 pg_commit_ts
drwx------ 2 pg12 pg12 4.0K May 3 10:21 pg_dynshmem
-rw------- 1 pg12 pg12 4.5K May 3 10:21 pg_hba.conf
-rw------- 1 pg12 pg12 1.6K May 3 10:21 pg_ident.conf
drwx------ 4 pg12 pg12 4.0K May 3 18:06 pg_logical
drwx------ 4 pg12 pg12 4.0K May 3 10:21 pg_multixact
drwx------ 2 pg12 pg12 4.0K May 3 17:36 pg_notify
drwx------ 4 pg12 pg12 4.0K May 3 17:41 pg_replslot
drwx------ 2 pg12 pg12 4.0K May 3 10:21 pg_serial
drwx------ 2 pg12 pg12 4.0K May 3 10:21 pg_snapshots
drwx------ 2 pg12 pg12 4.0K May 3 17:36 pg_stat
drwx------ 2 pg12 pg12 4.0K May 3 20:24 pg_stat_tmp
drwx------ 2 pg12 pg12 4.0K May 3 10:21 pg_subtrans
drwx------ 2 pg12 pg12 4.0K May 3 10:21 pg_tblspc
drwx------ 2 pg12 pg12 4.0K May 3 10:21 pg_twophase
-rw------- 1 pg12 pg12 3 May 3 10:21 PG_VERSION
drwx------ 3 pg12 pg12 28K May 3 17:41 pg_wal
drwx------ 2 pg12 pg12 4.0K May 3 10:21 pg_xact
-rw------- 1 pg12 pg12 88 May 3 10:21 postgresql.auto.conf
-rw------- 1 pg12 pg12 27K May 3 17:36 postgresql.conf
-rw------- 1 pg12 pg12 32 May 3 17:36 postmaster.opts
-rw------- 1 pg12 pg12 88 May 3 17:36 postmaster.pid
2、在主节点创建物理slot
select pg_create_physical_replication_slot('phy1');
3、在主节点创建逻辑slot
select pg_create_logical_replication_slot('log1','test_decoding');
4、在主节点查看SLOT
postgres=# select * from pg_get_replication_slots();
slot_name | plugin | slot_type | datoid | temporary | active | active_pid | xmin | catalog_xmin | restart_lsn | confirmed_flush_lsn
-----------+---------------+-----------+--------+-----------+--------+------------+------+--------------+-------------+---------------------
phy1 | | physical | | f | f | | | | |
log1 | test_decoding | logical | 13585 | f | f | | | 534 | 1/AB000110 | 1/AB000148
(2 rows)
5、在主节点,查看数据目录中的slot目录信息
pg12@pg11-test-> ll pg_replslot/
total 8.0K
drwx------ 2 pg12 pg12 4.0K May 3 17:37 log1
drwx------ 2 pg12 pg12 4.0K May 3 17:35 phy1
6、创建主节点的物理STANDBY,并查看slot目录,没有任何SLOT信息
pg12@pg11-test-> pwd
/data01/pg12/pg_root12001
pg12@pg11-test-> ll pg_replslot/
total 0
7、在物理STANDBY节点,创建逻辑slot,报错,因为物理STANDBY(处于恢复状态的数据库)不允许创建logical slot。
```
postgres=# select pg_is_in_recovery();
pg_is_in_recovery
t
(1 row)
postgres=# \set VERBOSITY verbose
postgres=# select pg_create_logical_replication_slot('log1','test_decoding');
psql: ERROR: 0A000: logical decoding cannot be used while in recovery
LOCATION: CheckLogicalDecodingRequirements, logical.c:113
```
8、在物理STANDBY节点,创建物理SLOT,允许,这个SLOT与主节点的SLOT是两个完全独立的SLOT,各自记录各自下游节点的WAL接收状态与位点信息。
```
postgres=# select pg_create_physical_replication_slot('phy1');
pg_create_physical_replication_slot
(phy1,)
(1 row)
postgres=# select * from pg_get_replication_slots();
slot_name | plugin | slot_type | datoid | temporary | active | active_pid | xmin | catalog_xmin | restart_lsn | confirmed_flush_lsn
-----------+--------+-----------+--------+-----------+--------+------------+------+--------------+-------------+---------------------
phy1 | | physical | | f | f | | | | |
(1 row)
```
9、在物理STANDBY节点,创建发布(publication),异常,因为物理STANDBY节点不允许创建逻辑SLOT。
postgres=# create publication pub1 ;
psql: ERROR: 25006: cannot execute CREATE PUBLICATION in a read-only transaction
LOCATION: PreventCommandIfReadOnly, utility.c:246
10、在主节点,创建publication,并将测试表放入publication
```
postgres=# create publication pub1 ;
CREATE PUBLICATION
postgres=# alter publication pub1 add table test;
ALTER PUBLICATION
postgres=# select relid::regclass from pg_get_publication_tables('pub1');
relid
test
(1 row)
postgres=# select * from pg_publication_tables;
pubname | schemaname | tablename
---------+------------+-----------
pub1 | public | test
(1 row)
```
11、在主节点的物理STANDBY节点,可以查到这个publication元数据,但是实际上这个元数据并没有用户,因为它在standby上面没有对应的SLOT。
postgres=# select * from pg_publication_tables;
pubname | schemaname | tablename
---------+------------+-----------
pub1 | public | test
(1 row)
12、在第三个实例,基于standby创建sub 订阅,失败,因为物理standby不能创建逻辑SLOT。
```
postgres=# create table test(id int primary key, info text, crt_time timestamp);
CREATE TABLE
postgres=# create subscription sub2 connection 'host=127.0.0.1 port=12001 user=postgres dbname=postgres' publication pub1;
ERROR: could not create replication slot "sub2": ERROR: logical decoding cannot be used while in recovery
postgres=# \set VERBOSITY verbose
postgres=# create subscription sub2 connection 'host=127.0.0.1 port=12001 user=postgres dbname=postgres' publication pub1;
ERROR: XX000: could not create replication slot "sub2": ERROR: logical decoding cannot be used while in recovery
LOCATION: libpqrcv_create_slot, libpqwalreceiver.c:831
```
13、在第三个实例,基于主实例创建sub 订阅,正常,因为主节点可以创建逻辑slot。
```
postgres=# create subscription sub2 connection 'host=127.0.0.1 port=12000 user=postgres dbname=postgres' publication pub1;
NOTICE: 00000: created replication slot "sub2" on publisher
LOCATION: CreateSubscription, subscriptioncmds.c:470
CREATE SUBSCRIPTION
数据已完全初始化
postgres=# select count(*) from test;
count
1000000
(1 row)
```
14、在第三个实例,将sub的连接上游信息从主节点改成物理STANDBY节点
postgres=# alter subscription sub2 connection 'host=127.0.0.1 port=12001 user=postgres dbname=postgres';
ALTER SUBSCRIPTION
15、在主节点写入新的记录,并且保证standby已复制这条记录。
```
insert into test values (0);
standby
postgres=# select count(*) from test;
count
1000001
(1 row)
```
16、在第三个实例,发现这条记录没有被订阅,原因是当前订阅连接的是物理STANDBY。
```
postgres=# select count(*) from test;
count
1000000
(1 row)
```
观察物理STANDBY的日志,报错如下
logical decoding cannot be used while in recovery
2019-05-03 21:48:07.111 CST,,,32611,"127.0.0.1:52194",5ccc4697.7f63,1,"",2019-05-03 21:48:07 CST,,0,LOG,00000,"connection received: host=127.0.0.1 port=52194",,,,,,,,"BackendInitialize, postmaster.c:4289",""
2019-05-03 21:48:07.112 CST,"postgres","postgres",32611,"127.0.0.1:52194",5ccc4697.7f63,2,"authentication",2019-05-03 21:48:07 CST,2/38,0,LOG,00000,"replication connection authorized: user=postgres application_name=sub2",,,,,,,,"PerformAuthentication, postinit.c:276",""
2019-05-03 21:48:07.113 CST,"postgres","postgres",32611,"127.0.0.1:52194",5ccc4697.7f63,3,"idle",2019-05-03 21:48:07 CST,2/0,0,ERROR,0A000,"logical decoding cannot be used while in recovery",,,,,,,,"CheckLogicalDecodingRequirements, logical.c:113","sub2"
2019-05-03 21:48:07.113 CST,"postgres","postgres",32611,"127.0.0.1:52194",5ccc4697.7f63,4,"idle",2019-05-03 21:48:07 CST,,0,LOG,00000,"disconnection: session time: 0:00:00.002 user=postgres database=postgres host=127.0.0.1 port=52194",,,,,,,,"log_disconnections, postgres.c:4679","sub2"
17、在第三个实例,将SUB的连接信息改回主节点
postgres=# alter subscription sub2 connection 'host=127.0.0.1 port=12000 user=postgres dbname=postgres';
ALTER SUBSCRIPTION
订阅到了最新记录
```
postgres=# select count(*) from test;
count
1000001
(1 row)
```
18、在第三个实例,查询sub订阅信息,SUB2为本文创建的订阅通道。
```
postgres=# select oid,* from pg_subscription;
oid | subdbid | subname | subowner | subenabled | subconninfo | subslotname | subsynccommit | subpublications
---------+---------+---------+----------+------------+------------------------------------------------------------------------------+-------------+---------------+-----------------
1932539 | 13285 | sub1 | 10 | t | hostaddr=127.0.0.1 port=1921 user=postgres dbname=postgres password=postgres | sub1 | off | {pub1}
1940851 | 13285 | sub2 | 10 | t | host=127.0.0.1 port=12000 user=postgres dbname=postgres | sub2 | off | {pub1}
(2 rows)
postgres=# select * from pg_stat_get_subscription(1940851);
subid | relid | pid | received_lsn | last_msg_send_time | last_msg_receipt_time | latest_end_lsn | latest_end_time
---------+-------+-------+--------------+-------------------------------+------------------------------+----------------+-------------------------------
1940851 | | 32691 | 1/B1012438 | 2019-05-03 21:48:52.217116+08 | 2019-05-03 21:48:52.21716+08 | 1/B1012438 | 2019-05-03 21:48:52.217116+08
(1 row)
```
可以看到sub订阅信息记录了latest_end_lsn,即接收到的上游节点的REDO解析位点的逻辑日志。
19、在主节点,可以看到逻辑复制SUB2对应的slot信息
```
pg12@pg11-test-> psql
psql (12devel)
Type "help" for help.
postgres=# select * from pg_get_replication_slots();
slot_name | plugin | slot_type | datoid | temporary | active | active_pid | xmin | catalog_xmin | restart_lsn | confirmed_flush_lsn
-----------+---------------+-----------+--------+-----------+--------+------------+------+--------------+-------------+---------------------
phy1 | | physical | | f | f | | | | |
log1 | test_decoding | logical | 13585 | f | f | | | 534 | 1/AB000110 | 1/AB000148
sub2 | pgoutput | logical | 13585 | f | t | 32692 | | 538 | 1/B1012400 | 1/B1012438
(3 rows)
```
20、在第三个实例,再次将sub定义的上游改成物理standby节点。
postgres=# alter subscription sub2 connection 'host=127.0.0.1 port=12001 user=postgres dbname=postgres';
ALTER SUBSCRIPTION
21、正常的主备切换,不丢任何数据
pg_ctl stop -m fast -D 主节点目录
激活从库
pg_ctl promote -D 从节点目录
22、激活后的节点,并没有任何SLOT信息。
```
pg12@pg11-test-> psql -h 127.0.0.1 -p 12001
psql (12devel)
Type "help" for help.
postgres=# select * from pg_get_replication_slots();
slot_name | plugin | slot_type | datoid | temporary | active | active_pid | xmin | catalog_xmin | restart_lsn | confirmed_flush_lsn
-----------+--------+-----------+--------+-----------+--------+------------+------+--------------+-------------+---------------------
(0 rows)
```
23、在激活后的节点上,查看日志,发现报没有sub2 slot的错误。原因是流复制并没有同步SLOT信息,slot信息全部都在原来的主节点上面,在激活后的从节点上没有任何SLOT信息。
2019-05-03 21:58:12.368 CST,,,33474,"127.0.0.1:52282",5ccc48f4.82c2,1,"",2019-05-03 21:58:12 CST,,0,LOG,00000,"connection received: host=127.0.0.1 port=52282",,,,,,,,"BackendInitialize, postmaster.c:4289",""
2019-05-03 21:58:12.369 CST,"postgres","postgres",33474,"127.0.0.1:52282",5ccc48f4.82c2,2,"authentication",2019-05-03 21:58:12 CST,4/38,0,LOG,00000,"replication connection authorized: user=postgres application_name=sub2",,,,,,,,"PerformAuthentication, postinit.c:276",""
2019-05-03 21:58:12.371 CST,"postgres","postgres",33474,"127.0.0.1:52282",5ccc48f4.82c2,3,"idle",2019-05-03 21:58:12 CST,4/0,0,ERROR,42704,"replication slot ""sub2"" does not exist",,,,,,,,"ReplicationSlotAcquire, slot.c:387","sub2"
2019-05-03 21:58:12.371 CST,"postgres","postgres",33474,"127.0.0.1:52282",5ccc48f4.82c2,4,"idle",2019-05-03 21:58:12 CST,,0,LOG,00000,"disconnection: session time: 0:00:00.002 user=postgres database=postgres host=127.0.0.1 port=52282",,,,,,,,"log_disconnections, postgres.c:4679","sub2"
24、对于这种流复制的HA架构,如果使用了PG内置的逻辑订阅功能,那么发生HA切换后,需要在激活后的新主库上创建对应的SLOT。
例如
```
pg12@pg11-test-> psql -h 127.0.0.1 -p 12001
psql (12devel)
Type "help" for help.
postgres=# select pg_create_logical_replication_slot('sub2','pgoutput');
pg_create_logical_replication_slot
(sub2,1/B20059E0)
(1 row)
pg12@pg11-test-> psql -h 127.0.0.1 -p 12001
psql (12devel)
Type "help" for help.
postgres=# select * from pg_get_replication_slots();
slot_name | plugin | slot_type | datoid | temporary | active | active_pid | xmin | catalog_xmin | restart_lsn | confirmed_flush_lsn
-----------+----------+-----------+--------+-----------+--------+------------+------+--------------+-------------+---------------------
sub2 | pgoutput | logical | 13585 | f | t | 34972 | | 540 | 1/B20059A8 | 1/B20059E0
(1 row)
```
新主库对应日志
2019-05-03 22:14:09.002 CST,,,34972,"127.0.0.1:52668",5ccc4cb1.889c,1,"",2019-05-03 22:14:09 CST,,0,LOG,00000,"connection received: host=127.0.0.1 port=52668",,,,,,,,"BackendInitialize, postmaster.c:4289",""
2019-05-03 22:14:09.003 CST,"postgres","postgres",34972,"127.0.0.1:52668",5ccc4cb1.889c,2,"authentication",2019-05-03 22:14:09 CST,5/434,0,LOG,00000,"replication connection authorized: user=postgres application_name=sub2",,,,,,,,"PerformAuthentication, postinit.c:276",""
2019-05-03 22:14:09.004 CST,"postgres","postgres",34972,"127.0.0.1:52668",5ccc4cb1.889c,3,"idle",2019-05-03 22:14:09 CST,5/0,0,LOG,00000,"starting logical decoding for slot ""sub2""","Streaming transactions committing after 1/B20059E0, reading WAL from 1/B20059A8.",,,,,,,"CreateDecodingContext, logical.c:447","sub2"
2019-05-03 22:14:09.004 CST,"postgres","postgres",34972,"127.0.0.1:52668",5ccc4cb1.889c,4,"idle",2019-05-03 22:14:09 CST,5/0,0,LOG,00000,"logical decoding found consistent point at 1/B20059A8","There are no running transactions.",,,,,,,"SnapBuildFindSnapshot, snapbuild.c:1272","sub2"
25、注意,由于新建的SLOT位点超出了sub2逻辑订阅的位点,所以中间没有解析的REDO对应的变更会丢失。
```
sub端要的是这里开始的REDO
1/B1012438
新主库创建的逻辑SLOT,起始位点是:
1/B20059A8
这两个位置存在差异,新主库对应位点已经更后了,所以丢了一部分信息
```
事实上可以看到values -1对应的记录没有被订阅到,因为这条记录对应的REOD是在订阅端没有接收到的。
postgres=# select * from test where id=-1;
id | info | crt_time
----+------+----------
(0 rows)
在新主库新产生的日志可以被回放。
```
postgres=# insert into test values (-2);
INSERT 0 1
订阅端
postgres=# select * from test where id=-2;
id | info | crt_time
----+------+----------
-2 | |
(1 row)
但是对于key -1的操作,不会被复制
postgres=# update test set info='abc' where id=-1;
UPDATE 1
postgres=# update test set info='abc' where id=-2;
UPDATE 1
订阅端
postgres=# select * from test where id=-1;
id | info | crt_time
----+------+----------
(0 rows)
postgres=# select * from test where id=-2;
id | info | crt_time
----+------+----------
-2 | abc |
(1 row)
```
小结
物理流复制的standby可以构建级联standby,创建物理slot等。
1、物理standby支持创建物理slot
2、物理standby支持级联物理standby
物理流复制的STANDBY不支持逻辑复制、SLOT同步、逻辑订阅功能:
1、物理standby不支持逻辑复制,例如创建逻辑slot decode
2、物理standby不支持publication
3、物理standby不支持级联逻辑复制
不管是逻辑standby还是物理standby,都不支持slot信息的同步。
1、流复制暂时不支持slot同步(physical and logical)(所以也不支持漂移)
重点:如果你在流复制HA架构中,使用了逻辑复制(replication slot, publication\subscriber, logical decoding),那么在发生主从切换后,这些信息都会丢失。
如果有这种场景使用的需求,同时有HA的需求,那么可以考虑换一下HA架构,例如共享存储的传统HA架构,数据只有一份,只有一个节点ACTIVE,另一个节点在主节点异常时接管存储,启动数据库。
参考
https://www.postgresql.org/docs/devel/runtime-config-wal.html#RUNTIME-CONFIG-WAL-ARCHIVE-RECOVERY
PostgreSQL 13 将支持slot主从同步,
https://www.postgresql.org/message-id/flat/3095349b-44d4-bf11-1b33-7eefb585d578@2ndquadrant.com
https://commitfest.postgresql.org/23/1961/
PostgreSQL 许愿链接
您的愿望将传达给PG kernel hacker、数据库厂商等, 帮助提高数据库产品质量和功能, 说不定下一个PG版本就有您提出的功能点. 针对非常好的提议,奖励限量版PG文化衫、纪念品、贴纸、PG热门书籍等,奖品丰富,快来许愿。开不开森.
9.9元购买3个月阿里云RDS PostgreSQL实例
PostgreSQL 解决方案集合
德哥 / digoal's github - 公益是一辈子的事.





