背景
流复制起源
PostgreSQL 自从2010年推出的9.0版本开始,支持流式物理复制,用户可以通过流式复制,构建只读备库(主备物理复制,块级别一致)。流式物理复制可以做到极低的延迟(通常在1毫秒以内)。
同步流复制
2011年推出的9.1版本,支持同步复制,当时只支持一个同步流复制备节点(例如配置了3个备,只有一个是同步模式的,其他都是异步模式)。
在同步复制模式下,当用户提交事务时,需要等待这笔事务的WAL日志复制到同步流复制备节点,才会返回提交成功的ACK给客户端。
同步模式下,可以确保数据的0丢失。(只要客户端收到了事务提交成功的ACK,这笔事务的WAL就有两份。)
级联流复制
2012年推出的9.2版本,支持级联流复制。意思是备库还可以再连备库。
级联复制特别适合跨机房的使用,例如主库在A机房,备库在B机房,但是B机房需要建立多个备库时,那么B机房只需要建立一个直连主库的备库,其他的备库可以通过B机房的这个备库级联产生。从而减少网络开销。
流复制模拟实验
1、环境准备
三台服务器:
server0:192.168.56.20
server1:192.168.56.21
server2:192.168.56.22
pg版本10.3
rpm包安装,server0,server1,server2
rpm
-ivh postgresql10-libs-10.3-1PGDG.rhel6.x86_64.rpm
rpm -ivh
postgresql10-10.3-1PGDG.rhel6.x86_64.rpm
postgresql10-contrib-10.3-1PGDG.rhel6.x86_64.rpm --nodeps --force
rpm -ivh
postgresql10-server-10.3-1PGDG.rhel6.x86_64.rpm --nodeps --force
1.1设置环境变量
export PGHOME=/usr/pgsql-10
export
PGDATA=/var/lib/pgsql/10/data
export
PATH=$PGHOME/bin/:$PATH:$HOME/bin
pg_ctl start -D var/lib/pgsql/9.6/data
1.2、初始化数据库省略
2、检查主库近程结构--此时无wal日志发送进程
-bash-4.1$ ps -axf | grep postgres
Warning: bad syntax, perhaps a bogus
'-'? See usr/share/doc/procps-3.2.8/FAQ
7525 pts/2 S 0:00
\_ su - postgres
8279 pts/2 S+ 0:00 \_ grep
postgres
8256 pts/2 S 0:00 usr/pgsql-10/bin/postgres
8257
? Ss 0:00 \_ postgres: logger process
8259 ? Ss
0:00 \_ postgres: checkpointer process
8260 ? Ss 0:00 \_
postgres: writer process
8261 ? Ss 0:00 \_ postgres: wal
writer process
8262 ? Ss 0:00 \_ postgres: autovacuum
launcher process
8263 ? Ss 0:00 \_ postgres: stats collector
process
8264 ? Ss 0:00 \_ postgres: bgworker: logical
replication launcher
3、主库配置,192.168.56.20
3.1主库postgres.conf配置
wal_level = replica # 9.6版本以前的archive和hot_standby
--该级别支持wal归档和复制。
max_wal_senders = 32
#这个设置了可以最多有几个流复制连接,差不多有几个从,就设置几个
wal_keep_segments = 256
#设置流复制保留的最多的xlog数目
wal_sender_timeout = 60s #
设置流复制主机发送数据的超时时间
synchronous_standby_names = '*'
#指定同步流复制的备库
max_connections = 100 #
这个设置要注意下,从库的max_connections必须要大于主库的
max_replication_slots = 5 #
pg9.4以后的特性
3.2、修改pg_hba.conf,增加replica用户,进行同步。
host replication postgres 192.168.56.0/24
md5
3.3、重启主库,检查wal日志传送进程
pg_ctl restart
4、备库操作
192.168.56.21
192.168.56.22
4.1、使用pg_basebackup建备库
pg_basebackup -F p --progress -D
/var/lib/pgsql/10/data -h 192.168.56.20 -p 5432 -X stream -U postgres --password
-l postgresbackup201800816
4.2、修改参数配置
进入到/data/pgsql/data2目录,复制recovery.conf,这个文件可以从pg的安装目录的share文件夹中获取,比如
standby_mode
= on # 这个说明这台机器为从库
primary_conninfo = 'host=192.168.56.20 port=5432
user=postgres password=oracle' # 这个说明这台机器对应主库的信息
recovery_target_timeline =
'latest' # 这个说明这个流复制同步到最新的数据
postgresql.conf中也有几个地方要进行修改
max_connections = 200 # 一般查多于写的应用从库的最大连接数要比较大
hot_standby =
on # 说明这台机器不仅仅是用于数据归档,也用于数据查询
max_standby_streaming_delay =
30s # 数据流备份的最大延迟时间
wal_receiver_status_interval = 10s #
多久向主报告一次从的状态,当然从每次数据复制都会向主报告状态,这里只是设置最长的间隔时间
hot_standby_feedback =
on # 如果有错误的数据复制,是否向主进行反馈
--启动备库
pg_ctl start -D var/lib/pgsql/10/data
4.3、同样模式搭建第二备库
192.168.56.22
pg_basebackup -F p --progress -D
/var/lib/pgsql/10/data -h 192.168.56.20 -p 5432 -X stream -U postgres --password
-l postgresbackup201800816
5、流复制状态检查
5.1、【主库】查看配置结果
postgres=# select * from
pg_stat_replication ;
-[ RECORD 1
]----+------------------------------
pid |
8334
usesysid | 10
usename | postgres
application_name
| walreceiver
client_addr | 192.168.56.22
client_hostname |
client_port | 42392
backend_start | 2018-08-16
14:38:35.103468+08
backend_xmin | 558
state |
streaming
sent_lsn | 0/B000140
write_lsn |
0/B000140
flush_lsn | 0/B000140
replay_lsn |
0/B000140
write_lag |
flush_lag |
replay_lag |
sync_priority | 1
sync_state | potential---可转为同步流复制
-[ RECORD
2 ]----+------------------------------
pid |
8308
usesysid | 10
usename | postgres
application_name
| walreceiver
client_addr | 192.168.56.21
client_hostname |
client_port | 41899
backend_start | 2018-08-16
14:31:20.491701+08
backend_xmin | 558
state |
streaming
sent_lsn | 0/B000140
write_lsn |
0/B000140
flush_lsn | 0/B000140
replay_lsn |
0/B000140
write_lag |
flush_lag |
replay_lag |
sync_priority | 1
sync_state | sync--同步流复制
5.2、主库进程信息
-bash-4.1$ ps -axf | grep postgres
Warning: bad syntax,
perhaps a bogus '-'? See usr/share/doc/procps-3.2.8/FAQ
7525 pts/2
S 0:00 \_ su - postgres
8310 pts/2 S+
0:00 \_ grep postgres
8256 pts/2 S 0:00
/usr/pgsql-10/bin/postgres
8257 ? Ss 0:00 \_ postgres: logger
process
8259 ? Ss 0:00 \_ postgres: checkpointer process
8260 ? Ss 0:00 \_ postgres: writer process
8261
? Ss 0:00 \_ postgres: wal writer process
8262 ?
Ss 0:00 \_ postgres: autovacuum launcher process
8263 ?
Ss 0:00 \_ postgres: stats collector process
8264 ? Ss
0:00 \_ postgres: bgworker: logical replication launcher
8308 ?
Ss 0:00 \_ postgres: wal sender process postgres 192.168.56.21(41899)
streaming 0/9000060--出现wal发送进程
8334 ? Ss 0:00 \_ postgres: wal
sender process postgres 192.168.56.22(42392) streaming
0/B000060--出现wal发送进程
5.3、备库进程信息
192.168.56.21
-bash-4.1$ ps -axf |
grep postgres
Warning: bad syntax, perhaps a bogus '-'? See
/usr/share/doc/procps-3.2.8/FAQ
13036 pts/1 S 0:00 | \_ su -
postgres
13382 pts/1 S+ 0:00 | \_ grep
postgres
13368 pts/1 S 0:00 usr/pgsql-10/bin/postgres
13369
? Ss 0:00 \_ postgres: logger process
13370 ? Ss
0:00 \_ postgres: startup process recovering
000000010000000000000009
13371 ? Ss 0:00 \_ postgres:
checkpointer process
13372 ? Ss 0:00 \_ postgres: writer
process
13373 ? Ss 0:00 \_ postgres: stats collector
process
13374 ? Ss 0:00 \_ postgres: wal receiver process
streaming 0/9000140
第二备库192.168.56.22
-bash-4.1$ ps -axf | grep postgres
Warning: bad
syntax, perhaps a bogus '-'? See usr/share/doc/procps-3.2.8/FAQ
4182
pts/1 S 0:00 \_ su - postgres
4581 pts/1 S+
0:00 \_ grep postgres
4571 pts/1 S 0:00
/usr/pgsql-10/bin/postgres -D var/lib/pgsql/10/data
4572 ? Ss
0:00 \_ postgres: logger process
4573 ?
Ss 0:00 \_ postgres: startup process recovering
00000001000000000000000B
4574 ? Ss 0:00 \_ postgres:
checkpointer process
4575 ? Ss 0:00 \_
postgres: writer process
4576 ? Ss
0:00 \_ postgres: stats collector process
4577 ?
Ss 0:00 \_ postgres: wal receiver process streaming 0/B000060
6、实现异步流复制
注释该参数,实现异步流复制
#synchronous_standby_names = '*'
#指定流复制的模式
使该参数生效
pg_ctl reload
postgres=# select * from pg_stat_replication;
-[ RECORD 1
]----+------------------------------
pid |
8334
usesysid | 10
usename | postgres
application_name
| walreceiver
client_addr | 192.168.56.22
client_hostname |
client_port | 42392
backend_start | 2018-08-16
14:38:35.103468+08
backend_xmin | 558
state |
streaming
sent_lsn | 0/B000140
write_lsn |
0/B000140
flush_lsn | 0/B000140
replay_lsn |
0/B000140
write_lag |
flush_lag |
replay_lag |
sync_priority | 0
sync_state | async--异步流复制
-[ RECORD 2
]----+------------------------------
pid |
8308
usesysid | 10
usename | postgres
application_name
| walreceiver
client_addr | 192.168.56.21
client_hostname |
client_port | 41899
backend_start | 2018-08-16
14:31:20.491701+08
backend_xmin | 558
state |
streaming
sent_lsn | 0/B000140
write_lsn |
0/B000140
flush_lsn | 0/B000140
replay_lsn |
0/B000140
write_lag |
flush_lag |
replay_lag |
sync_priority | 0
sync_state | async----异步流复制
7、实现级联流复制
7.1配置参数
192.168.56.20(主库)>192.168.56.21(第一备库)>192.168.56.22(级联备库)
---修改recovery.conf--primary_conninfo信息,指定主库信息
指定主库为192.168.56.21
primary_conninfo
= 'host=192.168.56.21 port=5432 user=postgres password=oracle' #
这个说明这台机器对应主库的信息
--重启备库192.168.56.22
pg_ctl restart
7.2
检查级联备库的情况
--第一备库的情况
postgres=# select * from pg_stat_replication;
-[
RECORD 1 ]----+------------------------------
pid |
8308
usesysid | 10
usename | postgres
application_name
| walreceiver
client_addr | 192.168.56.21
client_hostname |
client_port | 41899
backend_start | 2018-08-16
14:31:20.491701+08
backend_xmin | 558
state |
streaming
sent_lsn | 0/B000140
write_lsn |
0/B000140
flush_lsn | 0/B000140
replay_lsn |
0/B000140
write_lag |
flush_lag |
replay_lag |
sync_priority | 0
sync_state | async
第二备库的情况
postgres=# select * from pg_stat_replication;
-[ RECORD 1
]----+------------------------------
pid |
13446
usesysid | 10
usename |
postgres
application_name | walreceiver
client_addr |
192.168.56.22
client_hostname |
client_port |
48650
backend_start | 2018-05-21 15:19:06.419504+08
backend_xmin |
558
state | streaming
sent_lsn |
0/B000140
write_lsn | 0/B000140
flush_lsn |
0/B000140
replay_lsn | 0/B000140
write_lag |
flush_lag |
replay_lag |
sync_priority |
0
sync_state | async




