0. 概述
主要检查一下三方面:
- 容灾状态
- 集群状态
- 流复制状态
1. 常用指令
gs_sdr -t query
gs_om -t status --detail
gs_ctl query
select * from pg_stat_replication;
select * from pg_replication_slots;
cm_ctl query -Cvidp
gs_ctl status
2. 未建立容灾关系时
- hadr_cluster_stat 为 normal:
[omm@panweidb01 ~]$ gs_sdr -t query
--------------------------------------------------------------------------------
Streaming disaster recovery query cdf29f001da511efbb2f000c29a96163
--------------------------------------------------------------------------------
Start streaming disaster query.
Start check archive.
Start check recovery.
Start check RPO & RTO.
Successfully executed streaming disaster recovery query, result:
{'hadr_cluster_stat': 'normal', 'hadr_failover_stat': '', 'hadr_switchover_stat': '', 'RPO': '', 'RTO': ''}
3. 建立容灾关系后的正常状态
3.1 主集群主节点
3.1.1 容灾状态
- hadr_cluster_stat 为 archive:
[omm@apanweidb01 ~]$ gs_sdr -t query
--------------------------------------------------------------------------------
Streaming disaster recovery query d6b666261da511efa53c000c29fd99eb
--------------------------------------------------------------------------------
Start streaming disaster query.
Start check archive.
Start check recovery.
Start check RPO & RTO.
Successfully executed streaming disaster recovery query, result:
{'hadr_cluster_stat': 'archive', 'hadr_failover_stat': '', 'hadr_switchover_stat': '', 'RPO': '0', 'RTO': '0'}
3.1.2 集群状态
[omm@apanweidb02 ~]$ gs_om -t status --detail
[ CMServer State ]
node node_ip instance state
---------------------------------------------------------------------------
1 apanweidb01 <主集群1-ip> 1 /database/panweidb/cm/cm_server Standby
2 apanweidb02 <主集群2-ip> 2 /database/panweidb/cm/cm_server Standby
3 apanweidb03 <主集群3-ip> 3 /database/panweidb/cm/cm_server Primary
[ Cluster State ]
cluster_state : Normal
redistributing : No
balanced : Yes
current_az : AZ_ALL
[ Datanode State ]
node node_ip instance state
-----------------------------------------------------------------------------
1 apanweidb01 <主集群1-ip> 6001 /database/panweidb/data P Primary Normal
2 apanweidb02 <主集群2-ip> 6002 /database/panweidb/data S Standby Normal
3 apanweidb03 <主集群3-ip> 6003 /database/panweidb/data S Standby Normal
3.1.3 流复制状态
- 流复制会包含主集群备节点(out)和备集群主节点(out):
[omm@apanweidb01 ~]$ gs_ctl query
[2024-07-15 20:23:10.078][2670696][][gs_ctl]: gs_ctl query ,datadir is /database/panweidb/data
HA state:
local_role : Primary
static_connections : 5
db_state : Normal
detail_information : Normal
Senders info:
sender_pid : 1868144
local_role : Primary
peer_role : Standby
peer_state : Normal
state : Streaming
sender_sent_location : 0/2CCEE3F8
sender_write_location : 0/2CCEE3F8
sender_flush_location : 0/2CCEE3F8
sender_replay_location : 0/2CCEE3F8
receiver_received_location : 0/2CCEE3F8
receiver_write_location : 0/2CCEE3F8
receiver_flush_location : 0/2CCEE3F8
receiver_replay_location : 0/2CCEE3F8
sync_percent : 100%
sync_state : Quorum
sync_priority : 1
sync_most_available : Off
channel : <主集群1-ip>:15401--><主集群2-ip>:40332
sender_pid : 1868143
local_role : Primary
peer_role : Standby
peer_state : Normal
state : Streaming
sender_sent_location : 0/2CCEE3F8
sender_write_location : 0/2CCEE3F8
sender_flush_location : 0/2CCEE3F8
sender_replay_location : 0/2CCEE3F8
receiver_received_location : 0/2CCEE3F8
receiver_write_location : 0/2CCEE3F8
receiver_flush_location : 0/2CCEE3F8
receiver_replay_location : 0/2CCEE3F8
sync_percent : 100%
sync_state : Quorum
sync_priority : 1
sync_most_available : Off
channel : <主集群1-ip>:15401--><主集群3-ip>:65376
sender_pid : 1868145
local_role : Primary
peer_role : Standby
peer_state : Normal
state : Streaming
sender_sent_location : 0/2CCEE3F8
sender_write_location : 0/2CCEE3F8
sender_flush_location : 0/2CCEE3F8
sender_replay_location : 0/2CCEE3F8
receiver_received_location : 0/2CCEE3F8
receiver_write_location : 0/2CCEE3F8
receiver_flush_location : 0/2CCEE3F8
receiver_replay_location : 0/2CCEE3F8
sync_percent : 100%
sync_state : Async
sync_priority : 0
sync_most_available : Off
channel : <主集群1-ip>:15401--><备集群1-ip>:64174
Receiver info:
No information
3.1.4 复制槽状态
- 注意 dn_6001_hadr 是指向备集群主节点的:
postgres=# select * from pg_replication_slots;
slot_name | plugin | slot_type | datoid | database | active | xmin | catalog_xmin | restart_lsn | dummy_standby | confirmed_flush
--------------+--------+-----------+--------+----------+--------+------+--------------+-------------+---------------+-----------------
dn_6002 | | physical | 0 | | t | | | 0/2CCF8430 | f |
dn_6003 | | physical | 0 | | t | | | 0/2CCF8430 | f |
dn_6001_hadr | | physical | 0 | | t | | | 0/2CCF8430 | f |
(3 rows)
postgres=# select * from pg_stat_replication;
pid | usesysid | usename | application_name | client_addr | client_hostname | client_port | backend_start | state | sender_sent_location
| receiver_write_location | receiver_flush_location | receiver_replay_location | sync_priority | sync_state
----------------+----------+---------+------------------------------------+---------------+-----------------+-------------+-------------------------------+-----------+---------------------
-+-------------------------+-------------------------+--------------------------+---------------+------------
23369422993152 | 10 | omm | WalSender to Standby[hadr_dn_6001] | <备集群1-ip> | | 64174 | 2024-07-11 22:04:30.19872+08 | Streaming | 0/2CCF8430
| 0/2CCF8430 | 0/2CCF8430 | 0/2CCF8430 | 0 | Async
23369553135360 | 10 | omm | WalSender to Standby[dn_6003] | <主集群3-ip> | | 65376 | 2024-07-11 22:04:29.656365+08 | Streaming | 0/2CCF8430
| 0/2CCF8430 | 0/2CCF8430 | 0/2CCF8430 | 1 | Quorum
23369406211840 | 10 | omm | WalSender to Standby[dn_6002] | <主集群2-ip> | | 40332 | 2024-07-11 22:04:29.726514+08 | Streaming | 0/2CCF8430
| 0/2CCF8430 | 0/2CCF8430 | 0/2CCF8430 | 1 | Quorum
(3 rows)
3.2 备集群主节点
3.2.1 容灾状态
- hadr_cluster_stat 为 recovery:
[omm@bpanweidb02 ~]$ gs_sdr -t query
--------------------------------------------------------------------------------
Streaming disaster recovery query 55694ade1dc311efbd28000c29a8e394
--------------------------------------------------------------------------------
Start streaming disaster query.
Start check archive.
Start check recovery.
Start check RPO & RTO.
Successfully executed streaming disaster recovery query, result:
{'hadr_cluster_stat': 'recovery', 'hadr_failover_stat': '', 'hadr_switchover_stat': '', 'RPO': '', 'RTO': ''}
3.2.2 集群状态
- 注意dn的角色:
[omm@bpanweidb01 ~]$ gs_om -t status --detail
[ CMServer State ]
node node_ip instance state
---------------------------------------------------------------------------
1 bpanweidb01 <备集群1-ip> 1 /database/panweidb/cm/cm_server Standby
2 bpanweidb02 <备集群2-ip> 2 /database/panweidb/cm/cm_server Primary
3 bpanweidb03 <备集群3-ip> 3 /database/panweidb/cm/cm_server Standby
[ Cluster State ]
cluster_state : Normal
redistributing : No
balanced : Yes
current_az : AZ_ALL
[ Datanode State ]
node node_ip instance state
-----------------------------------------------------------------------------
1 bpanweidb01 <备集群1-ip> 6001 /database/panweidb/data P Main Standby Normal
2 bpanweidb02 <备集群2-ip> 6002 /database/panweidb/data S Cascade Standby Normal
3 bpanweidb03 <备集群3-ip> 6003 /database/panweidb/data S Cascade Standby Normal
3.2.3 流复制状态
- 流复制会包含主集群主节点(in)备集群备节点(out):
[omm@bpanweidb01 ~]$ gs_ctl query
[2024-07-15 20:25:36.254][3706201][][gs_ctl]: gs_ctl query ,datadir is /database/panweidb/data
HA state:
local_role : Main Standby
static_connections : 5
db_state : Normal
detail_information : Normal
Senders info:
sender_pid : 3250685
local_role : Main Standby
peer_role : Cascade Standby
peer_state : Normal
state : Streaming
sender_sent_location : 0/2CCEE638
sender_write_location : 0/2CCEE638
sender_flush_location : 0/2CCEE638
sender_replay_location : 0/2CCEE638
receiver_received_location : 0/2CCEE638
receiver_write_location : 0/2CCEE638
receiver_flush_location : 0/2CCEE638
receiver_replay_location : 0/2CCEE638
sync_percent : 100%
sync_state : Async
sync_priority : 0
sync_most_available : Off
channel : <备集群1-ip>:15401--><备集群3-ip>:40300
sender_pid : 3250692
local_role : Main Standby
peer_role : Cascade Standby
peer_state : Normal
state : Streaming
sender_sent_location : 0/2CCEE638
sender_write_location : 0/2CCEE638
sender_flush_location : 0/2CCEE638
sender_replay_location : 0/2CCEE638
receiver_received_location : 0/2CCEE638
receiver_write_location : 0/2CCEE638
receiver_flush_location : 0/2CCEE638
receiver_replay_location : 0/2CCEE638
sync_percent : 100%
sync_state : Async
sync_priority : 0
sync_most_available : Off
channel : <备集群1-ip>:15401--><备集群2-ip>:63784
Receiver info:
receiver_pid : 3250885
local_role : Main Standby
peer_role : Primary
peer_state : Normal
state : Normal
sender_sent_location : 0/2CCEE638
sender_write_location : 0/2CCEE638
sender_flush_location : 0/2CCEE638
sender_replay_location : 0/2CCEE638
receiver_received_location : 0/2CCEE638
receiver_write_location : 0/2CCEE638
receiver_flush_location : 0/2CCEE638
receiver_replay_location : 0/2CCEE638
sync_percent : 100%
channel : <备集群1-ip>:64174<--<主集群1-ip>:15401
3.2.4 复制槽状态
postgres=# select * from pg_replication_slots;
slot_name | plugin | slot_type | datoid | database | active | xmin | catalog_xmin | restart_lsn | dummy_standby | confirmed_flush
-----------+--------+-----------+--------+----------+--------+------+--------------+-------------+---------------+-----------------
dn_6002 | | physical | 0 | | t | | | 0/2CCF8670 | f |
dn_6003 | | physical | 0 | | t | | | 0/2CCF8670 | f |
(2 rows)
postgres=# select * from pg_stat_replication;
pid | usesysid | usename | application_name | client_addr | client_hostname | client_port | backend_start | state | sender_sent_location | re
ceiver_write_location | receiver_flush_location | receiver_replay_location | sync_priority | sync_state
----------------+----------+---------+-------------------------------+---------------+-----------------+-------------+-------------------------------+-----------+----------------------+---
----------------------+-------------------------+--------------------------+---------------+------------
23145466558208 | 10 | omm | WalSender to Standby[dn_6002] | <备集群2-ip> | | 63784 | 2024-07-11 22:04:16.878049+08 | Streaming | 0/2CCF8670 | 0/
2CCF8670 | 0/2CCF8670 | 0/2CCF8670 | 0 | Async
23145405675264 | 10 | omm | WalSender to Standby[dn_6003] | <备集群3-ip> | | 40300 | 2024-07-11 22:04:16.712418+08 | Streaming | 0/2CCF8670 | 0/
2CCF8670 | 0/2CCF8670 | 0/2CCF8670 | 0 | Async
(2 rows)
参考
- hadr_cluster_stat 含义请参考 openGauss官方文档
最后修改时间:2024-07-15 23:16:37
「喜欢这篇文章,您的关注和赞赏是给作者最好的鼓励」
关注作者
【版权声明】本文为墨天轮用户原创内容,转载时必须标注文章的来源(墨天轮),文章链接,文章作者等基本信息,否则作者和墨天轮有权追究责任。如果您发现墨天轮中有涉嫌抄袭或者侵权的内容,欢迎发送邮件至:contact@modb.pro进行举报,并提供相关证据,一经查实,墨天轮将立刻删除相关内容。




