Testing gs_ctl switchover with a cascade standby (5)

Original post by jieyancai, 2021-09-18

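The previous post set up one primary, one standby, and one cascade standby: https://www.modb.pro/db/111511
This post tests primary/standby switchover in that cascade environment:
After a switchover on the standby, the cascade standby reports Need repair, and running a build does not change that state.
After the primary and standby are switched back again, the cascade standby returns to Normal on its own. Is this expected behavior or a bug? Judging from the test results, this seems to be how it works.
Cluster configuration: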

[omm@omm02 ~]$ gs_om -t view
NodeHeader:
version:301
time:1631939567
nodeCount:3
node:1
azName:AZ1
azPriority:1
node :1
nodeName:omm02
ssh channel :
sshChannel 1:192.168.52.143
datanodeCount :1
datanode 1:
datanodeLocalDataPath :/gaussdb/data/db1
datanodeXlogPath :
datanodeListenIP 1:192.168.52.143
datanodePort :26000
datanodeLocalHAIP 1:192.168.52.143
datanodeLocalHAPort :26001
dn_replication_num: 3
datanodePeer0DataPath :/gaussdb/data/db1
datanodePeer0HAIP 1:192.168.52.144
datanodePeer0HAPort :26001
datanodePeer1DataPath :/gaussdb/data/db1
datanodePeer1HAIP 1:192.168.52.145
datanodePeer1HAPort :26001
azName:AZ1
azPriority:1
node :2
nodeName:omm03
ssh channel :
sshChannel 1:192.168.52.144
datanodeCount :1
datanode 1:
datanodeLocalDataPath :/gaussdb/data/db1
datanodeXlogPath :
datanodeListenIP 1:192.168.52.144
datanodePort :26000
datanodeLocalHAIP 1:192.168.52.144
datanodeLocalHAPort :26001
dn_replication_num: 3
datanodePeer0DataPath :/gaussdb/data/db1
datanodePeer0HAIP 1:192.168.52.143
datanodePeer0HAPort :26001
datanodePeer1DataPath :/gaussdb/data/db1
datanodePeer1HAIP 1:192.168.52.145
datanodePeer1HAPort :26001
azName:AZ1
azPriority:1
node :3
nodeName:omm04
ssh channel :
sshChannel 1:192.168.52.145
datanodeCount :1
datanode 1:
datanodeLocalDataPath :/gaussdb/data/db1
datanodeXlogPath :
datanodeListenIP 1:192.168.52.145
datanodePort :26000
datanodeLocalHAIP 1:192.168.52.145
datanodeLocalHAPort :26001
dn_replication_num: 3
datanodePeer0DataPath :/gaussdb/data/db1
datanodePeer0HAIP 1:192.168.52.143
datanodePeer0HAPort :26001
datanodePeer1DataPath :/gaussdb/data/db1
datanodePeer1HAIP 1:192.168.52.144
datanodePeer1HAPort :26001
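
The view output above is long, and the only fields that matter for the rest of this test are each node's name, listen IP, and port. A minimal sketch of pulling those out, assuming the output keeps the `key:value` layout shown here (the function name `parse_view` and the abridged sample are illustrative, not part of gs_om):

```python
# Abridged copy of the `gs_om -t view` output above (illustrative excerpt).
SAMPLE_VIEW = """\
node :1
nodeName:omm02
sshChannel 1:192.168.52.143
datanodeListenIP 1:192.168.52.143
datanodePort :26000
node :2
nodeName:omm03
sshChannel 1:192.168.52.144
datanodeListenIP 1:192.168.52.144
datanodePort :26000
node :3
nodeName:omm04
sshChannel 1:192.168.52.145
datanodeListenIP 1:192.168.52.145
datanodePort :26000
"""


def parse_view(text):
    """Collect name, listen IP and port for each node (hypothetical helper)."""
    nodes = []
    for raw in text.splitlines():
        key, sep, value = raw.partition(":")
        if not sep:
            continue  # not a key:value line
        key, value = key.strip(), value.strip()
        if key == "nodeName":
            # A nodeName line starts a new node record.
            nodes.append({"name": value})
        elif nodes and key.startswith("datanodeListenIP"):
            nodes[-1]["listen_ip"] = value
        elif nodes and key == "datanodePort":
            nodes[-1]["port"] = int(value)
    return nodes


for node in parse_view(SAMPLE_VIEW):
    print(node["name"], node["listen_ip"], node["port"])
```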
Check the status from the primary:
[omm@omm02 ~]$ gs_om -t status --detail
[   Cluster State   ]

cluster_state   : Normal
redistributing  : No
current_az      : AZ_ALL

[  Datanode State   ]

node     node_ip         instance                  state            | node     node_ip         instance                  state            | node     node_ip         instance                  state
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
1  omm02 192.168.52.143  6001 /gaussdb/data/db1 P Primary Normal | 2  omm03 192.168.52.144  6002 /gaussdb/data/db1 S Standby Normal | 3  omm04 192.168.52.145  6003 /gaussdb/data/db1 C Cascade Normal
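
The Datanode State row packs all three instances onto one line, separated by `|`. A small sketch of splitting it into records follows. It assumes the single letter (P/S/C) is the role assigned at installation and the word after it is the current role, which is how the transcripts in this post read; the `Instance` type and `parse_datanode_row` are illustrative names.

```python
from typing import NamedTuple


class Instance(NamedTuple):
    node_id: int
    node: str
    ip: str
    instance_id: int
    data_path: str
    initial_role: str  # P, S or C (assumed: role at installation)
    role: str          # current role, e.g. Primary / Standby / Cascade
    state: str         # e.g. "Normal" or "Need repair(Disconnected)"


def parse_datanode_row(row):
    """Split one Datanode State row into Instance records."""
    instances = []
    for chunk in row.split("|"):
        # maxsplit=7 keeps states that contain spaces in one field.
        fields = chunk.strip().split(None, 7)
        if len(fields) != 8:
            raise ValueError(f"unexpected instance entry: {chunk!r}")
        instances.append(
            Instance(int(fields[0]), fields[1], fields[2], int(fields[3]),
                     fields[4], fields[5], fields[6], fields[7])
        )
    return instances


ROW = ("1  omm02 192.168.52.143  6001 /gaussdb/data/db1 P Primary Normal | "
       "2  omm03 192.168.52.144  6002 /gaussdb/data/db1 S Standby Normal | "
       "3  omm04 192.168.52.145  6003 /gaussdb/data/db1 C Cascade Normal")

for inst in parse_datanode_row(ROW):
    print(inst.node, inst.role, inst.state)
```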

Run the switchover on the standby:
The cascade standby then shows Need repair.

[omm@omm03 ~]$ gs_ctl switchover
[2021-09-18 13:47:20.169][23719][][gs_ctl]: gs_ctl switchover ,datadir is /gaussdb/data/db1 
[2021-09-18 13:47:20.169][23719][][gs_ctl]: switchover term (1)
[2021-09-18 13:47:20.174][23719][][gs_ctl]: waiting for server to switchover................
[2021-09-18 13:47:33.242][23719][][gs_ctl]: done
[2021-09-18 13:47:33.242][23719][][gs_ctl]: switchover completed (/gaussdb/data/db1)
[omm@omm03 ~]$ gs_om -t status --detail
[   Cluster State   ]

cluster_state   : Degraded
redistributing  : No
current_az      : AZ_ALL

[  Datanode State   ]

node     node_ip         instance                  state            | node     node_ip         instance                  state            | node     node_ip         instance                  state
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
1  omm02 192.168.52.143  6001 /gaussdb/data/db1 P Standby Normal | 2  omm03 192.168.52.144  6002 /gaussdb/data/db1 S Primary Normal | 3  omm04 192.168.52.145  6003 /gaussdb/data/db1 C Cascade Need repair(Disconnected)
[omm@omm03 ~]$ 
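
To spot the problem instance without reading the wide status row by eye, the row can be filtered for anything whose state is not Normal. This is only a sketch against the row format shown above; the regular expression and the `unhealthy` helper are assumptions made for illustration.

```python
import re

# One instance entry: id, node, ip, instance id, data path,
# initial role letter, current role, then the state (may contain spaces).
ENTRY_RE = re.compile(
    r"(\d+)\s+(\S+)\s+(\S+)\s+(\d+)\s+(\S+)\s+([PSC])\s+(\S+)\s+(.+)"
)
# A state such as "Need repair(Disconnected)" carries a detail in parentheses.
DETAIL_RE = re.compile(r"(.+?)\((.+)\)")


def unhealthy(row):
    """Return (node, role, state, detail) for every instance that is not Normal."""
    problems = []
    for chunk in row.split("|"):
        match = ENTRY_RE.fullmatch(chunk.strip())
        if match is None:
            continue  # skip anything that does not look like an instance entry
        node, role, state = match.group(2), match.group(7), match.group(8)
        if state == "Normal":
            continue
        detail_match = DETAIL_RE.fullmatch(state)
        if detail_match:
            problems.append((node, role, detail_match.group(1), detail_match.group(2)))
        else:
            problems.append((node, role, state, ""))
    return problems


ROW_AFTER_SWITCHOVER = (
    "1  omm02 192.168.52.143  6001 /gaussdb/data/db1 P Standby Normal | "
    "2  omm03 192.168.52.144  6002 /gaussdb/data/db1 S Primary Normal | "
    "3  omm04 192.168.52.145  6003 /gaussdb/data/db1 C Cascade Need repair(Disconnected)"
)

print(unhealthy(ROW_AFTER_SWITCHOVER))
```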

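Try to repair the cascade standby with a build. gs_ctl build -M cascade_standby has no effect. Is that how it is supposed to work? Judging from the test results, it seems so.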

[omm@omm04 ~]$ gs_om -t status --detail
[   Cluster State   ]

cluster_state   : Degraded
redistributing  : No
current_az      : AZ_ALL

[  Datanode State   ]

node     node_ip         instance                  state            | node     node_ip         instance                  state            | node     node_ip         instance                  state
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
1  omm02 192.168.52.143  6001 /gaussdb/data/db1 P Standby Normal | 2  omm03 192.168.52.144  6002 /gaussdb/data/db1 S Primary Normal | 3  omm04 192.168.52.145  6003 /gaussdb/data/db1 C Cascade Need repair(Connecting)
[omm@omm04 ~]$ gs_ctl build -M cascade_standby
[2021-09-18 13:48:39.784][33084][][gs_ctl]: gs_ctl incremental build ,datadir is /gaussdb/data/db1
waiting for server to shut down........... done
server stopped
[2021-09-18 13:48:47.804][33084][][gs_ctl]:  fopen build pid file "/gaussdb/data/db1/gs_build.pid" success
[2021-09-18 13:48:47.804][33084][][gs_ctl]:  fprintf build pid file "/gaussdb/data/db1/gs_build.pid" success
[2021-09-18 13:48:47.805][33084][][gs_ctl]:  fsync build pid file "/gaussdb/data/db1/gs_build.pid" success
[2021-09-18 13:48:47.815][33084][dn_6001][gs_rewind]: set gaussdb state file when rewind:db state(BUILDING_STATE), server mode(STANDBY_MODE), build mode(INC_BUILD).
[2021-09-18 13:48:47.858][33084][dn_6001][gs_rewind]: connected to server: host=192.168.52.144 port=26001 dbname=postgres application_name=gs_rewind connect_timeout=5
[2021-09-18 13:48:47.861][33084][dn_6001][gs_rewind]: connect to primary success
[2021-09-18 13:48:47.862][33084][dn_6001][gs_rewind]: get pg_control success
[2021-09-18 13:48:47.862][33084][dn_6001][gs_rewind]: target server was interrupted in mode 2.
[2021-09-18 13:48:47.862][33084][dn_6001][gs_rewind]: sanityChecks success
[2021-09-18 13:48:47.862][33084][dn_6001][gs_rewind]: find last checkpoint at 0/215AB68 and checkpoint redo at 0/215AAE8 from source control file
[2021-09-18 13:48:47.862][33084][dn_6001][gs_rewind]: find last checkpoint at 0/215AA50 and checkpoint redo at 0/215A9D0 from target control file
[2021-09-18 13:48:47.863][33084][dn_6001][gs_rewind]: find max lsn success, find max lsn rec (0/215AA50) success.

[2021-09-18 13:48:47.874][33084][dn_6001][gs_rewind]: request lsn is 0/215AA50 and its crc(source, target):[1915899835, 1915899835]
[2021-09-18 13:48:47.874][33084][dn_6001][gs_rewind]: find common checkpoint 0/215AA50
[2021-09-18 13:48:47.874][33084][dn_6001][gs_rewind]: find diverge point success
[2021-09-18 13:48:47.874][33084][dn_6001][gs_rewind]: read checkpoint redo (0/215A9D0) success before rewinding.
[2021-09-18 13:48:47.874][33084][dn_6001][gs_rewind]: rewinding from checkpoint redo point at 0/215A9D0 on timeline 1
[2021-09-18 13:48:47.874][33084][dn_6001][gs_rewind]: diverge xlogfile is 000000010000000000000002, older ones will not be copied or removed.
[2021-09-18 13:48:47.875][33084][dn_6001][gs_rewind]: targetFileStatThread success pid 140244960663296.
[2021-09-18 13:48:47.875][33084][dn_6001][gs_rewind]: traverse_datadir start.
[2021-09-18 13:48:47.875][33084][dn_6001][gs_rewind]: reading source file list
[2021-09-18 13:48:47.879][33084][dn_6001][gs_rewind]: filemap_list_to_array start.
[2021-09-18 13:48:47.879][33084][dn_6001][gs_rewind]: filemap_list_to_array end sort start. length is 2372 
[2021-09-18 13:48:47.879][33084][dn_6001][gs_rewind]: sort end.
[2021-09-18 13:48:47.886][33084][dn_6001][gs_rewind]: targetFileStatThread return success.
[2021-09-18 13:48:47.895][33084][dn_6001][gs_rewind]: reading target file list
[2021-09-18 13:48:47.897][33084][dn_6001][gs_rewind]: traverse target datadir success
[2021-09-18 13:48:47.897][33084][dn_6001][gs_rewind]: reading WAL in target
[2021-09-18 13:48:47.897][33084][dn_6001][gs_rewind]: could not read WAL record at 0/215AAE8: invalid record length at 0/215AAE8: wanted 32, got 0
[2021-09-18 13:48:47.898][33084][dn_6001][gs_rewind]: calculate totals rewind success
[2021-09-18 13:48:47.898][33084][dn_6001][gs_rewind]: need to copy 14MB (total source directory size is 69MB)
[2021-09-18 13:48:47.898][33084][dn_6001][gs_rewind]: starting background WAL receiver
[2021-09-18 13:48:47.898][33084][dn_6001][gs_rewind]: Starting copy xlog, start point: 0/215A9D0
[2021-09-18 13:48:47.898][33084][dn_6001][gs_rewind]: in gs_rewind proecess,so no need remove.
[2021-09-18 13:48:47.911][33084][dn_6001][gs_rewind]:  check identify system success
[2021-09-18 13:48:47.912][33084][dn_6001][gs_rewind]:  send START_REPLICATION 0/2000000 success
[2021-09-18 13:48:47.931][33084][dn_6001][gs_rewind]: receiving and unpacking files...
[2021-09-18 13:48:48.027][33084][dn_6001][gs_rewind]: execute file map success
[2021-09-18 13:48:48.030][33084][dn_6001][gs_rewind]: find minRecoveryPoint success from xlog insert location 0/215F170
[2021-09-18 13:48:48.030][33084][dn_6001][gs_rewind]: update pg_control file success, minRecoveryPoint: 0/215F170, ckpLoc:0/215AB68, ckpRedo:0/215AAE8, preCkp:0/215AA50
[2021-09-18 13:48:48.032][33084][dn_6001][gs_rewind]: update pg_dw file success
[2021-09-18 13:48:48.032][33084][dn_6001][gs_rewind]: xlog end point: 0/215F170
[2021-09-18 13:48:48.032][33084][dn_6001][gs_rewind]: waiting for background process to finish streaming...
[2021-09-18 13:48:52.946][33084][dn_6001][gs_rewind]: creating backup label and updating control file
[2021-09-18 13:48:52.946][33084][dn_6001][gs_rewind]: create backup label success
[2021-09-18 13:48:52.946][33084][dn_6001][gs_rewind]: read checkpoint redo (0/215A9D0) success.
[2021-09-18 13:48:52.946][33084][dn_6001][gs_rewind]: read checkpoint rec (0/215AA50) success.
[2021-09-18 13:48:52.946][33084][dn_6001][gs_rewind]: dn incremental build completed.
[2021-09-18 13:48:52.955][33084][dn_6001][gs_rewind]: fetching MOT checkpoint
[2021-09-18 13:48:53.059][33084][dn_6001][gs_ctl]: waiting for server to start...
.0 LOG:  [Alarm Module]can not read GAUSS_WARNING_TYPE env.

0 LOG:  [Alarm Module]Host Name: omm04 

0 LOG:  [Alarm Module]Host IP: 192.168.52.145 

0 LOG:  [Alarm Module]Cluster Name: dbCluster 

0 LOG:  [Alarm Module]Invalid data in AlarmItem file! Read alarm English name failed! line: 52

0 WARNING:  failed to open feature control file, please check whether it exists: FileName=gaussdb.version, Errno=2, Errmessage=No such file or directory.
0 WARNING:  failed to parse feature control file: gaussdb.version.
0 WARNING:  Failed to load the product control file, so gaussdb cannot distinguish product version.
0 LOG:  Failed to initialze environment for codegen.
The core dump path is an invalid directory
2021-09-18 13:48:53.261 61457dc5.1 [unknown] 140112853255936 [unknown] 0 dn_6001 DB010  0 [REDO] LOG:  Recovery parallelism, cpu count = 4, max = 4, actual = 4
2021-09-18 13:48:53.261 61457dc5.1 [unknown] 140112853255936 [unknown] 0 dn_6001 DB010  0 [REDO] LOG:  ConfigRecoveryParallelism, true_max_recovery_parallelism:4, max_recovery_parallelism:4
2021-09-18 13:48:53.261 61457dc5.1 [unknown] 140112853255936 [unknown] 0 dn_6001 00000  0 [BACKEND] LOG:  [Alarm Module]can not read GAUSS_WARNING_TYPE env.

2021-09-18 13:48:53.261 61457dc5.1 [unknown] 140112853255936 [unknown] 0 dn_6001 00000  0 [BACKEND] LOG:  [Alarm Module]Host Name: omm04 

2021-09-18 13:48:53.262 61457dc5.1 [unknown] 140112853255936 [unknown] 0 dn_6001 00000  0 [BACKEND] LOG:  [Alarm Module]Host IP: 192.168.52.145 

2021-09-18 13:48:53.262 61457dc5.1 [unknown] 140112853255936 [unknown] 0 dn_6001 00000  0 [BACKEND] LOG:  [Alarm Module]Cluster Name: dbCluster 

2021-09-18 13:48:53.262 61457dc5.1 [unknown] 140112853255936 [unknown] 0 dn_6001 00000  0 [BACKEND] LOG:  [Alarm Module]Invalid data in AlarmItem file! Read alarm English name failed! line: 52

2021-09-18 13:48:53.262 61457dc5.1 [unknown] 140112853255936 [unknown] 0 dn_6001 00000  0 [BACKEND] LOG:  Transparent encryption disabled.

2021-09-18 13:48:53.264 61457dc5.1 [unknown] 140112853255936 [unknown] 0 dn_6001 00000  0 [BACKEND] LOG:  loaded library "security_plugin"
2021-09-18 13:48:53.265 61457dc5.1 [unknown] 140112853255936 [unknown] 0 dn_6001 00000  0 [BACKEND] LOG:  InitNuma numaNodeNum: 1 numa_distribute_mode: none inheritThreadPool: 0.
2021-09-18 13:48:53.265 61457dc5.1 [unknown] 140112853255936 [unknown] 0 dn_6001 01000  0 [BACKEND] WARNING:  Failed to initialize the memory protect for g_instance.attr.attr_storage.cstore_buffers (16 Mbytes) or shared memory (1496 Mbytes) is larger.
2021-09-18 13:48:53.278 61457dc5.1 [unknown] 140112853255936 [unknown] 0 dn_6001 00000  0 [CACHE] LOG:  set data cache  size(12582912)
2021-09-18 13:48:53.278 61457dc5.1 [unknown] 140112853255936 [unknown] 0 dn_6001 00000  0 [CACHE] LOG:  set metadata cache  size(4194304)
2021-09-18 13:48:53.521 61457dc5.1 [unknown] 140112853255936 [unknown] 0 dn_6001 00000  0 [BACKEND] LOG:  gaussdb: fsync file "/gaussdb/data/db1/gaussdb.state.temp" success
2021-09-18 13:48:53.521 61457dc5.1 [unknown] 140112853255936 [unknown] 0 dn_6001 00000  0 [BACKEND] LOG:  create gaussdb state file success: db state(STARTING_STATE), server mode(Cascade Standby)
2021-09-18 13:48:53.546 61457dc5.1 [unknown] 140112853255936 [unknown] 0 dn_6001 00000  0 [BACKEND] LOG:  max_safe_fds = 977, usable_fds = 1000, already_open = 13
The core dump path is an invalid directory
2021-09-18 13:48:53.548 61457dc5.1 [unknown] 140112853255936 [unknown] 0 dn_6001 00000  0 [BACKEND] LOG:  the configure file /gaussdb/app/etc/gscgroup_omm.cfg doesn't exist or the size of configure file has changed. Please create it by root user!
2021-09-18 13:48:53.548 61457dc5.1 [unknown] 140112853255936 [unknown] 0 dn_6001 00000  0 [BACKEND] LOG:  Failed to parse cgroup config file.

[2021-09-18 13:48:54.068][33084][dn_6001][gs_ctl]:  done
[2021-09-18 13:48:54.068][33084][dn_6001][gs_ctl]: server started (/gaussdb/data/db1)
[2021-09-18 13:48:54.069][33084][dn_6001][gs_ctl]:  fopen build pid file "/gaussdb/data/db1/gs_build.pid" success
[2021-09-18 13:48:54.069][33084][dn_6001][gs_ctl]:  fprintf build pid file "/gaussdb/data/db1/gs_build.pid" success
[2021-09-18 13:48:54.069][33084][dn_6001][gs_ctl]:  fsync build pid file "/gaussdb/data/db1/gs_build.pid" success
[omm@omm04 ~]$ gs_om -t status --detail
[   Cluster State   ]

cluster_state   : Degraded
redistributing  : No
current_az      : AZ_ALL

[  Datanode State   ]

node     node_ip         instance                  state            | node     node_ip         instance                  state            | node     node_ip         instance                  state
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
1  omm02 192.168.52.143  6001 /gaussdb/data/db1 P Standby Normal | 2  omm03 192.168.52.144  6002 /gaussdb/data/db1 S Primary Normal | 3  omm04 192.168.52.145  6003 /gaussdb/data/db1 C Cascade Need repair(Connecting)
[omm@omm04 ~]$ 
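
The gs_rewind log above reports the last checkpoint at 0/215AB68 in the source control file and at 0/215AA50 in the target control file. To see how far apart the two are, the positions can be converted to integers. This sketch assumes openGauss LSNs follow the PostgreSQL convention of two hexadecimal halves (high 32 bits, then low 32 bits); `lsn_to_int` is an illustrative helper, not a gs_ctl feature.

```python
def lsn_to_int(lsn):
    """Convert an 'X/Y' LSN string (hex high/low halves) to a byte position."""
    high, low = lsn.split("/")
    return (int(high, 16) << 32) | int(low, 16)


# Values copied from the gs_rewind log of the first build attempt.
source_checkpoint = "0/215AB68"
target_checkpoint = "0/215AA50"

gap = lsn_to_int(source_checkpoint) - lsn_to_int(target_checkpoint)
print(f"source is {gap} bytes of WAL ahead of target")
```

A small gap like this suggests the cascade standby was not far behind in WAL terms, so the Need repair state here looks more like a connection problem than a data problem, though the log alone does not prove that.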
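If the build is run at this point without the -M cascade_standby parameter, the node becomes Standby Normal. It is no longer a cascade standby, and the cluster is now one primary with two standbys.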
[omm@omm04 ~]$ gs_ctl build
[2021-09-18 13:50:58.253][34998][][gs_ctl]: gs_ctl incremental build ,datadir is /gaussdb/data/db1
waiting for server to shut down......... done
server stopped
[2021-09-18 13:51:04.269][34998][][gs_ctl]:  fopen build pid file "/gaussdb/data/db1/gs_build.pid" success
[2021-09-18 13:51:04.269][34998][][gs_ctl]:  fprintf build pid file "/gaussdb/data/db1/gs_build.pid" success
[2021-09-18 13:51:04.270][34998][][gs_ctl]:  fsync build pid file "/gaussdb/data/db1/gs_build.pid" success
[2021-09-18 13:51:04.281][34998][dn_6001][gs_rewind]: set gaussdb state file when rewind:db state(BUILDING_STATE), server mode(STANDBY_MODE), build mode(INC_BUILD).
[2021-09-18 13:51:04.325][34998][dn_6001][gs_rewind]: connected to server: host=192.168.52.144 port=26001 dbname=postgres application_name=gs_rewind connect_timeout=5
[2021-09-18 13:51:04.328][34998][dn_6001][gs_rewind]: connect to primary success
[2021-09-18 13:51:04.329][34998][dn_6001][gs_rewind]: get pg_control success
[2021-09-18 13:51:04.329][34998][dn_6001][gs_rewind]: target server was interrupted in mode 2.
[2021-09-18 13:51:04.329][34998][dn_6001][gs_rewind]: sanityChecks success
[2021-09-18 13:51:04.329][34998][dn_6001][gs_rewind]: find last checkpoint at 0/2160430 and checkpoint redo at 0/21603B0 from source control file
[2021-09-18 13:51:04.329][34998][dn_6001][gs_rewind]: find last checkpoint at 0/215AB68 and checkpoint redo at 0/215AAE8 from target control file
[2021-09-18 13:51:04.330][34998][dn_6001][gs_rewind]: find max lsn success, find max lsn rec (0/215F128) success.

[2021-09-18 13:51:04.340][34998][dn_6001][gs_rewind]: request lsn is 0/215AB68 and its crc(source, target):[117556344, 117556344]
[2021-09-18 13:51:04.341][34998][dn_6001][gs_rewind]: find common checkpoint 0/215AB68
[2021-09-18 13:51:04.341][34998][dn_6001][gs_rewind]: find diverge point success
[2021-09-18 13:51:04.341][34998][dn_6001][gs_rewind]: read checkpoint redo (0/215AAE8) success before rewinding.
[2021-09-18 13:51:04.341][34998][dn_6001][gs_rewind]: rewinding from checkpoint redo point at 0/215AAE8 on timeline 1
[2021-09-18 13:51:04.341][34998][dn_6001][gs_rewind]: diverge xlogfile is 000000010000000000000002, older ones will not be copied or removed.
[2021-09-18 13:51:04.341][34998][dn_6001][gs_rewind]: targetFileStatThread success pid 140259924621056.
[2021-09-18 13:51:04.342][34998][dn_6001][gs_rewind]: reading source file list
[2021-09-18 13:51:04.342][34998][dn_6001][gs_rewind]: traverse_datadir start.
[2021-09-18 13:51:04.345][34998][dn_6001][gs_rewind]: filemap_list_to_array start.
[2021-09-18 13:51:04.345][34998][dn_6001][gs_rewind]: filemap_list_to_array end sort start. length is 2372 
[2021-09-18 13:51:04.346][34998][dn_6001][gs_rewind]: sort end.
[2021-09-18 13:51:04.352][34998][dn_6001][gs_rewind]: targetFileStatThread return success.
[2021-09-18 13:51:04.361][34998][dn_6001][gs_rewind]: reading target file list
[2021-09-18 13:51:04.363][34998][dn_6001][gs_rewind]: traverse target datadir success
[2021-09-18 13:51:04.363][34998][dn_6001][gs_rewind]: reading WAL in target
[2021-09-18 13:51:04.364][34998][dn_6001][gs_rewind]: could not read WAL record at 0/215F170: invalid record length at 0/215F170: wanted 32, got 0
[2021-09-18 13:51:04.364][34998][dn_6001][gs_rewind]: calculate totals rewind success
[2021-09-18 13:51:04.364][34998][dn_6001][gs_rewind]: need to copy 14MB (total source directory size is 69MB)
[2021-09-18 13:51:04.364][34998][dn_6001][gs_rewind]: starting background WAL receiver
[2021-09-18 13:51:04.364][34998][dn_6001][gs_rewind]: Starting copy xlog, start point: 0/215AAE8
[2021-09-18 13:51:04.364][34998][dn_6001][gs_rewind]: in gs_rewind proecess,so no need remove.
[2021-09-18 13:51:04.377][34998][dn_6001][gs_rewind]:  check identify system success
[2021-09-18 13:51:04.377][34998][dn_6001][gs_rewind]:  send START_REPLICATION 0/2000000 success
[2021-09-18 13:51:04.390][34998][dn_6001][gs_rewind]: receiving and unpacking files...
[2021-09-18 13:51:04.485][34998][dn_6001][gs_rewind]: execute file map success
[2021-09-18 13:51:04.486][34998][dn_6001][gs_rewind]: find minRecoveryPoint success from xlog insert location 0/21648C0
[2021-09-18 13:51:04.487][34998][dn_6001][gs_rewind]: update pg_control file success, minRecoveryPoint: 0/21648C0, ckpLoc:0/2160430, ckpRedo:0/21603B0, preCkp:0/2160318
[2021-09-18 13:51:04.488][34998][dn_6001][gs_rewind]: update pg_dw file success
[2021-09-18 13:51:04.489][34998][dn_6001][gs_rewind]: xlog end point: 0/21648C0
[2021-09-18 13:51:04.489][34998][dn_6001][gs_rewind]: waiting for background process to finish streaming...
[2021-09-18 13:51:09.399][34998][dn_6001][gs_rewind]: creating backup label and updating control file
[2021-09-18 13:51:09.399][34998][dn_6001][gs_rewind]: create backup label success
[2021-09-18 13:51:09.399][34998][dn_6001][gs_rewind]: read checkpoint redo (0/215AAE8) success.
[2021-09-18 13:51:09.399][34998][dn_6001][gs_rewind]: read checkpoint rec (0/215AB68) success.
[2021-09-18 13:51:09.399][34998][dn_6001][gs_rewind]: dn incremental build completed.
[2021-09-18 13:51:09.408][34998][dn_6001][gs_rewind]: fetching MOT checkpoint
[2021-09-18 13:51:09.507][34998][dn_6001][gs_ctl]: waiting for server to start...
.0 LOG:  [Alarm Module]can not read GAUSS_WARNING_TYPE env.

0 LOG:  [Alarm Module]Host Name: omm04 

0 LOG:  [Alarm Module]Host IP: 192.168.52.145 

0 LOG:  [Alarm Module]Cluster Name: dbCluster 

0 LOG:  [Alarm Module]Invalid data in AlarmItem file! Read alarm English name failed! line: 52

0 WARNING:  failed to open feature control file, please check whether it exists: FileName=gaussdb.version, Errno=2, Errmessage=No such file or directory.
0 WARNING:  failed to parse feature control file: gaussdb.version.
0 WARNING:  Failed to load the product control file, so gaussdb cannot distinguish product version.
0 LOG:  Failed to initialze environment for codegen.
The core dump path is an invalid directory
2021-09-18 13:51:09.718 61457e4d.1 [unknown] 140200662243072 [unknown] 0 dn_6001 DB010  0 [REDO] LOG:  Recovery parallelism, cpu count = 4, max = 4, actual = 4
2021-09-18 13:51:09.718 61457e4d.1 [unknown] 140200662243072 [unknown] 0 dn_6001 DB010  0 [REDO] LOG:  ConfigRecoveryParallelism, true_max_recovery_parallelism:4, max_recovery_parallelism:4
2021-09-18 13:51:09.718 61457e4d.1 [unknown] 140200662243072 [unknown] 0 dn_6001 00000  0 [BACKEND] LOG:  [Alarm Module]can not read GAUSS_WARNING_TYPE env.

2021-09-18 13:51:09.718 61457e4d.1 [unknown] 140200662243072 [unknown] 0 dn_6001 00000  0 [BACKEND] LOG:  [Alarm Module]Host Name: omm04 

2021-09-18 13:51:09.718 61457e4d.1 [unknown] 140200662243072 [unknown] 0 dn_6001 00000  0 [BACKEND] LOG:  [Alarm Module]Host IP: 192.168.52.145 

2021-09-18 13:51:09.718 61457e4d.1 [unknown] 140200662243072 [unknown] 0 dn_6001 00000  0 [BACKEND] LOG:  [Alarm Module]Cluster Name: dbCluster 

2021-09-18 13:51:09.719 61457e4d.1 [unknown] 140200662243072 [unknown] 0 dn_6001 00000  0 [BACKEND] LOG:  [Alarm Module]Invalid data in AlarmItem file! Read alarm English name failed! line: 52

2021-09-18 13:51:09.719 61457e4d.1 [unknown] 140200662243072 [unknown] 0 dn_6001 00000  0 [BACKEND] LOG:  Transparent encryption disabled.

2021-09-18 13:51:09.721 61457e4d.1 [unknown] 140200662243072 [unknown] 0 dn_6001 00000  0 [BACKEND] LOG:  loaded library "security_plugin"
2021-09-18 13:51:09.723 61457e4d.1 [unknown] 140200662243072 [unknown] 0 dn_6001 00000  0 [BACKEND] LOG:  InitNuma numaNodeNum: 1 numa_distribute_mode: none inheritThreadPool: 0.
2021-09-18 13:51:09.723 61457e4d.1 [unknown] 140200662243072 [unknown] 0 dn_6001 01000  0 [BACKEND] WARNING:  Failed to initialize the memory protect for g_instance.attr.attr_storage.cstore_buffers (16 Mbytes) or shared memory (1496 Mbytes) is larger.
2021-09-18 13:51:09.735 61457e4d.1 [unknown] 140200662243072 [unknown] 0 dn_6001 00000  0 [CACHE] LOG:  set data cache  size(12582912)
2021-09-18 13:51:09.736 61457e4d.1 [unknown] 140200662243072 [unknown] 0 dn_6001 00000  0 [CACHE] LOG:  set metadata cache  size(4194304)
2021-09-18 13:51:09.968 61457e4d.1 [unknown] 140200662243072 [unknown] 0 dn_6001 00000  0 [BACKEND] LOG:  gaussdb: fsync file "/gaussdb/data/db1/gaussdb.state.temp" success
2021-09-18 13:51:09.968 61457e4d.1 [unknown] 140200662243072 [unknown] 0 dn_6001 00000  0 [BACKEND] LOG:  create gaussdb state file success: db state(STARTING_STATE), server mode(Standby)
2021-09-18 13:51:09.991 61457e4d.1 [unknown] 140200662243072 [unknown] 0 dn_6001 00000  0 [BACKEND] LOG:  max_safe_fds = 977, usable_fds = 1000, already_open = 13
The core dump path is an invalid directory
2021-09-18 13:51:09.994 61457e4d.1 [unknown] 140200662243072 [unknown] 0 dn_6001 00000  0 [BACKEND] LOG:  the configure file /gaussdb/app/etc/gscgroup_omm.cfg doesn't exist or the size of configure file has changed. Please create it by root user!
2021-09-18 13:51:09.994 61457e4d.1 [unknown] 140200662243072 [unknown] 0 dn_6001 00000  0 [BACKEND] LOG:  Failed to parse cgroup config file.

[2021-09-18 13:51:10.516][34998][dn_6001][gs_ctl]:  done
[2021-09-18 13:51:10.516][34998][dn_6001][gs_ctl]: server started (/gaussdb/data/db1)
[2021-09-18 13:51:10.516][34998][dn_6001][gs_ctl]:  fopen build pid file "/gaussdb/data/db1/gs_build.pid" success
[2021-09-18 13:51:10.516][34998][dn_6001][gs_ctl]:  fprintf build pid file "/gaussdb/data/db1/gs_build.pid" success
[2021-09-18 13:51:10.516][34998][dn_6001][gs_ctl]:  fsync build pid file "/gaussdb/data/db1/gs_build.pid" success
[omm@omm04 ~]$ gs_om -t status --detail
[   Cluster State   ]

cluster_state   : Normal
redistributing  : No
current_az      : AZ_ALL

[  Datanode State   ]

node     node_ip         instance                  state            | node     node_ip         instance                  state            | node     node_ip         instance                  state
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
1  omm02 192.168.52.143  6001 /gaussdb/data/db1 P Standby Normal | 2  omm03 192.168.52.144  6002 /gaussdb/data/db1 S Primary Normal | 3  omm04 192.168.52.145  6003 /gaussdb/data/db1 C Standby Normal
[omm@omm04 ~]$ 
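
In the status above, omm04 still carries the letter C but its current role reads Standby. A sketch of flagging that mismatch automatically follows, again assuming the letter is the role assigned at installation; `lost_cascade` is a hypothetical helper name.

```python
def lost_cascade(row):
    """Return nodes installed as cascade standby (C) that now run in another role."""
    drifted = []
    for chunk in row.split("|"):
        fields = chunk.strip().split(None, 7)
        if len(fields) != 8:
            continue  # not an instance entry
        node, initial_role, role = fields[1], fields[5], fields[6]
        if initial_role == "C" and role != "Cascade":
            drifted.append(node)
    return drifted


ROW_AFTER_PLAIN_BUILD = (
    "1  omm02 192.168.52.143  6001 /gaussdb/data/db1 P Standby Normal | "
    "2  omm03 192.168.52.144  6002 /gaussdb/data/db1 S Primary Normal | "
    "3  omm04 192.168.52.145  6003 /gaussdb/data/db1 C Standby Normal"
)

print(lost_cascade(ROW_AFTER_PLAIN_BUILD))
```

A cluster_state of Normal alone would hide this change, because a plain standby is healthy too; checking the role column is what reveals that the topology is no longer cascaded.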

Restore the cascade state:
[omm@omm04 ~]$ gs_ctl build -M cascade_standby
[2021-09-18 13:52:19.500][35405][][gs_ctl]: gs_ctl incremental build ,datadir is /gaussdb/data/db1
waiting for server to shut down.... done
server stopped
[2021-09-18 13:52:20.510][35405][][gs_ctl]:  fopen build pid file "/gaussdb/data/db1/gs_build.pid" success
[2021-09-18 13:52:20.510][35405][][gs_ctl]:  fprintf build pid file "/gaussdb/data/db1/gs_build.pid" success
[2021-09-18 13:52:20.511][35405][][gs_ctl]:  fsync build pid file "/gaussdb/data/db1/gs_build.pid" success
[2021-09-18 13:52:20.523][35405][dn_6001][gs_rewind]: set gaussdb state file when rewind:db state(BUILDING_STATE), server mode(STANDBY_MODE), build mode(INC_BUILD).
[2021-09-18 13:52:20.565][35405][dn_6001][gs_rewind]: connected to server: host=192.168.52.144 port=26001 dbname=postgres application_name=gs_rewind connect_timeout=5
[2021-09-18 13:52:20.569][35405][dn_6001][gs_rewind]: connect to primary success
[2021-09-18 13:52:20.570][35405][dn_6001][gs_rewind]: get pg_control success
[2021-09-18 13:52:20.570][35405][dn_6001][gs_rewind]: target server was interrupted in mode 2.
[2021-09-18 13:52:20.570][35405][dn_6001][gs_rewind]: sanityChecks success
[2021-09-18 13:52:20.570][35405][dn_6001][gs_rewind]: find last checkpoint at 0/2165A50 and checkpoint redo at 0/21659D0 from source control file
[2021-09-18 13:52:20.570][35405][dn_6001][gs_rewind]: find last checkpoint at 0/2165A50 and checkpoint redo at 0/21659D0 from target control file
[2021-09-18 13:52:20.571][35405][dn_6001][gs_rewind]: find max lsn success, find max lsn rec (0/2165A50) success.

[2021-09-18 13:52:20.581][35405][dn_6001][gs_rewind]: request lsn is 0/2165A50 and its crc(source, target):[3990853794, 3990853794]
[2021-09-18 13:52:20.581][35405][dn_6001][gs_rewind]: find common checkpoint 0/2165A50
[2021-09-18 13:52:20.581][35405][dn_6001][gs_rewind]: find diverge point success
[2021-09-18 13:52:20.581][35405][dn_6001][gs_rewind]: read checkpoint redo (0/21659D0) success before rewinding.
[2021-09-18 13:52:20.581][35405][dn_6001][gs_rewind]: rewinding from checkpoint redo point at 0/21659D0 on timeline 1
[2021-09-18 13:52:20.581][35405][dn_6001][gs_rewind]: diverge xlogfile is 000000010000000000000002, older ones will not be copied or removed.
[2021-09-18 13:52:20.582][35405][dn_6001][gs_rewind]: targetFileStatThread success pid 140329515140864.
[2021-09-18 13:52:20.582][35405][dn_6001][gs_rewind]: reading source file list
[2021-09-18 13:52:20.582][35405][dn_6001][gs_rewind]: traverse_datadir start.
[2021-09-18 13:52:20.586][35405][dn_6001][gs_rewind]: filemap_list_to_array start.
[2021-09-18 13:52:20.586][35405][dn_6001][gs_rewind]: filemap_list_to_array end sort start. length is 2372 
[2021-09-18 13:52:20.586][35405][dn_6001][gs_rewind]: sort end.
[2021-09-18 13:52:20.592][35405][dn_6001][gs_rewind]: targetFileStatThread return success.
[2021-09-18 13:52:20.600][35405][dn_6001][gs_rewind]: reading target file list
[2021-09-18 13:52:20.604][35405][dn_6001][gs_rewind]: traverse target datadir success
[2021-09-18 13:52:20.604][35405][dn_6001][gs_rewind]: reading WAL in target
[2021-09-18 13:52:20.604][35405][dn_6001][gs_rewind]: could not read WAL record at 0/2165AE8: invalid record length at 0/2165AE8: wanted 32, got 0
[2021-09-18 13:52:20.605][35405][dn_6001][gs_rewind]: calculate totals rewind success
[2021-09-18 13:52:20.605][35405][dn_6001][gs_rewind]: need to copy 14MB (total source directory size is 69MB)
[2021-09-18 13:52:20.605][35405][dn_6001][gs_rewind]: starting background WAL receiver
[2021-09-18 13:52:20.605][35405][dn_6001][gs_rewind]: Starting copy xlog, start point: 0/21659D0
[2021-09-18 13:52:20.605][35405][dn_6001][gs_rewind]: in gs_rewind proecess,so no need remove.
[2021-09-18 13:52:20.620][35405][dn_6001][gs_rewind]:  check identify system success
[2021-09-18 13:52:20.620][35405][dn_6001][gs_rewind]:  send START_REPLICATION 0/2000000 success
[2021-09-18 13:52:20.632][35405][dn_6001][gs_rewind]: receiving and unpacking files...
[2021-09-18 13:52:20.734][35405][dn_6001][gs_rewind]: execute file map success
[2021-09-18 13:52:20.736][35405][dn_6001][gs_rewind]: find minRecoveryPoint success from xlog insert location 0/216AED0
[2021-09-18 13:52:20.736][35405][dn_6001][gs_rewind]: update pg_control file success, minRecoveryPoint: 0/216AED0, ckpLoc:0/2165A50, ckpRedo:0/21659D0, preCkp:0/2160430
[2021-09-18 13:52:20.738][35405][dn_6001][gs_rewind]: update pg_dw file success
[2021-09-18 13:52:20.738][35405][dn_6001][gs_rewind]: xlog end point: 0/216AED0
[2021-09-18 13:52:20.738][35405][dn_6001][gs_rewind]: waiting for background process to finish streaming...
[2021-09-18 13:52:25.649][35405][dn_6001][gs_rewind]: creating backup label and updating control file
[2021-09-18 13:52:25.649][35405][dn_6001][gs_rewind]: create backup label success
[2021-09-18 13:52:25.649][35405][dn_6001][gs_rewind]: read checkpoint redo (0/21659D0) success.
[2021-09-18 13:52:25.649][35405][dn_6001][gs_rewind]: read checkpoint rec (0/2165A50) success.
[2021-09-18 13:52:25.649][35405][dn_6001][gs_rewind]: dn incremental build completed.
[2021-09-18 13:52:25.659][35405][dn_6001][gs_rewind]: fetching MOT checkpoint
[2021-09-18 13:52:25.757][35405][dn_6001][gs_ctl]: waiting for server to start...
.0 LOG:  [Alarm Module]can not read GAUSS_WARNING_TYPE env.

0 LOG:  [Alarm Module]Host Name: omm04 

0 LOG:  [Alarm Module]Host IP: 192.168.52.145 

0 LOG:  [Alarm Module]Cluster Name: dbCluster 

0 LOG:  [Alarm Module]Invalid data in AlarmItem file! Read alarm English name failed! line: 52

0 WARNING:  failed to open feature control file, please check whether it exists: FileName=gaussdb.version, Errno=2, Errmessage=No such file or directory.
0 WARNING:  failed to parse feature control file: gaussdb.version.
0 WARNING:  Failed to load the product control file, so gaussdb cannot distinguish product version.
0 LOG:  Failed to initialze environment for codegen.
The core dump path is an invalid directory
2021-09-18 13:52:25.956 61457e99.1 [unknown] 140366908442368 [unknown] 0 dn_6001 DB010  0 [REDO] LOG:  Recovery parallelism, cpu count = 4, max = 4, actual = 4
2021-09-18 13:52:25.956 61457e99.1 [unknown] 140366908442368 [unknown] 0 dn_6001 DB010  0 [REDO] LOG:  ConfigRecoveryParallelism, true_max_recovery_parallelism:4, max_recovery_parallelism:4
2021-09-18 13:52:25.956 61457e99.1 [unknown] 140366908442368 [unknown] 0 dn_6001 00000  0 [BACKEND] LOG:  [Alarm Module]can not read GAUSS_WARNING_TYPE env.

2021-09-18 13:52:25.956 61457e99.1 [unknown] 140366908442368 [unknown] 0 dn_6001 00000  0 [BACKEND] LOG:  [Alarm Module]Host Name: omm04 

2021-09-18 13:52:25.956 61457e99.1 [unknown] 140366908442368 [unknown] 0 dn_6001 00000  0 [BACKEND] LOG:  [Alarm Module]Host IP: 192.168.52.145 

2021-09-18 13:52:25.956 61457e99.1 [unknown] 140366908442368 [unknown] 0 dn_6001 00000  0 [BACKEND] LOG:  [Alarm Module]Cluster Name: dbCluster 

2021-09-18 13:52:25.957 61457e99.1 [unknown] 140366908442368 [unknown] 0 dn_6001 00000  0 [BACKEND] LOG:  [Alarm Module]Invalid data in AlarmItem file! Read alarm English name failed! line: 52

2021-09-18 13:52:25.957 61457e99.1 [unknown] 140366908442368 [unknown] 0 dn_6001 00000  0 [BACKEND] LOG:  Transparent encryption disabled.

2021-09-18 13:52:25.959 61457e99.1 [unknown] 140366908442368 [unknown] 0 dn_6001 00000  0 [BACKEND] LOG:  loaded library "security_plugin"
2021-09-18 13:52:25.961 61457e99.1 [unknown] 140366908442368 [unknown] 0 dn_6001 00000  0 [BACKEND] LOG:  InitNuma numaNodeNum: 1 numa_distribute_mode: none inheritThreadPool: 0.
2021-09-18 13:52:25.961 61457e99.1 [unknown] 140366908442368 [unknown] 0 dn_6001 01000  0 [BACKEND] WARNING:  Failed to initialize the memory protect for g_instance.attr.attr_storage.cstore_buffers (16 Mbytes) or shared memory (1496 Mbytes) is larger.
2021-09-18 13:52:25.973 61457e99.1 [unknown] 140366908442368 [unknown] 0 dn_6001 00000  0 [CACHE] LOG:  set data cache  size(12582912)
2021-09-18 13:52:25.973 61457e99.1 [unknown] 140366908442368 [unknown] 0 dn_6001 00000  0 [CACHE] LOG:  set metadata cache  size(4194304)
2021-09-18 13:52:26.203 61457e99.1 [unknown] 140366908442368 [unknown] 0 dn_6001 00000  0 [BACKEND] LOG:  gaussdb: fsync file "/gaussdb/data/db1/gaussdb.state.temp" success
2021-09-18 13:52:26.203 61457e99.1 [unknown] 140366908442368 [unknown] 0 dn_6001 00000  0 [BACKEND] LOG:  create gaussdb state file success: db state(STARTING_STATE), server mode(Cascade Standby)
2021-09-18 13:52:26.229 61457e99.1 [unknown] 140366908442368 [unknown] 0 dn_6001 00000  0 [BACKEND] LOG:  max_safe_fds = 977, usable_fds = 1000, already_open = 13
The core dump path is an invalid directory
2021-09-18 13:52:26.231 61457e99.1 [unknown] 140366908442368 [unknown] 0 dn_6001 00000  0 [BACKEND] LOG:  the configure file /gaussdb/app/etc/gscgroup_omm.cfg doesn't exist or the size of configure file has changed. Please create it by root user!
2021-09-18 13:52:26.231 61457e99.1 [unknown] 140366908442368 [unknown] 0 dn_6001 00000  0 [BACKEND] LOG:  Failed to parse cgroup config file.

[2021-09-18 13:52:26.766][35405][dn_6001][gs_ctl]:  done
[2021-09-18 13:52:26.766][35405][dn_6001][gs_ctl]: server started (/gaussdb/data/db1)
[2021-09-18 13:52:26.766][35405][dn_6001][gs_ctl]:  fopen build pid file "/gaussdb/data/db1/gs_build.pid" success
[2021-09-18 13:52:26.766][35405][dn_6001][gs_ctl]:  fprintf build pid file "/gaussdb/data/db1/gs_build.pid" success
[2021-09-18 13:52:26.766][35405][dn_6001][gs_ctl]:  fsync build pid file "/gaussdb/data/db1/gs_build.pid" success
[omm@omm04 ~]$ gs_om -t status --detail
[   Cluster State   ]

cluster_state   : Degraded
redistributing  : No
current_az      : AZ_ALL

[  Datanode State   ]

node     node_ip         instance                  state            | node     node_ip         instance                  state            | node     node_ip         instance                  state
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
1  omm02 192.168.52.143  6001 /gaussdb/data/db1 P Standby Normal | 2  omm03 192.168.52.144  6002 /gaussdb/data/db1 S Primary Normal | 3  omm04 192.168.52.145  6003 /gaussdb/data/db1 C Cascade Need repair(Connecting)
[omm@omm04 ~]$ 
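
When repeating this kind of test, it can be handy to filter the status output down to the instances that are not Normal. The following is only a sketch and was not part of the original test: `list_unhealthy` is a hypothetical helper name, and it assumes each datanode row keeps the layout shown above (id, node name, IP, instance id, data path, initial role letter, current role, state), with instances separated by `|`.

```shell
#!/usr/bin/env bash
# list_unhealthy (hypothetical helper): read `gs_om -t status --detail` output
# on stdin and print "<node> <current role> <state>" for every datanode whose
# state is anything other than "Normal".
list_unhealthy() {
  awk -F'|' '
    # Datanode rows start with: id, node name, IP address, instance id.
    /^[0-9]+ +[^ ]+ +[0-9.]+ +[0-9]+ / {
      for (i = 1; i <= NF; i++) {
        # Split one instance entry on runs of whitespace.
        n = split($i, f, " ")
        if (n < 8) continue
        # The state may span several words, e.g. "Need repair(Connecting)".
        state = f[8]
        for (j = 9; j <= n; j++) state = state " " f[j]
        if (state != "Normal") print f[2], f[7], state
      }
    }'
}
```

Assumed usage: `gs_om -t status --detail | list_unhealthy`. Empty output means every instance reported Normal.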

Switching the original primary (omm02) back to primary brings the cascade standby back to Normal:

[omm@omm02 ~]$ gs_om -t status --detail
[   Cluster State   ]

cluster_state   : Degraded
redistributing  : No
current_az      : AZ_ALL

[  Datanode State   ]

node     node_ip         instance                  state            | node     node_ip         instance                  state            | node     node_ip         instance                  state
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
1  omm02 192.168.52.143  6001 /gaussdb/data/db1 P Standby Normal | 2  omm03 192.168.52.144  6002 /gaussdb/data/db1 S Primary Normal | 3  omm04 192.168.52.145  6003 /gaussdb/data/db1 C Cascade Need repair(Connecting)
[omm@omm02 ~]$ gs_ctl switchover
[2021-09-18 13:53:06.766][44866][][gs_ctl]: gs_ctl switchover ,datadir is /gaussdb/data/db1 
[2021-09-18 13:53:06.766][44866][][gs_ctl]: switchover term (1)
[2021-09-18 13:53:06.770][44866][][gs_ctl]: waiting for server to switchover..........
[2021-09-18 13:53:13.814][44866][][gs_ctl]: done
[2021-09-18 13:53:13.814][44866][][gs_ctl]: switchover completed (/gaussdb/data/db1)
[omm@omm02 ~]$ gs_om -t status --detail
[   Cluster State   ]

cluster_state   : Normal
redistributing  : No
current_az      : AZ_ALL

[  Datanode State   ]

node     node_ip         instance                  state            | node     node_ip         instance                  state            | node     node_ip         instance                  state
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
1  omm02 192.168.52.143  6001 /gaussdb/data/db1 P Primary Normal | 2  omm03 192.168.52.144  6002 /gaussdb/data/db1 S Standby Normal | 3  omm04 192.168.52.145  6003 /gaussdb/data/db1 C Cascade Normal
[omm@omm02 ~]$ 

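Next, run switchover on the cascade standby (omm04). It becomes the standby, the former standby (omm03) becomes the cascade standby, and every instance reports Normal: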

[omm@omm04 ~]$ gs_ctl switchover
[2021-09-18 13:56:04.430][36619][][gs_ctl]: gs_ctl switchover ,datadir is /gaussdb/data/db1 
[2021-09-18 13:56:04.430][36619][][gs_ctl]: switchover term (1)
[2021-09-18 13:56:04.435][36619][][gs_ctl]: waiting for server to switchover.......
[2021-09-18 13:56:08.461][36619][][gs_ctl]: done
[2021-09-18 13:56:08.461][36619][][gs_ctl]: switchover completed (/gaussdb/data/db1)
[omm@omm04 ~]$ gs_om -t status --detail
[   Cluster State   ]

cluster_state   : Normal
redistributing  : No
current_az      : AZ_ALL

[  Datanode State   ]

node     node_ip         instance                  state            | node     node_ip         instance                  state            | node     node_ip         instance                  state
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
1  omm02 192.168.52.143  6001 /gaussdb/data/db1 P Primary Normal | 2  omm03 192.168.52.144  6002 /gaussdb/data/db1 S Cascade Normal | 3  omm04 192.168.52.145  6003 /gaussdb/data/db1 C Standby Normal
[omm@omm04 ~]$ 

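Running switchover on omm04 once more promotes it to primary and demotes the former primary (omm02) to standby. The original standby (omm03), now the cascade standby, reports Need repair: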

[omm@omm04 ~]$ gs_ctl switchover
[2021-09-18 13:57:23.605][36969][][gs_ctl]: gs_ctl switchover ,datadir is /gaussdb/data/db1 
[2021-09-18 13:57:23.605][36969][][gs_ctl]: switchover term (1)
[2021-09-18 13:57:23.610][36969][][gs_ctl]: waiting for server to switchover..............
[2021-09-18 13:57:34.672][36969][][gs_ctl]: done
[2021-09-18 13:57:34.672][36969][][gs_ctl]: switchover completed (/gaussdb/data/db1)
[omm@omm04 ~]$ gs_om -t status --detail
[   Cluster State   ]

cluster_state   : Degraded
redistributing  : No
current_az      : AZ_ALL

[  Datanode State   ]

node     node_ip         instance                  state            | node     node_ip         instance                  state            | node     node_ip         instance                  state
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
1  omm02 192.168.52.143  6001 /gaussdb/data/db1 P Standby Normal | 2  omm03 192.168.52.144  6002 /gaussdb/data/db1 S Cascade Need repair(Connecting) | 3  omm04 192.168.52.145  6003 /gaussdb/data/db1 C Primary Normal
[omm@omm04 ~]$ 

Running switchover on the original primary (omm02), and then on the original standby (omm03), restores the initial healthy state.

On the original primary (omm02):
[omm@omm02 ~]$ gs_om -t status --detail
[   Cluster State   ]

cluster_state   : Degraded
redistributing  : No
current_az      : AZ_ALL

[  Datanode State   ]

node     node_ip         instance                  state            | node     node_ip         instance                  state            | node     node_ip         instance                  state
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
1  omm02 192.168.52.143  6001 /gaussdb/data/db1 P Standby Normal | 2  omm03 192.168.52.144  6002 /gaussdb/data/db1 S Cascade Need repair(Connecting) | 3  omm04 192.168.52.145  6003 /gaussdb/data/db1 C Primary Normal
[omm@omm02 ~]$ gs_ctl switchover
[2021-09-18 13:59:41.776][46084][][gs_ctl]: gs_ctl switchover ,datadir is /gaussdb/data/db1 
[2021-09-18 13:59:41.776][46084][][gs_ctl]: switchover term (1)
[2021-09-18 13:59:41.781][46084][][gs_ctl]: waiting for server to switchover.................
[2021-09-18 13:59:55.902][46084][][gs_ctl]: done
[2021-09-18 13:59:55.902][46084][][gs_ctl]: switchover completed (/gaussdb/data/db1)
[omm@omm02 ~]$ gs_om -t status --detail
[   Cluster State   ]

cluster_state   : Normal
redistributing  : No
current_az      : AZ_ALL

[  Datanode State   ]

node     node_ip         instance                  state            | node     node_ip         instance                  state            | node     node_ip         instance                  state
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
1  omm02 192.168.52.143  6001 /gaussdb/data/db1 P Primary Normal | 2  omm03 192.168.52.144  6002 /gaussdb/data/db1 S Cascade Normal | 3  omm04 192.168.52.145  6003 /gaussdb/data/db1 C Standby Normal
[omm@omm02 ~]$ 
On the original standby (omm03):
[omm@omm03 ~]$ gs_ctl switchover
[2021-09-18 14:01:23.144][26890][][gs_ctl]: gs_ctl switchover ,datadir is /gaussdb/data/db1 
[2021-09-18 14:01:23.144][26890][][gs_ctl]: switchover term (1)
[2021-09-18 14:01:23.149][26890][][gs_ctl]: waiting for server to switchover.......
[2021-09-18 14:01:27.176][26890][][gs_ctl]: done
[2021-09-18 14:01:27.176][26890][][gs_ctl]: switchover completed (/gaussdb/data/db1)
[omm@omm03 ~]$ gs_om -t status --detail
[   Cluster State   ]

cluster_state   : Normal
redistributing  : No
current_az      : AZ_ALL

[  Datanode State   ]

node     node_ip         instance                  state            | node     node_ip         instance                  state            | node     node_ip         instance                  state
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
1  omm02 192.168.52.143  6001 /gaussdb/data/db1 P Primary Normal | 2  omm03 192.168.52.144  6002 /gaussdb/data/db1 S Standby Normal | 3  omm04 192.168.52.145  6003 /gaussdb/data/db1 C Cascade Normal
[omm@omm03 ~]$ 

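Failover behaves similarly, so its output is not listed here.
After a failover, the primary/standby relationship can be rebuilt with gs_ctl build -M standby, and the cascade standby relationship with gs_ctl build -M cascade_standby.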
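
As a rough illustration of the two rebuild commands above, the sketch below only prints the command for a given target role instead of running it. `rebuild_cmd` is a hypothetical helper name. Passing the data directory explicitly with `-D` is an addition not shown in the test, where `gs_ctl` picked up `/gaussdb/data/db1` from the environment.

```shell
#!/usr/bin/env bash
# rebuild_cmd (hypothetical helper): print, but do not execute, the gs_ctl
# build command for the role a node should take after a failover.
#   $1  target role: standby or cascade_standby
#   $2  data directory (defaults to the path used in this test environment)
rebuild_cmd() {
  local role="$1" datadir="${2:-/gaussdb/data/db1}"
  case "$role" in
    standby|cascade_standby)
      printf 'gs_ctl build -D %s -M %s\n' "$datadir" "$role" ;;
    *)
      # Any other role is rejected rather than guessed at.
      printf 'unknown role: %s\n' "$role" >&2
      return 1 ;;
  esac
}
```

For example, `rebuild_cmd cascade_standby` prints the command to run, as the omm user, on the node that should become the cascade standby. Printing first makes it easy to review the command before executing it on a live cluster.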
References:
https://zhuanlan.zhihu.com/p/367867475

Last modified: 2021-09-18 14:21:13