Original author: 王劲松
1. Scope
Before a manual switchover, make sure mogha has been stopped: systemctl stop mogha
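A minimal pre-check sketch, assuming mogha runs as a systemd unit named mogha (as the systemctl stop mogha command above implies). The function name mogha_is_stopped is hypothetical and not part of MogDB.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of MogDB): succeed only when the mogha
# systemd unit is not active. `systemctl is-active --quiet UNIT` exits 0
# only while the unit is active.
mogha_is_stopped() {
    if systemctl is-active --quiet mogha; then
        echo "mogha is still active; run: systemctl stop mogha" >&2
        return 1
    fi
    echo "mogha is not active"
}
```

Calling `mogha_is_stopped || exit 1` at the top of a switchover script would abort before any role change is attempted.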
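2. Manual switchover
2.1. Switchover
When both the primary and the standby are healthy, the standby can be promoted to primary for maintenance; a switchover guarantees that no data is lost in the process.
2.1.1. Check the primary/standby status of the cluster instances
Node 1 is the primary and node 2 is the standby.
Command: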
gs_om -t status --detail
Output:
[sysomm@xtv-mog-test-71-1 ~]$ gs_om -t status --detail
[ Cluster State ]
cluster_state : Normal
redistributing : No
current_az : AZ_ALL
[ Datanode State ]
node node_ip port instance state
---------------------------------------------------------------------------------------------------
1 xtv-mog-test-71-1 180.2.71.1 26000 6001 /mogdata/cluster_mogdb P Primary Normal
2 xtv-mog-test-71-2 180.2.71.2 26000 6002 /mogdata/cluster_mogdb S Standby Normal
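The role check above can be scripted. The sketch below assumes the datanode row layout shown in the output (node, hostname, IP, port, instance, data directory, P/S flag, role, state); the function name node_roles is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read `gs_om -t status --detail` output on stdin and
# print "hostname role" for every datanode row.
# Assumed row layout: node hostname ip port instance datadir P|S role state
node_roles() {
    awk '$1 ~ /^[0-9]+$/ { print $2, $8 }'
}

# Example with the two datanode rows from the output above.
node_roles <<'EOF'
1 xtv-mog-test-71-1 180.2.71.1 26000 6001 /mogdata/cluster_mogdb P Primary Normal
2 xtv-mog-test-71-2 180.2.71.2 26000 6002 /mogdata/cluster_mogdb S Standby Normal
EOF
# prints:
# xtv-mog-test-71-1 Primary
# xtv-mog-test-71-2 Standby
```

On a live cluster the same function would be fed with `gs_om -t status --detail | node_roles`.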
2.1.2. Run the switchover on the standby node
Command:
gs_ctl switchover -D /mogdata/cluster_mogdb
Output:
[sysomm@xtv-mog-test-71-2 ~]$ gs_ctl switchover -D /mogdata/cluster_mogdb
[2023-08-24 18:21:26.379][1739316][][gs_ctl]: gs_ctl switchover ,datadir is /mogdata/cluster_mogdb
[2023-08-24 18:21:26.379][1739316][][gs_ctl]: switchover term (1)
[2023-08-24 18:21:26.387][1739316][][gs_ctl]: waiting for server to switchover...
[2023-08-24 18:21:31.435][1739316][][gs_ctl]: done
[2023-08-24 18:21:31.435][1739316][][gs_ctl]: switchover completed (/mogdata/cluster_mogdb)
2.1.3. Confirm the primary/standby status of the cluster
Node 1 is now the standby and node 2 is the primary.
Command:
gs_om -t status --detail
Output:
[sysomm@xtv-mog-test-71-2 ~]$ gs_om -t status --detail
[ Cluster State ]
cluster_state : Degraded
redistributing : No
current_az : AZ_ALL
[ Datanode State ]
node node_ip port instance state
---------------------------------------------------------------------------------------------------
1 xtv-mog-test-71-1 180.2.71.1 26000 6001 /mogdata/cluster_mogdb P Standby Normal
2 xtv-mog-test-71-2 180.2.71.2 26000 6002 /mogdata/cluster_mogdb S Primary Normal
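2.1.4. Save the primary/standby role information
Make sure the gs_om -t refreshconf command succeeds. It regenerates the dynamic configuration file with the current roles; if it fails, the next restart can leave the database in an incorrect state.
Command:
gs_om -t refreshconf
Output: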
[sysomm@xtv-mog-test-71-1 ~]$ gs_om -t refreshconf
Generating dynamic configuration file for all nodes.
Successfully generated dynamic configuration file.
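Because a failed refreshconf only shows up at the next restart, it can help to check its output explicitly. This is a sketch that matches the success line shown in the output above; the function name refreshconf_succeeded is hypothetical, and whether gs_om also signals failure through its exit status is not assumed here.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read `gs_om -t refreshconf` output on stdin and
# succeed only if it contains the success line shown above.
refreshconf_succeeded() {
    grep -q 'Successfully generated dynamic configuration file'
}

# Example with the captured output from above.
if refreshconf_succeeded <<'EOF'
Generating dynamic configuration file for all nodes.
Successfully generated dynamic configuration file.
EOF
then
    echo "refreshconf OK"
else
    echo "refreshconf did not report success" >&2
fi
```

On a live cluster this would be used as `gs_om -t refreshconf | refreshconf_succeeded`.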
2.2. Failover
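2.2.1. Test 1: promote the standby to primary when the primary has failed
2.2.1.1. Check the primary/standby status of the cluster instances
Node 2 is the primary and node 1 is the standby.
Command:
gs_om -t status --detail
Output: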
[sysomm@xtv-mog-test-71-2 ~]$ gs_om -t status --detail
[ Cluster State ]
cluster_state : Normal
redistributing : No
current_az : AZ_ALL
[ Datanode State ]
node node_ip port instance state
---------------------------------------------------------------------------------------------------
1 xtv-mog-test-71-1 180.2.71.1 26000 6001 /mogdata/cluster_mogdb P Standby Normal
2 xtv-mog-test-71-2 180.2.71.2 26000 6002 /mogdata/cluster_mogdb S Primary Normal
2.2.1.2. Stop the database on the primary node to simulate a primary failure
Command:
gs_ctl stop -D /mogdata/cluster_mogdb
Output:
[sysomm@xtv-mog-test-71-2 ~]$ gs_ctl stop -D /mogdata/cluster_mogdb
[2023-08-24 18:38:44.308][1809341][][gs_ctl]: gs_ctl stopped ,datadir is /mogdata/cluster_mogdb
waiting for server to shut down... done
server stopped
2.2.1.3. Run the failover on the standby node
Command:
gs_ctl failover -D /mogdata/cluster_mogdb
Output:
[sysomm@xtv-mog-test-71-1 ~]$ gs_ctl failover -D /mogdata/cluster_mogdb
[2023-08-24 18:37:44.863][66996][][gs_ctl]: gs_ctl failover ,datadir is /mogdata/cluster_mogdb
[2023-08-24 18:37:44.863][66996][][gs_ctl]: failover term (1)
[2023-08-24 18:37:44.876][66996][][gs_ctl]: waiting for server to failover...
.[2023-08-24 18:37:45.890][66996][][gs_ctl]: done
[2023-08-24 18:37:45.890][66996][][gs_ctl]: failover completed (/mogdata/cluster_mogdb)
2.2.1.4. Confirm the primary/standby status of the cluster
The former standby has been promoted to primary.
Command:
gs_om -t status --detail
Output:
[sysomm@xtv-mog-test-71-2 ~]$ gs_om -t status --detail
[ Cluster State ]
cluster_state : Degraded
redistributing : No
current_az : AZ_ALL
[ Datanode State ]
node node_ip port instance state
---------------------------------------------------------------------------------------------------
1 xtv-mog-test-71-1 180.2.71.1 26000 6001 /mogdata/cluster_mogdb P Down Manually stopped
2 xtv-mog-test-71-2 180.2.71.2 26000 6002 /mogdata/cluster_mogdb S Primary Normal
2.2.1.5. Start the MogDB service in standby mode on the node that becomes the standby
Command:
gs_ctl start -D /mogdata/cluster_mogdb -M standby
Output:
[sysomm@xtv-mog-test-71-2 ~]$ gs_ctl start -D /mogdata/cluster_mogdb -M standby
…
[2023-08-24 18:47:21.246][1839092][][gs_ctl]: done
[2023-08-24 18:47:21.246][1839092][][gs_ctl]: server started (/mogdata/cluster_mogdb)
2.2.1.6. Confirm the cluster status: the primary and standby roles have been swapped
Command:
gs_om -t status --detail
Output:
[sysomm@xtv-mog-test-71-2 ~]$ gs_om -t status --detail
[ Cluster State ]
cluster_state : Normal
redistributing : No
current_az : AZ_ALL
[ Datanode State ]
node node_ip port instance state
---------------------------------------------------------------------------------------------------
1 xtv-mog-test-71-1 180.2.71.1 26000 6001 /mogdata/cluster_mogdb P Primary Normal
2 xtv-mog-test-71-2 180.2.71.2 26000 6002 /mogdata/cluster_mogdb S Standby Normal
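2.2.1.7. Save the primary/standby role information
Make sure the gs_om -t refreshconf command succeeds; otherwise the next restart can leave the database in an incorrect state.
Command:
gs_om -t refreshconf
Output: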
[sysomm@xtv-mog-test-71-1 ~]$ gs_om -t refreshconf
Generating dynamic configuration file for all nodes.
Successfully generated dynamic configuration file.
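2.2.2. Test 2: promote the standby to primary while the primary is healthy
After the old primary is restarted as a standby, it is in a state that needs repair, so a build operation must be run on it.
Command:
gs_om -t status --detail
Output: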
[sysomm@xtv-mog-test-71-2 ~]$ gs_om -t status --detail
[ Cluster State ]
cluster_state : Normal
redistributing : No
current_az : AZ_ALL
[ Datanode State ]
node node_ip port instance state
---------------------------------------------------------------------------------------------------
1 xtv-mog-test-71-1 180.2.71.1 26000 6001 /mogdata/cluster_mogdb P Primary Normal
2 xtv-mog-test-71-2 180.2.71.2 26000 6002 /mogdata/cluster_mogdb S Standby Normal
2.2.2.1. Run the failover on the standby node
Command:
gs_ctl failover -D /mogdata/cluster_mogdb
Output:
[sysomm@xtv-mog-test-71-2 ~]$ gs_ctl failover -D /mogdata/cluster_mogdb
[2023-08-24 19:09:15.119][1927475][][gs_ctl]: gs_ctl failover ,datadir is /mogdata/cluster_mogdb
[2023-08-24 19:09:15.119][1927475][][gs_ctl]: failover term (1)
[2023-08-24 19:09:15.125][1927475][][gs_ctl]: waiting for server to failover...
.[2023-08-24 19:09:16.140][1927475][][gs_ctl]: done
[2023-08-24 19:09:16.140][1927475][][gs_ctl]: failover completed (/mogdata/cluster_mogdb)
2.2.2.2. Confirm the primary/standby status of the cluster: dual-primary state
Command:
gs_om -t status --detail
Output:
[sysomm@xtv-mog-test-71-2 ~]$ gs_om -t status --detail
[ Cluster State ]
cluster_state : Unavailable
redistributing : No
current_az : AZ_ALL
[ Datanode State ]
node node_ip port instance state
---------------------------------------------------------------------------------------------------
1 xtv-mog-test-71-1 180.2.71.1 26000 6001 /mogdata/cluster_mogdb P Primary Normal
2 xtv-mog-test-71-2 180.2.71.2 26000 6002 /mogdata/cluster_mogdb S Primary Normal
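The dual-primary condition above is easy to miss in a long status listing. A minimal detection sketch follows, assuming the same datanode row layout as in the output (role in the eighth whitespace-separated column); the function name count_primaries is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: count datanode rows whose role column is "Primary"
# in `gs_om -t status --detail` output read from stdin.
# Assumed row layout: node hostname ip port instance datadir P|S role state
count_primaries() {
    awk '$1 ~ /^[0-9]+$/ && $8 == "Primary" { n++ } END { print n + 0 }'
}

# Example with the two datanode rows from the output above.
n=$(count_primaries <<'EOF'
1 xtv-mog-test-71-1 180.2.71.1 26000 6001 /mogdata/cluster_mogdb P Primary Normal
2 xtv-mog-test-71-2 180.2.71.2 26000 6002 /mogdata/cluster_mogdb S Primary Normal
EOF
)
if [ "$n" -gt 1 ]; then
    echo "dual primary detected: $n primaries"
fi
# prints: dual primary detected: 2 primaries
```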
2.2.2.3. On the node chosen to be demoted to standby, stop MogDB and start it in standby mode
Command:
gs_ctl start -D /mogdata/cluster_mogdb -M standby
Output:
[sysomm@xtv-mog-test-71-1 ~]$ gs_ctl start -D /mogdata/cluster_mogdb -M standby
…
[2023-08-24 19:09:51.644][88033][][gs_ctl]: done
[2023-08-24 19:09:51.644][88033][][gs_ctl]: server started (/mogdata/cluster_mogdb)
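2.2.2.4. Confirm the cluster status
Node 2 is the primary and node 1 is the standby, but the standby (and any cascade standby) needs repair.
Command:
gs_om -t status --detail
Output: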
[sysomm@xtv-mog-test-71-1 ~]$ gs_om -t status --detail
[ Cluster State ]
cluster_state : Degraded
redistributing : No
current_az : AZ_ALL
[ Datanode State ]
node node_ip port instance state
---------------------------------------------------------------------------------------------------
1 xtv-mog-test-71-1 180.2.71.1 26000 6001 /mogdata/cluster_mogdb P Standby Need repair(WAL)
2 xtv-mog-test-71-2 180.2.71.2 26000 6002 /mogdata/cluster_mogdb S Primary Normal
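To find which nodes need a build, the state column can be scanned for "Need repair". The sketch below assumes the row layout shown in the output above; the function name nodes_needing_repair is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read `gs_om -t status --detail` output on stdin and
# print the hostname of every datanode whose state contains "Need repair".
nodes_needing_repair() {
    awk '$1 ~ /^[0-9]+$/ && /Need repair/ { print $2 }'
}

# Example with the two datanode rows from the output above.
nodes_needing_repair <<'EOF'
1 xtv-mog-test-71-1 180.2.71.1 26000 6001 /mogdata/cluster_mogdb P Standby Need repair(WAL)
2 xtv-mog-test-71-2 180.2.71.2 26000 6002 /mogdata/cluster_mogdb S Primary Normal
EOF
# prints: xtv-mog-test-71-1
```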
2.2.2.5. Run the repair command on the standby node
Command:
gs_ctl build -D /mogdata/cluster_mogdb
Output:
Standby repair:
[sysomm@xtv-mog-test-71-1 ~]$ gs_ctl build -D /mogdata/cluster_mogdb
…
[2023-08-24 19:12:29.340][91056][dn_6001_6002][gs_ctl]: done
[2023-08-24 19:12:29.340][91056][dn_6001_6002][gs_ctl]: server started (/mogdata/cluster_mogdb)
[2023-08-24 19:12:29.340][91056][dn_6001_6002][gs_ctl]: fopen build pid file "/mogdata/cluster_mogdb/gs_build.pid" success
[2023-08-24 19:12:29.340][91056][dn_6001_6002][gs_ctl]: fprintf build pid file "/mogdata/cluster_mogdb/gs_build.pid" success
[2023-08-24 19:12:29.342][91056][dn_6001_6002][gs_ctl]: fsync build pid file "/mogdata/cluster_mogdb/gs_build.pid" success
2.2.2.6. Confirm the cluster status
Command:
gs_om -t status --detail
Output:
[sysomm@xtv-mog-test-71-1 ~]$ gs_om -t status --detail
[ Cluster State ]
cluster_state : Normal
redistributing : No
current_az : AZ_ALL
[ Datanode State ]
node node_ip port instance state
---------------------------------------------------------------------------------------------------
1 xtv-mog-test-71-1 180.2.71.1 26000 6001 /mogdata/cluster_mogdb P Standby Normal
2 xtv-mog-test-71-2 180.2.71.2 26000 6002 /mogdata/cluster_mogdb S Primary Normal
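After a build, the cluster_state line is the quickest signal that the cluster is healthy again. A small extraction sketch, assuming the "cluster_state : VALUE" line format shown above; the function name cluster_state is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read `gs_om -t status --detail` output on stdin and
# print the value of the cluster_state line (e.g. Normal, Degraded).
cluster_state() {
    awk -F':' '$1 ~ /^[ \t]*cluster_state[ \t]*$/ { gsub(/[ \t]/, "", $2); print $2 }'
}

# Example with the cluster state section from the output above.
state=$(cluster_state <<'EOF'
[ Cluster State ]
cluster_state : Normal
redistributing : No
current_az : AZ_ALL
EOF
)
if [ "$state" = "Normal" ]; then
    echo "cluster is Normal"
else
    echo "cluster state is: $state" >&2
fi
# prints: cluster is Normal
```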
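2.2.2.7. Save the primary/standby role information
Make sure the gs_om -t refreshconf command succeeds; otherwise the next restart can leave the database in an incorrect state.
Command:
gs_om -t refreshconf
Output: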
[sysomm@xtv-mog-test-71-1 ~]$ gs_om -t refreshconf
Generating dynamic configuration file for all nodes.
Successfully generated dynamic configuration file.
3. Switching with PTK
3.1. ptk cluster switchover
3.1.1. Syntax
ptk cluster switchover [flags]
3.1.2. Options
3.1.2.1. -n, --name string
- Cluster name
- Data type: string
3.1.2.2. -H, --host string
- IP of the instance to promote to primary
- Data type: string
3.1.3. Example
3.1.3.1. Check the initial cluster status
Command:
gs_om -t status --detail
Output:
[sysomm@xtv-mog-test-71-1 ~]$ gs_om -t status --detail
[ Cluster State ]
cluster_state : Normal
redistributing : No
current_az : AZ_ALL
[ Datanode State ]
node node_ip port instance state
---------------------------------------------------------------------------------------------------
1 xtv-mog-test-71-1 180.2.71.1 26000 6001 /mogdata/cluster_mogdb P Standby Normal
2 xtv-mog-test-71-2 180.2.71.2 26000 6002 /mogdata/cluster_mogdb S Primary Normal
3.1.3.2. Use ptk to switch the primary to node 1
Command:
./ptk cluster switchover --host 180.2.71.1 -n cluster_mogdb
Output:
[sysomm@xtv-mog-test-71-1 ptk]$ ./ptk cluster switchover --host 180.2.71.1 -n cluster_mogdb
INFO[2023-08-30T16:28:22.159] start switchover, please wait a few moments ...
INFO[2023-08-30T16:28:28.234] switchover output:
[2023-08-30 16:28:22.171][11186][][gs_ctl]: gs_ctl switchover ,datadir is /mogdata/cluster_mogdb
[2023-08-30 16:28:22.171][11186][][gs_ctl]: switchover term (1)
[2023-08-30 16:28:22.177][11186][][gs_ctl]: waiting for server to switchover...
[2023-08-30 16:28:28.232][11186][][gs_ctl]: done
[2023-08-30 16:28:28.232][11186][][gs_ctl]: switchover completed (/mogdata/cluster_mogdb)
INFO[2023-08-30T16:28:28.234] switchover successfully
INFO[2023-08-30T16:28:28.556] time elapsed: 8s
3.1.3.3. Check the cluster status
Command:
gs_om -t status --detail
Output:
[sysomm@xtv-mog-test-71-1 ptk]$ gs_om -t status --detail
[ Cluster State ]
cluster_state : Normal
redistributing : No
current_az : AZ_ALL
[ Datanode State ]
node node_ip port instance state
---------------------------------------------------------------------------------------------------
1 xtv-mog-test-71-1 180.2.71.1 26000 6001 /mogdata/cluster_mogdb P Primary Normal
2 xtv-mog-test-71-2 180.2.71.2 26000 6002 /mogdata/cluster_mogdb S Standby Normal
3.1.3.4. Save the primary/standby role information
Command:
./ptk cluster refresh -n cluster_mogdb
Output:
[sysomm@xtv-mog-test-71-1 ptk]$ ./ptk cluster refresh -n cluster_mogdb
3.2. ptk cluster failover
3.2.1. Syntax
ptk cluster failover [flags]
3.2.2. Options
3.2.2.1. -n, --name string
- Cluster name
- Data type: string
3.2.2.2. -H, --host string
- IP of the instance to fail over to (the standby to promote)
- Data type: string
3.2.3. Example
3.2.3.1. Initial cluster status
Command:
gs_om -t status --detail
Output:
[sysomm@xtv-mog-test-71-1 ptk]$ gs_om -t status --detail
[ Cluster State ]
cluster_state : Normal
redistributing : No
current_az : AZ_ALL
[ Datanode State ]
node node_ip port instance state
---------------------------------------------------------------------------------------------------
1 xtv-mog-test-71-1 180.2.71.1 26000 6001 /mogdata/cluster_mogdb P Standby Normal
2 xtv-mog-test-71-2 180.2.71.2 26000 6002 /mogdata/cluster_mogdb S Primary Normal
3.2.3.2. Stop the database on node 2
Command:
gs_ctl stop -D /mogdata/cluster_mogdb
Output:
[sysomm@xtv-mog-test-71-2 ~]$ gs_ctl stop -D /mogdata/cluster_mogdb
[2023-08-30 16:34:03.095][1394634][][gs_ctl]: gs_ctl stopped ,datadir is /mogdata/cluster_mogdb
waiting for server to shut down... done
server stopped
3.2.3.3. Use ptk cluster failover to fail over to node 1
Command:
./ptk cluster failover --host 180.2.71.1 -n cluster_mogdb
Output:
./ptk cluster failover --host 180.2.71.1 -n cluster_mogdb
INFO[2022-12-05T21:12:57.365] start failover, please wait a few moments ...
INFO[2022-12-05T21:12:58.472] failover successfully
INFO[2022-12-05T21:12:58.472] start refresh cluster_static_config file ...
INFO[2022-12-05T21:13:00.057] refresh successfully
3.2.3.4. Check the cluster status
Command:
./ptk cluster status -n cluster_mogdb
Output:
[sysomm@xtv-mog-test-71-1 ptk]$ ./ptk cluster status -n cluster_mogdb
[ Cluster State ]
cluster_name : cluster_mogdb
cluster_state : Degraded
database_version : MogDB 5.0.0 (build 503a9ef7)
[ Datanode State ]
cluster_name | id | ip | port | user | nodename | db_role | state | upstream
----------------+------+------------+-------+--------+----------+-------------------+---------+------------
cluster_mogdb | 6001 | 180.2.71.1 | 26000 | sysomm | dn_6001 | primary | Normal | -
| 6002 | 180.2.71.2 | 26000 | sysomm | dn_6002 | standby(previous) | Stopped | -
3.2.3.5. On the node chosen to be demoted to standby, stop MogDB and start it in standby mode
Command:
gs_ctl start -D /mogdata/cluster_mogdb -M standby
Output:
[sysomm@xtv-mog-test-71-2 ~]$ gs_ctl start -D /mogdata/cluster_mogdb -M standby
…
[2023-08-30 16:51:11.188][1428027][][gs_ctl]: done
[2023-08-30 16:51:11.188][1428027][][gs_ctl]: server started (/mogdata/cluster_mogdb)
3.2.3.6. Confirm the cluster status
Command:
./ptk cluster status -n cluster_mogdb
Output:
[sysomm@xtv-mog-test-71-1 ptk]$ ./ptk cluster status -n cluster_mogdb
[ Cluster State ]
cluster_name : cluster_mogdb
cluster_state : Normal
database_version : MogDB 5.0.0 (build 503a9ef7)
[ Datanode State ]
cluster_name | id | ip | port | user | nodename | db_role | state | upstream
----------------+------+------------+-------+--------+----------+---------+--------+------------
cluster_mogdb | 6001 | 180.2.71.1 | 26000 | sysomm | dn_6001 | primary | Normal | -
| 6002 | 180.2.71.2 | 26000 | sysomm | dn_6002 | standby | Normal | -
3.2.3.7. Save the primary/standby role information
Command:
./ptk cluster refresh -n cluster_mogdb
Output:
[sysomm@xtv-mog-test-71-1 ptk]$ ./ptk cluster refresh -n cluster_mogdb




