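Symptom:
The MGR (MySQL Group Replication) primary became network-unreachable (state UNREACHABLE), and the host could not be reached over SSH either.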


2024-03-12T00:50:51.241803+08:00 0 [Warning] [MY-011493] [Repl] Plugin group_replication reported: 'Member with address 192.168.1.51:3307 has become unreachable.'
2024-03-12T00:50:51.241956+08:00 0 [ERROR] [MY-011495] [Repl] Plugin group_replication reported: 'This server is not able to reach a majority of members in the group. This server will now block all updates. The server will remain blocked until contact with the majority is restored. It is possible to use group_replication_force_members to force a new group membership.'
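The MGR group has two nodes in a primary/secondary configuration. Because the primary host suffered a system failure and lost network connectivity, the surviving secondary could no longer reach a majority of the group and blocked all updates. We decided to promote the secondary to primary so that application workloads could continue.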
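The "not able to reach a majority" error in the log comes down to simple quorum arithmetic: a partition may keep accepting writes only if it holds a strict majority of the group's members. The following is a minimal sketch of that check; the function name has_majority is purely illustrative and is not part of MySQL.

```shell
#!/bin/bash
# Return success (exit status 0) when `reachable` members form a strict
# majority of a group with `total` members.
# Usage: has_majority <reachable> <total>
has_majority() {
  [ $((2 * $1)) -gt "$2" ]
}

# In this incident the survivor sees only itself in a 2-member group:
if has_majority 1 2; then
  echo "majority kept: writes continue"
else
  echo "majority lost: writes are blocked"
fi
```

With two members, one reachable node is exactly half and not a majority, which is why the secondary blocked updates instead of taking over by itself.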
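The approach was as follows:
- Try to switch the primary with group_replication_set_as_primary; this fails
- Stop group replication
- Record the GTID set, so it can be compared with the old primary's value once that host recovers
- Start group replication with bootstrap enabled, making this node the primary
- Record the GTID set again, for the same later comparison
- Verify that the floating IP has moved to this node.
The detailed session log follows: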
>ssh root@192.168.1.52 -o ServerAliveInterval=60
root@192.168.1.52's password:
Last login: Tue Mar 12 09:35:33 2024 from 192.168.1.33
[root@ycla ~]# mysql -uroot -pxxx -Dxxx
mysql: [Warning] Using a password on the command line interface can be insecure.
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A
Welcome to the MySQL monitor. Commands end with ; or \g.
Your MySQL connection id is 21807
Server version: 8.0.36 MySQL Community Server - GPL
Copyright (c) 2000, 2024, Oracle and/or its affiliates.
Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.
Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
mysql> select * from performance_schema.replication_group_members;
+---------------------------+--------------------------------------+-------------+-------------+--------------+-------------+----------------+----------------------------+
| CHANNEL_NAME | MEMBER_ID | MEMBER_HOST | MEMBER_PORT | MEMBER_STATE | MEMBER_ROLE | MEMBER_VERSION | MEMBER_COMMUNICATION_STACK |
+---------------------------+--------------------------------------+-------------+-------------+--------------+-------------+----------------+----------------------------+
| group_replication_applier | 773123a4-daad-11ee-930b-3009f923fbf1 | 192.168.1.51 | 3307 | UNREACHABLE | PRIMARY | 8.0.36 | XCom |
| group_replication_applier | b2b77622-daad-11ee-a732-3009f925119f | 192.168.1.52 | 3307 | ONLINE | SECONDARY | 8.0.36 | XCom |
+---------------------------+--------------------------------------+-------------+-------------+--------------+-------------+----------------+----------------------------+
2 rows in set (0.00 sec)
mysql> select group_replication_set_as_primary('b2b77622-daad-11ee-a732-3009f925119f');
ERROR 1123 (HY000): Can't initialize function 'group_replication_set_as_primary'; Member must be ONLINE and in the majority partition.
mysql> SELECT @@gtid_executed\G
*************************** 1. row ***************************
@@gtid_executed: aaaaaaaa-aaaa-aaaa-aaaa-aaaaaaaaaaaa:1-432:1000422
1 row in set (0.00 sec)
mysql> STOP GROUP_REPLICATION;
Query OK, 0 rows affected (30.43 sec)
mysql>
mysql>
mysql> SELECT @@gtid_executed\G
*************************** 1. row ***************************
@@gtid_executed: aaaaaaaa-aaaa-aaaa-aaaa-aaaaaaaaaaaa:1-432:1000422
1 row in set (0.00 sec)
mysql> select * from performance_schema.replication_group_members;
+---------------------------+--------------------------------------+-------------+-------------+--------------+-------------+----------------+----------------------------+
| CHANNEL_NAME | MEMBER_ID | MEMBER_HOST | MEMBER_PORT | MEMBER_STATE | MEMBER_ROLE | MEMBER_VERSION | MEMBER_COMMUNICATION_STACK |
+---------------------------+--------------------------------------+-------------+-------------+--------------+-------------+----------------+----------------------------+
| group_replication_applier | b2b77622-daad-11ee-a732-3009f925119f | 192.168.1.52 | 3307 | OFFLINE | | | XCom |
+---------------------------+--------------------------------------+-------------+-------------+--------------+-------------+----------------+----------------------------+
1 row in set (0.00 sec)
mysql> SET GLOBAL group_replication_bootstrap_group=ON;
Query OK, 0 rows affected (0.00 sec)
mysql> START GROUP_REPLICATION;
Query OK, 0 rows affected (1.14 sec)
mysql> SET GLOBAL group_replication_bootstrap_group=OFF;
Query OK, 0 rows affected (0.00 sec)
mysql> select * from performance_schema.replication_group_members;
+---------------------------+--------------------------------------+-------------+-------------+--------------+-------------+----------------+----------------------------+
| CHANNEL_NAME | MEMBER_ID | MEMBER_HOST | MEMBER_PORT | MEMBER_STATE | MEMBER_ROLE | MEMBER_VERSION | MEMBER_COMMUNICATION_STACK |
+---------------------------+--------------------------------------+-------------+-------------+--------------+-------------+----------------+----------------------------+
| group_replication_applier | b2b77622-daad-11ee-a732-3009f925119f | 192.168.1.52 | 3307 | ONLINE | PRIMARY | 8.0.36 | XCom |
+---------------------------+--------------------------------------+-------------+-------------+--------------+-------------+----------------+----------------------------+
1 row in set (0.00 sec)
mysql> SELECT @@gtid_executed\G
*************************** 1. row ***************************
@@gtid_executed: aaaaaaaa-aaaa-aaaa-aaaa-aaaaaaaaaaaa:1-433:1000422
1 row in set (0.00 sec)
mysql>
[root@ycla ~]# ip a|grep 192.168.1
inet 192.168.1.52/24 brd 192.168.1.255 scope global noprefixroute bond0
inet 192.168.1.50/24 brd 192.168.1.255 scope global secondary bond0:1
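The GTID sets recorded before and after the bootstrap are meant to be compared with the old primary's set once that host is back. The sketch below shows one way to do that comparison. It assumes the old primary's gtid_executed value has been collected separately (it is not recorded in this log), and the helper names gtid_count and gtid_compare_sql are illustrative. GTID_SUBSET() and GTID_SUBTRACT() are built-in MySQL functions. The transaction count is only a rough sanity check, because a larger count does not prove that one set contains the other.

```shell
#!/bin/bash
# Count the transactions in a GTID set string such as
#   aaaaaaaa-aaaa-aaaa-aaaa-aaaaaaaaaaaa:1-432:1000422
# Handles several comma-separated UUID entries; whitespace is ignored.
gtid_count() {
  printf '%s' "$1" | tr -d '[:space:]' | tr ',' '\n' | awk -F':' '
    {
      # field 1 is the UUID; every later field is "n" or "first-last"
      for (i = 2; i <= NF; i++) {
        n = split($i, r, "-")
        total += (n == 2 ? r[2] - r[1] + 1 : 1)
      }
    }
    END { printf "%d\n", total }'
}

# Print a query that asks MySQL whether every transaction of the old
# primary ($1) is already on the new primary ($2), and which ones are not.
gtid_compare_sql() {
  printf "SELECT GTID_SUBSET('%s', '%s') AS old_is_subset, GTID_SUBTRACT('%s', '%s') AS only_on_old;\n" \
    "$1" "$2" "$1" "$2"
}

# Value recorded on the secondary before STOP GROUP_REPLICATION:
gtid_count 'aaaaaaaa-aaaa-aaaa-aaaa-aaaaaaaaaaaa:1-432:1000422'   # prints 433
```

If old_is_subset returns 1, the old primary holds no transactions that the new primary lacks. If it returns 0, only_on_old lists the transactions that exist only on the old primary and need to be reviewed before that host rejoins the group.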
Environment details, for the record:
[root@ycla ~]# free -g
total used free shared buff/cache available
Mem: 755 233 521 0 0 520
Swap: 63 0 63
[root@ycla ~]# cat /proc/meminfo |grep Hu
AnonHugePages: 4096 kB
HugePages_Total: 102400
HugePages_Free: 99310
HugePages_Rsvd: 97230
HugePages_Surp: 0
Hugepagesize: 2048 kB
[root@ycla ~]# cat /etc/my.cnf
[client]
port=3307
socket=/db/mysql/mysql-8.0.36/mysql.sock
default-character-set = utf8mb4
[mysql]
default-character-set = utf8mb4
[mysqld]
port=3307
user=mysql
socket=/db/mysql/mysql-8.0.36/mysql.sock
basedir=/db/mysql/mysql-8.0.36
datadir=/db/mysql/mysql-8.0.36/data
lower-case-table-names=1
#authentication_policy=mysql_native_password
authentication_policy=caching_sha2_password
#skip-grant-tables
#innodb_buffer_pool_size=105109258240
innodb_buffer_pool_size=190G
innodb_redo_log_capacity=1024M
slow_query_log=ON
slow_query_log_file=/db/mysql/mysql-8.0.36/data/ycla-slow.log
long_query_time=0.5
log_timestamps=SYSTEM
log_queries_not_using_indexes=off
max_connections=5000
wait_timeout = 600
interactive_timeout = 600
character_set_server=utf8mb4
innodb_flush_log_at_trx_commit=2
skip-log-bin
collation-server = utf8mb4_0900_ai_ci
init_connect='SET NAMES utf8mb4 COLLATE utf8mb4_0900_ai_ci'
skip-character-set-client-handshake = true
skip-name-resolve
sql_generate_invisible_primary_key=ON
large-pages
server-id=52
log_bin=binlog-bin
log_slave_updates=ON
binlog_format=ROW
binlog_checksum=NONE
master_info_repository=TABLE
relay_log_info_repository=TABLE
gtid_mode=ON
enforce_gtid_consistency=true
disabled_storage_engines="MyISAM,BLACKHOLE,FEDERATED,ARCHIVE,MEMORY"
transaction_write_set_extraction=XXHASH64
loose-group_replication_group_name="aaaaaaaa-aaaa-aaaa-aaaa-aaaaaaaaaaaa"
loose-group_replication_start_on_boot=OFF
loose-group_replication_local_address="192.168.1.52:33071"
loose-group_replication_group_seeds="192.168.1.51:33071,192.168.1.52:33071"
loose-group_replication_bootstrap_group=OFF
report_host=192.168.1.52
report_port=3307
loose-group_replication_recovery_get_public_key=ON
admin_address=192.168.1.52
admin_port=33072
create_admin_listener_thread=1
log_bin_trust_function_creators=1
[root@ycla ~]# crontab -l
* * * * * /etc/vip_check.sh > /dev/null 2>&1
[root@ycla ~]# cat /etc/vip_check.sh
step=3
for ((i = 0; i < 60; i = (i + step))); do
/etc/vip.sh  # invoke directly; wrapping it in $(...) would execute the script's output as a command
sleep $step
done
exit 0
[root@ycla ~]#
[root@ycla ~]# cat /etc/vip.sh
#!/bin/bash
dbstats=`/db/mysql/mysql-8.0.36/bin/mysql -u root -pxxx -P 33072 -e "select MEMBER_HOST,MEMBER_ROLE from performance_schema.replication_group_members;"|grep "10.1.1.52"|awk '{print $2}'|grep "PRIMARY"|wc -l`
ip=`/usr/sbin/ip a|grep bond0:1|wc -l`
if [[ "${dbstats}" -eq 1 ]] ; then
if [[ "${ip}" -eq 0 ]]; then
/usr/sbin/ifconfig bond0:1 10.1.1.50 netmask 255.255.255.0 up
/usr/sbin/arping -b -s 10.1.1.50 10.1.1.3 -c 3
fi
else
if [[ "${ip}" -gt 0 ]]; then
/usr/sbin/ifconfig bond0:1 down
fi
fi
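The decision logic of the /etc/vip.sh script shown above can be separated from the commands that actually touch the network, which makes it easier to check by hand. The following is a hedged sketch, not the author's script. The function names is_primary and vip_action and the LOCAL_HOST variable are illustrative. It assumes the member list is fetched with the mysql client's -N -B options (no column names, tab-separated output), and it matches the host exactly rather than with grep, so one address cannot match as a substring of another.

```shell
#!/bin/bash
# Succeed if host $1 appears with role PRIMARY in $2, where $2 holds
# tab-separated "MEMBER_HOST<TAB>MEMBER_ROLE" rows, one per line.
is_primary() {
  printf '%s\n' "$2" | awk -F'\t' -v h="$1" '
    $1 == h && $2 == "PRIMARY" { found = 1 }
    END { exit !found }'
}

# Decide what to do with the VIP.
#   $1: 1 if this node is PRIMARY, otherwise 0
#   $2: number of matching VIP interfaces currently configured
# Prints add, remove, or none.
vip_action() {
  if [ "$1" -eq 1 ] && [ "$2" -eq 0 ]; then
    echo add
  elif [ "$1" -ne 1 ] && [ "$2" -gt 0 ]; then
    echo remove
  else
    echo none
  fi
}

# Example wiring (commented out: needs a live instance and root privileges).
# LOCAL_HOST is a hypothetical variable holding this node's report_host.
# rows=$(mysql -N -B -e "select MEMBER_HOST,MEMBER_ROLE from performance_schema.replication_group_members")
# if is_primary "$LOCAL_HOST" "$rows"; then primary=1; else primary=0; fi
# has_vip=$(ip a | grep -c 'bond0:1')
# vip_action "$primary" "$has_vip"
```

The add and remove results correspond to the ifconfig/arping branch and the ifconfig down branch of the original script, and none means the VIP is already in the right place.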
[root@ycla ~]# lscpu|grep CPU
CPU op-mode(s): 32-bit, 64-bit
CPU(s): 128
On-line CPU(s) list: 0-127
CPU family: 24
CPU MHz: 1200.000
CPU max MHz: 2000.0000
CPU min MHz: 1200.0000
NUMA node0 CPU(s): 0-7,64-71
NUMA node1 CPU(s): 8-15,72-79
NUMA node2 CPU(s): 16-23,80-87
NUMA node3 CPU(s): 24-31,88-95
NUMA node4 CPU(s): 32-39,96-103
NUMA node5 CPU(s): 40-47,104-111
NUMA node6 CPU(s): 48-55,112-119
NUMA node7 CPU(s): 56-63,120-127
Last modified: 2024-03-12 14:26:09




