
Starting MGR After All Nodes Have Been Shut Down

Original article by 大表哥, 2022-04-20


Hello everyone! Today I'd like to share a real-world case of restarting an MGR test environment.

Recently, a power outage in our company's test database environment caused the machines to shut down unexpectedly. Most of our test environments run 3-node MySQL MGR clusters, and after restarting we found that some nodes were in an abnormal state and could not rejoin the group replication group.

Today, let's reproduce the situation in a lab environment.

Suppose all the MGR nodes go down. This can happen in one of two ways:

the cluster may have crashed unexpectedly, possibly after several primary failovers, with all nodes eventually going down;
or
it may have been planned manual maintenance, where the secondaries were shut down first and the primary last.

How should we restart the MGR cluster?

And on which node should we bootstrap group replication with group_replication_bootstrap_group to guarantee a clean startup with no data loss?

Our three test nodes:

mysql> select * from performance_schema.replication_group_members;
+---------------------------+--------------------------------------+----------------+-------------+--------------+-------------+----------------+
| CHANNEL_NAME              | MEMBER_ID                            | MEMBER_HOST    | MEMBER_PORT | MEMBER_STATE | MEMBER_ROLE | MEMBER_VERSION |
+---------------------------+--------------------------------------+----------------+-------------+--------------+-------------+----------------+
| group_replication_applier | 519dc1a6-be2b-11ec-9f85-525400743c33 | 82.156.200.136 |        3307 | ONLINE       | SECONDARY   | 8.0.20         |
| group_replication_applier | 9915aff5-bbd6-11ec-9110-525400743c33 | 82.156.200.136 |        3306 | ONLINE       | SECONDARY   | 8.0.20         |
| group_replication_applier | 9aa1cbcd-be2c-11ec-90a8-525400743c33 | 82.156.200.136 |        3308 | ONLINE       | PRIMARY     | 8.0.20         |
+---------------------------+--------------------------------------+----------------+-------------+--------------+-------------+----------------+
3 rows in set (0.00 sec)

On the primary node, 3308, we create a table and insert some test data:

mysql> create database mgrdb;
Query OK, 1 row affected (0.01 sec)

mysql> use mgrdb;
Database changed
mysql> create table t_mgr(id int not null primary key, msg varchar(100));
Query OK, 0 rows affected (0.06 sec)

mysql> insert into t_mgr value ('1','before mgr crash');
Query OK, 1 row affected (0.02 sec)

Now, to simulate an OOM kill, we kill the mysqld process of secondary 3307:

[jason@VM-24-9-centos ~]$ kill -9 27476 28812

At this point the MGR cluster membership view is:

mysql> select * from performance_schema.replication_group_members;
+---------------------------+--------------------------------------+----------------+-------------+--------------+-------------+----------------+
| CHANNEL_NAME              | MEMBER_ID                            | MEMBER_HOST    | MEMBER_PORT | MEMBER_STATE | MEMBER_ROLE | MEMBER_VERSION |
+---------------------------+--------------------------------------+----------------+-------------+--------------+-------------+----------------+
| group_replication_applier | 9915aff5-bbd6-11ec-9110-525400743c33 | 82.156.200.136 |        3306 | ONLINE       | SECONDARY   | 8.0.20         |
| group_replication_applier | 9aa1cbcd-be2c-11ec-90a8-525400743c33 | 82.156.200.136 |        3308 | ONLINE       | PRIMARY     | 8.0.20         |
+---------------------------+--------------------------------------+----------------+-------------+--------------+-------------+----------------+
2 rows in set (0.00 sec)

The MGR cluster is still available at this point, so let's insert more data:

mysql> insert into t_mgr value ('2','one node crash');
Query OK, 1 row affected (0.01 sec)

Next, we cleanly shut down node 3306:

mysql> shutdown;
Query OK, 0 rows affected (0.00 sec)

Now only one node, 3308, is still alive.

We can still insert successfully.
(This is how MGR behaves: a clean shutdown is treated as the member leaving the group gracefully, so even with only one member left the group keeps quorum and remains readable and writable.)

mysql> use mgrdb;
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A

Database changed
mysql> insert into t_mgr value ('3','one node crash,one was shutdown');
Query OK, 1 row affected (0.00 sec)
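The parenthetical above is easy to verify. A minimal sanity-check sketch for the lone survivor (run on 3308; it uses only standard performance_schema tables and system variables, nothing specific to this setup):

-- The group should report exactly one ONLINE member whose role is PRIMARY:
SELECT MEMBER_HOST, MEMBER_PORT, MEMBER_STATE, MEMBER_ROLE
FROM performance_schema.replication_group_members;

-- Writes are only possible if the node is not in (super) read-only mode:
SELECT @@super_read_only, @@read_only;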

Finally, we kill the mysqld process of 3308, the only surviving node:

[jason@VM-24-9-centos ~]$ kill -9 5780 7115

At this point all three MGR nodes are down. Now let's try to recover.

First, start mysqld on all three nodes:

nohup /opt/mysql/8/bin/mysqld_safe --defaults-file=/data/mysql3306/my3306.cnf --user=jason > /dev/null 2>&1 &
nohup /opt/mysql/8/bin/mysqld_safe --defaults-file=/data/mysql_mgr3307/my3307.cnf --user=jason > /dev/null 2>&1 &
nohup /opt/mysql/8/bin/mysqld_safe --defaults-file=/data/mysql_mgr3308/my3308.cnf --user=jason > /dev/null 2>&1 &

We can see that all three nodes report an MGR state of OFFLINE:

mysql> select * from performance_schema.replication_group_members;
+---------------------------+--------------------------------------+----------------+-------------+--------------+-------------+----------------+
| CHANNEL_NAME              | MEMBER_ID                            | MEMBER_HOST    | MEMBER_PORT | MEMBER_STATE | MEMBER_ROLE | MEMBER_VERSION |
+---------------------------+--------------------------------------+----------------+-------------+--------------+-------------+----------------+
| group_replication_applier | 9915aff5-bbd6-11ec-9110-525400743c33 | 82.156.200.136 |        3306 | OFFLINE      |             |                |
+---------------------------+--------------------------------------+----------------+-------------+--------------+-------------+----------------+
1 row in set (0.00 sec)

mysql> select * from performance_schema.replication_group_members;
+---------------------------+--------------------------------------+----------------+-------------+--------------+-------------+----------------+
| CHANNEL_NAME              | MEMBER_ID                            | MEMBER_HOST    | MEMBER_PORT | MEMBER_STATE | MEMBER_ROLE | MEMBER_VERSION |
+---------------------------+--------------------------------------+----------------+-------------+--------------+-------------+----------------+
| group_replication_applier | 519dc1a6-be2b-11ec-9f85-525400743c33 | 82.156.200.136 |        3307 | OFFLINE      |             |                |
+---------------------------+--------------------------------------+----------------+-------------+--------------+-------------+----------------+
1 row in set (0.01 sec)

mysql> select * from performance_schema.replication_group_members;
+---------------------------+--------------------------------------+----------------+-------------+--------------+-------------+----------------+
| CHANNEL_NAME              | MEMBER_ID                            | MEMBER_HOST    | MEMBER_PORT | MEMBER_STATE | MEMBER_ROLE | MEMBER_VERSION |
+---------------------------+--------------------------------------+----------------+-------------+--------------+-------------+----------------+
| group_replication_applier | 9aa1cbcd-be2c-11ec-90a8-525400743c33 | 82.156.200.136 |        3308 | OFFLINE      |             |                |
+---------------------------+--------------------------------------+----------------+-------------+--------------+-------------+----------------+
1 row in set (0.01 sec)

On which node should we bootstrap with group_replication_bootstrap_group? Assume here that we do not know the order in which the cluster went down.

We can only rely on @@gtid_executed to determine which node is furthest ahead: the node with the most executed transactions holds the latest data and must have been the last primary.

mysql> select @@port, @@gtid_executed;
+--------+--------------------------------------------------------------------------------------+
| @@port | @@gtid_executed                                                                      |
+--------+--------------------------------------------------------------------------------------+
|   3306 | 9915aff5-bbd6-11ec-9110-525400743c33:1-12, b965e3bb-be27-11ec-ba80-525400743c33:1-22 |
+--------+--------------------------------------------------------------------------------------+
1 row in set (0.00 sec)

mysql> select @@port, @@gtid_executed;
+--------+--------------------------------------------------------------------------------------+
| @@port | @@gtid_executed                                                                      |
+--------+--------------------------------------------------------------------------------------+
|   3307 | 9915aff5-bbd6-11ec-9110-525400743c33:1-12, b965e3bb-be27-11ec-ba80-525400743c33:1-21 |
+--------+--------------------------------------------------------------------------------------+
1 row in set (0.00 sec)

mysql> select @@port, @@gtid_executed;
+--------+--------------------------------------------------------------------------------------+
| @@port | @@gtid_executed                                                                      |
+--------+--------------------------------------------------------------------------------------+
|   3308 | 9915aff5-bbd6-11ec-9110-525400743c33:1-12, b965e3bb-be27-11ec-ba80-525400743c33:1-23 |
+--------+--------------------------------------------------------------------------------------+
1 row in set (0.00 sec)

We can see that 3308 has the largest GTID set, so it was the last primary, and it is the node on which we should bootstrap group replication with group_replication_bootstrap_group:

3306 -> b965e3bb-be27-11ec-ba80-525400743c33:1-22
3307 -> b965e3bb-be27-11ec-ba80-525400743c33:1-21
3308 -> b965e3bb-be27-11ec-ba80-525400743c33:1-23
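Eyeballing the sequence numbers works here because all three sets share the same UUIDs. A more robust check before bootstrapping is MySQL's built-in GTID_SUBSET() function, which tells you whether one GTID set is fully contained in another. A minimal sketch using the sets from this experiment (runnable on any of the nodes):

-- GTID_SUBSET(set1, set2) returns 1 if every GTID in set1 is also in set2.
-- If both checks return 1, 3308's set covers the other nodes and is safe to bootstrap from.
SELECT GTID_SUBSET('9915aff5-bbd6-11ec-9110-525400743c33:1-12,b965e3bb-be27-11ec-ba80-525400743c33:1-22',
                   '9915aff5-bbd6-11ec-9110-525400743c33:1-12,b965e3bb-be27-11ec-ba80-525400743c33:1-23') AS covers_3306;
SELECT GTID_SUBSET('9915aff5-bbd6-11ec-9110-525400743c33:1-12,b965e3bb-be27-11ec-ba80-525400743c33:1-21',
                   '9915aff5-bbd6-11ec-9110-525400743c33:1-12,b965e3bb-be27-11ec-ba80-525400743c33:1-23') AS covers_3307;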

On node 3308:

mysql> set global group_replication_bootstrap_group = ON;
Query OK, 0 rows affected (0.00 sec)

mysql> start group_replication;
Query OK, 0 rows affected (3.14 sec)

mysql> set global group_replication_bootstrap_group = OFF;
Query OK, 0 rows affected (0.00 sec)

On secondary node 3306:

mysql> start group_replication;
Query OK, 0 rows affected (3.32 sec)

On secondary node 3307:

mysql> start group_replication;
Query OK, 0 rows affected (3.32 sec)

Once all nodes are started, the MGR cluster is back at full strength. (Note that group_replication_bootstrap_group is switched back to OFF immediately after bootstrapping; otherwise the node could bootstrap a brand-new group of its own on a later restart.)

mysql> select * from performance_schema.replication_group_members;
+---------------------------+--------------------------------------+----------------+-------------+--------------+-------------+----------------+
| CHANNEL_NAME              | MEMBER_ID                            | MEMBER_HOST    | MEMBER_PORT | MEMBER_STATE | MEMBER_ROLE | MEMBER_VERSION |
+---------------------------+--------------------------------------+----------------+-------------+--------------+-------------+----------------+
| group_replication_applier | 519dc1a6-be2b-11ec-9f85-525400743c33 | 82.156.200.136 |        3307 | ONLINE       | SECONDARY   | 8.0.20         |
| group_replication_applier | 9915aff5-bbd6-11ec-9110-525400743c33 | 82.156.200.136 |        3306 | ONLINE       | SECONDARY   | 8.0.20         |
| group_replication_applier | 9aa1cbcd-be2c-11ec-90a8-525400743c33 | 82.156.200.136 |        3308 | ONLINE       | PRIMARY     | 8.0.20         |
+---------------------------+--------------------------------------+----------------+-------------+--------------+-------------+----------------+
3 rows in set (0.00 sec)

Let's check the data: all three rows are present.

mysql> select * from t_mgr;
+----+---------------------------------+
| id | msg                             |
+----+---------------------------------+
|  1 | before mgr crash                |
|  2 | one node crash                  |
|  3 | one node crash,one was shutdown |
+----+---------------------------------+
3 rows in set (0.00 sec)

If we are unlucky and bootstrap group_replication_bootstrap_group on the wrong node (one with a lower GTID), we face the risk of losing data, and any node with a higher GTID will refuse to join the group.

Let's walk through that scenario concretely, in a test environment where losing data is acceptable.

First, we drop the table from the previous experiment:

mysql> drop table t_mgr;
Query OK, 0 rows affected (0.04 sec)

The initial MGR topology:

mysql> select * from performance_schema.replication_group_members;
+---------------------------+--------------------------------------+----------------+-------------+--------------+-------------+----------------+
| CHANNEL_NAME              | MEMBER_ID                            | MEMBER_HOST    | MEMBER_PORT | MEMBER_STATE | MEMBER_ROLE | MEMBER_VERSION |
+---------------------------+--------------------------------------+----------------+-------------+--------------+-------------+----------------+
| group_replication_applier | 519dc1a6-be2b-11ec-9f85-525400743c33 | 82.156.200.136 |        3307 | ONLINE       | SECONDARY   | 8.0.20         |
| group_replication_applier | 9915aff5-bbd6-11ec-9110-525400743c33 | 82.156.200.136 |        3306 | ONLINE       | SECONDARY   | 8.0.20         |
| group_replication_applier | 9aa1cbcd-be2c-11ec-90a8-525400743c33 | 82.156.200.136 |        3308 | ONLINE       | PRIMARY     | 8.0.20         |
+---------------------------+--------------------------------------+----------------+-------------+--------------+-------------+----------------+
3 rows in set (0.00 sec)

Now we insert a record on the primary, 3308:

mysql> create table t_mgr(id int not null primary key, msg varchar(100));
Query OK, 0 rows affected (0.04 sec)

mysql> insert into t_mgr value ('1','before mgr crash');
Query OK, 1 row affected (0.01 sec)

Then we cleanly shut down node 3306:

mysql> select @@port;
+--------+
| @@port |
+--------+
|   3306 |
+--------+
1 row in set (0.00 sec)

mysql> shutdown;
Query OK, 0 rows affected (0.00 sec)

We insert more data on the primary, 3308:

mysql> insert into t_mgr value ('2','one node shutdown');
Query OK, 1 row affected (0.01 sec)

Then we also shut down secondary 3307:

mysql> select @@port;
+--------+
| @@port |
+--------+
|   3307 |
+--------+
1 row in set (0.00 sec)

mysql> shutdown;
Query OK, 0 rows affected (0.00 sec)

And insert once more on the primary, 3308:

mysql> insert into t_mgr value ('3','two nodes shutdown');
Query OK, 1 row affected (0.01 sec)

Finally, we shut down the primary, 3308:

mysql> shutdown;
Query OK, 0 rows affected (0.01 sec)

At this point every node in the MGR cluster is down.

Now we bring all three nodes back up:

[jason@VM-24-9-centos ~]$ nohup /opt/mysql/8/bin/mysqld_safe --defaults-file=/data/mysql3306/my3306.cnf --user=jason > /dev/null 2>&1 &
[1] 26839
[jason@VM-24-9-centos ~]$ nohup /opt/mysql/8/bin/mysqld_safe --defaults-file=/data/mysql_mgr3307/my3307.cnf --user=jason > /dev/null 2>&1 &
[2] 26840
[jason@VM-24-9-centos ~]$ nohup /opt/mysql/8/bin/mysqld_safe --defaults-file=/data/mysql_mgr3308/my3308.cnf --user=jason > /dev/null 2>&1 &
[3] 29051

We log in to each node and check its GTID set:

mysql> select @@port, @@gtid_executed;
+--------+--------------------------------------------------------------------------------------+
| @@port | @@gtid_executed                                                                      |
+--------+--------------------------------------------------------------------------------------+
|   3306 | 9915aff5-bbd6-11ec-9110-525400743c33:1-13, b965e3bb-be27-11ec-ba80-525400743c33:1-56 |
+--------+--------------------------------------------------------------------------------------+
1 row in set (0.00 sec)

mysql> select @@port, @@gtid_executed;
+--------+--------------------------------------------------------------------------------------+
| @@port | @@gtid_executed                                                                      |
+--------+--------------------------------------------------------------------------------------+
|   3307 | 9915aff5-bbd6-11ec-9110-525400743c33:1-13, b965e3bb-be27-11ec-ba80-525400743c33:1-57 |
+--------+--------------------------------------------------------------------------------------+
1 row in set (0.00 sec)

mysql> select @@port, @@gtid_executed;
+--------+--------------------------------------------------------------------------------------+
| @@port | @@gtid_executed                                                                      |
+--------+--------------------------------------------------------------------------------------+
|   3308 | 9915aff5-bbd6-11ec-9110-525400743c33:1-13, b965e3bb-be27-11ec-ba80-525400743c33:1-58 |
+--------+--------------------------------------------------------------------------------------+
1 row in set (0.00 sec)

The GTID set of each MGR node:

3308 -> b965e3bb-be27-11ec-ba80-525400743c33:1-58
3307 -> b965e3bb-be27-11ec-ba80-525400743c33:1-57
3306 -> b965e3bb-be27-11ec-ba80-525400743c33:1-56

Clearly 3308 was the last primary in the cluster; its GTID set is the largest: 1-58.

This time, by human error, we treat node 3306 as the primary and bootstrap group_replication_bootstrap_group there:

mysql> set global group_replication_bootstrap_group = ON;
Query OK, 0 rows affected (0.00 sec)

mysql> start group_replication;
Query OK, 0 rows affected (3.14 sec)

mysql> set global group_replication_bootstrap_group = OFF;
Query OK, 0 rows affected (0.00 sec)

The log shows that node 3306 was elected primary:

2022-04-20T14:04:27.161451+08:00 0 [Note] [MY-013519] [Repl] Plugin group_replication reported: 'Elected primary member gtid_executed: 9915aff5-bbd6-11ec-9110-525400743c33:1-13,
b965e3bb-be27-11ec-ba80-525400743c33:1-57'
2022-04-20T14:04:27.161470+08:00 0 [Note] [MY-013519] [Repl] Plugin group_replication reported: 'Elected primary member applier channel received_transaction_set: 9915aff5-bbd6-11ec-9110-525400743c33:1-12,
b965e3bb-be27-11ec-ba80-525400743c33:1-28:32:36-39:46-49:53-57'
2022-04-20T14:04:28.161566+08:00 0 [Note] [MY-011507] [Repl] Plugin group_replication reported: 'A new primary with address 82.156.200.136:3306 was elected. The new primary will execute all previous group transactions before allowing writes.'
2022-04-20T14:04:28.162211+08:00 24 [Note] [MY-011485] [Repl] Plugin group_replication reported: 'Primary had applied all relay logs, disabled conflict detection.'

Now we try to start group replication on 3308, the node with the highest GTID:

mysql> start group_replication;
ERROR 3092 (HY000): The server is not configured properly to be an active member of the group. Please see more details on error log.

The startup fails. The error log gives the specifics:

2022-04-20T14:05:29.120482+08:00 0 [ERROR] [MY-011526] [Repl] Plugin group_replication reported: 'This member has more executed transactions than those present in the group. Local transactions: 9915aff5-bbd6-11ec-9110-525400743c33:1-13,
b965e3bb-be27-11ec-ba80-525400743c33:1-58 > Group transactions: 9915aff5-bbd6-11ec-9110-525400743c33:1-13,
b965e3bb-be27-11ec-ba80-525400743c33:1-57'
2022-04-20T14:05:29.120514+08:00 0 [ERROR] [MY-011522] [Repl] Plugin group_replication reported: 'The member contains transactions not present in the group. The member will now exit the group.'

The error message is clear: the joining node's GTID set (1-58) is larger than the group's (1-57).
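To see exactly which transactions the rejected member holds that the group does not, the built-in GTID_SUBTRACT() function can compute the difference. A minimal sketch using the two sets from the error message above:

-- Returns the GTIDs present in the first set but missing from the second:
SELECT GTID_SUBTRACT(
         '9915aff5-bbd6-11ec-9110-525400743c33:1-13,b965e3bb-be27-11ec-ba80-525400743c33:1-58',
         '9915aff5-bbd6-11ec-9110-525400743c33:1-13,b965e3bb-be27-11ec-ba80-525400743c33:1-57'
       ) AS missing_from_group;
-- Expected result: b965e3bb-be27-11ec-ba80-525400743c33:58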

If the extra data can be discarded, we can manually set GTID_NEXT on the primary (3306) and commit an empty transaction for GTID 58, so that its executed set also covers 1-58:

mysql> set GTID_NEXT='b965e3bb-be27-11ec-ba80-525400743c33:58';
Query OK, 0 rows affected (0.00 sec)

mysql> begin;
Query OK, 0 rows affected (0.00 sec)

mysql> commit;
Query OK, 0 rows affected (0.00 sec)

mysql> set GTID_NEXT='AUTOMATIC';
Query OK, 0 rows affected (0.00 sec)

mysql> select @@port, @@gtid_executed;
+--------+--------------------------------------------------------------------------------------+
| @@port | @@gtid_executed                                                                      |
+--------+--------------------------------------------------------------------------------------+
|   3306 | 9915aff5-bbd6-11ec-9110-525400743c33:1-13, b965e3bb-be27-11ec-ba80-525400743c33:1-58 |
+--------+--------------------------------------------------------------------------------------+
1 row in set (0.00 sec)

We start group replication on node 3308 once more:

mysql> start group_replication;
Query OK, 0 rows affected (4.39 sec)

mysql> select * from performance_schema.replication_group_members;
+---------------------------+--------------------------------------+----------------+-------------+--------------+-------------+----------------+
| CHANNEL_NAME              | MEMBER_ID                            | MEMBER_HOST    | MEMBER_PORT | MEMBER_STATE | MEMBER_ROLE | MEMBER_VERSION |
+---------------------------+--------------------------------------+----------------+-------------+--------------+-------------+----------------+
| group_replication_applier | 9915aff5-bbd6-11ec-9110-525400743c33 | 82.156.200.136 |        3306 | ONLINE       | PRIMARY     | 8.0.20         |
| group_replication_applier | 9aa1cbcd-be2c-11ec-90a8-525400743c33 | 82.156.200.136 |        3308 | ONLINE       | SECONDARY   | 8.0.20         |
+---------------------------+--------------------------------------+----------------+-------------+--------------+-------------+----------------+
2 rows in set (0.01 sec)

This time it starts successfully and the node rejoins the MGR cluster.

We bring up the remaining node as well:

mysql> start group_replication;
Query OK, 0 rows affected (4.54 sec)

mysql> select * from performance_schema.replication_group_members;
+---------------------------+--------------------------------------+----------------+-------------+--------------+-------------+----------------+
| CHANNEL_NAME              | MEMBER_ID                            | MEMBER_HOST    | MEMBER_PORT | MEMBER_STATE | MEMBER_ROLE | MEMBER_VERSION |
+---------------------------+--------------------------------------+----------------+-------------+--------------+-------------+----------------+
| group_replication_applier | 519dc1a6-be2b-11ec-9f85-525400743c33 | 82.156.200.136 |        3307 | ONLINE       | SECONDARY   | 8.0.20         |
| group_replication_applier | 9915aff5-bbd6-11ec-9110-525400743c33 | 82.156.200.136 |        3306 | ONLINE       | PRIMARY     | 8.0.20         |
| group_replication_applier | 9aa1cbcd-be2c-11ec-90a8-525400743c33 | 82.156.200.136 |        3308 | ONLINE       | SECONDARY   | 8.0.20         |
+---------------------------+--------------------------------------+----------------+-------------+--------------+-------------+----------------+
3 rows in set (0.00 sec)

At this point the cluster only looks healthy: the data has already diverged between nodes, which is unacceptable from a data-consistency standpoint.

mysql> select @@port;
+--------+
| @@port |
+--------+
|   3306 |
+--------+
1 row in set (0.00 sec)

mysql> select * from mgrdb.t_mgr;
+----+------------------+
| id | msg              |
+----+------------------+
|  1 | before mgr crash |
+----+------------------+
1 row in set (0.00 sec)

mysql> select @@port;
+--------+
| @@port |
+--------+
|   3308 |
+--------+
1 row in set (0.00 sec)

mysql> select * from mgrdb.t_mgr;
+----+--------------------+
| id | msg                |
+----+--------------------+
|  1 | before mgr crash   |
|  2 | one node shutdown  |
|  3 | two nodes shutdown |
+----+--------------------+
3 rows in set (0.00 sec)

So manually setting GTID_NEXT and committing an empty transaction to skip the missing GTID has created data inconsistency, which can have unpredictable effects on subsequent business testing.
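If you suspect this kind of silent divergence, a quick way to confirm it is to compare row counts and table checksums node by node. A minimal sketch (for large production tables a dedicated tool such as pt-table-checksum is more practical):

-- Run on every node and compare the output; differing values mean the data has diverged:
CHECKSUM TABLE mgrdb.t_mgr;
SELECT COUNT(*) FROM mgrdb.t_mgr;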

The conclusion: after an MGR cluster goes down completely, whether through planned manual maintenance or an unexpected crash, the correct approach is to identify the node with the largest GTID set and bootstrap group_replication_bootstrap_group there, making it the primary. That is what preserves data consistency across all the nodes.
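To recap, the safe restart sequence boils down to a few commands. This sketch simply consolidates the steps demonstrated above:

-- 1. On every node, record the executed GTID set:
SELECT @@port, @@gtid_executed;

-- 2. Only on the node whose set contains all the others (verify with GTID_SUBSET):
SET GLOBAL group_replication_bootstrap_group = ON;
START GROUP_REPLICATION;
SET GLOBAL group_replication_bootstrap_group = OFF;

-- 3. On each remaining node:
START GROUP_REPLICATION;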

This is also confirmed by the official documentation: https://dev.mysql.com/doc/refman/8.0/en/group-replication-restarting-group.html
It is worth a read; the official manual is the most rigorous reference.
