需求说明:公司RDS产品高可用“组件”需搭建MGR集群。(以确保节点的数据同步和一致性来保证数据的可靠性),以实现动态切换数据源。
程序大致流程:
应用程序通过数据源列表获取主节点,判断是否异常。异常包括:主节点网络异常,服务器异常,主节点手动下线或者切换。应用程序收到连接异常后,将当前主节点“剔出”集群(数据源),再从(另外两个)数据源列表获取新的主节点。
数据库环境
192.168.220.122:3306
192.168.220.128:3306
192.168.220.129:3306
-- 开发环境问题重现。使用与开发环境一致的配置和数据库版本(5.7.18)



--节点仍然可读写。问题就很严重了......程序无法走异常后的处理流程,即无法“动态”切换主节点
画外音:是不是数据库的bug ?
环境准备:
1.搭建MGR集群 ,数据库版本选用较新的 5.7.26 (利用自动化安装脚本)。如下步骤为安装MGR的过程。(有兴趣的可提供相应脚本)
[root@dba-virtual-host-220122 hotdb_cloud]# sh install.sh mysql_mgr
Repodata is over 2 weeks old. Install yum-cron? Or run: yum makecache fast
MySQL instance 3307 already exists. Please choice an unused port to use !!!
Input you will use MySQL-Instance Port:3306
Input you will use MySQL-Instance Type {p:means primary-node,s:means slave-node}:p
Input you will use MySQL-Version{5.7 or 8.0}:5.7
Input three service IP order by {local_ip cluster1_ip cluster2_ip}:192.168.220.122 192.168.220.128 192.168.220.129
2019-10-14T03:10:33.636957Z mysqld_safe Logging to '/data/mysqldata/mysql3306/log/error.log'.
2019-10-14T03:10:33.670459Z mysqld_safe Starting mysqld daemon with databases from data/mysqldata/mysql3306/mydata
Note (Code 1759): Sending passwords in plain text without SSL/TLS is extremely insecure.
Note (Code 1760): Storing MySQL user name or password information in the master info repository is not secure and is therefore not recommended. Please consider using the USER and PASSWORD connection options for START SLAVE; see the 'START SLAVE Syntax' in the MySQL Manual for more information.
Primary node group_replication install successfully
centos 7 ntpd service is running
[root@dba-virtual-host-220122 hotdb_cloud]# ll
total 2484012
-rw-r--r-- 1 root root 8608 Oct 14 10:58 install_mysql_mgr.sh
-rw-r--r-- 1 root root 5955 Oct 14 10:59 install_mysql_mm_1.sh
-rw-r--r-- 1 root root 2100 Sep 26 17:07 install_mysql_mm_2.sh
-rw-r--r-- 1 root root 9472 Oct 14 10:56 install_mysql_single.sh
-rw-r--r-- 1 root root 1220 Oct 12 14:08 install.sh
-rw-r--r-- 1 root root 312829487 Oct 12 14:07 mysql-5.6.28-linux-glibc2.5-x86_64.tar.gz
-rw-r--r-- 1 root root 644869837 Oct 14 11:09 mysql-5.7.26-linux-glibc2.12-x86_64.tar.gz
-rw-rw-rw- 1 root root 460733332 Oct 14 10:25 mysql-8.0.16-linux-glibc2.12-x86_64.tar.xz
-rw-r--r-- 1 root root 6531 Oct 12 16:50 mysql_install.package.tar.gz
-rw-r--r-- 1 root root 4423 Oct 8 14:50 my_template.cnf
-rw-r--r-- 1 root root 1009 Oct 12 15:55 ntpdate_service.sh
[root@dba-virtual-host-220122 hotdb_cloud]# ps -ef|grep mysqld
root 344 1 0 11:10 pts/2 00:00:00 bin/sh ./bin/mysqld_safe --defaults-file=/data/mysqldata/mysql3306/my.cnf.3306 --user=mysql
mysql 1766 344 1 11:10 pts/2 00:00:00 /usr/local/mysql/bin/mysqld --defaults-file=/data/mysqldata/mysql3306/my.cnf.3306 --basedir=/usr/local/mysql --datadir=/data/mysqldata/mysql3306/mydata --plugin-dir=/usr/local/mysql/lib/plugin --user=mysql --log-error=/data/mysqldata/mysql3306/log/error.log --open-files-limit=65535 --pid-file=/data/mysqldata/mysql3306/sock/mysql.pid --socket=/data/mysqldata/mysql3306/sock/mysql.sock --port=3306
[root@dba-virtual-host-220122 hotdb_cloud]# mysql -uroot -p -S /data/mysqldata/mysql3306/sock/mysql.sock
Enter password:
Welcome to the MySQL monitor. Commands end with ; or \g.
Your MySQL connection id is 18
Server version: 5.7.26-log MySQL Community Server (GPL)
Copyright (c) 2000, 2019, Oracle and/or its affiliates. All rights reserved.
Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.
Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
root@localhost:(none) 5.7.26-log 11:11:20> show processlist;
+----+-----------------+-----------+------+---------+------+--------------------------------------------------------+------------------+
| Id | User | Host | db | Command | Time | State | Info |
+----+-----------------+-----------+------+---------+------+--------------------------------------------------------+------------------+
| 1 | event_scheduler | localhost | NULL | Daemon | 51 | Waiting on empty queue | NULL |
| 9 | system user | | NULL | Connect | 48 | executing | NULL |
| 12 | system user | | NULL | Connect | 48 | Slave has read all relay log; waiting for more updates | NULL |
| 18 | root | localhost | NULL | Query | 0 | starting | show processlist |
+----+-----------------+-----------+------+---------+------+--------------------------------------------------------+------------------+
4 rows in set (0.01 sec)
-- 节点依次加入集群
root@localhost:(none) 5.7.26-log 11:11:25> select * from performance_schema.replication_group_members;
+---------------------------+--------------------------------------+-----------------+-------------+--------------+
| CHANNEL_NAME | MEMBER_ID | MEMBER_HOST | MEMBER_PORT | MEMBER_STATE |
+---------------------------+--------------------------------------+-----------------+-------------+--------------+
| group_replication_applier | 27cc7845-ee30-11e9-90e7-525400564863 | 192.168.220.122 | 3306 | ONLINE |
+---------------------------+--------------------------------------+-----------------+-------------+--------------+
1 row in set (0.00 sec)
root@localhost:(none) 5.7.26-log 11:11:43> select * from performance_schema.replication_group_members;
+---------------------------+--------------------------------------+-----------------+-------------+--------------+
| CHANNEL_NAME | MEMBER_ID | MEMBER_HOST | MEMBER_PORT | MEMBER_STATE |
+---------------------------+--------------------------------------+-----------------+-------------+--------------+
| group_replication_applier | 27cc7845-ee30-11e9-90e7-525400564863 | 192.168.220.122 | 3306 | ONLINE |
| group_replication_applier | a5f5299a-ee30-11e9-ae3c-525400bdf400 | 192.168.220.128 | 3306 | ONLINE |
+---------------------------+--------------------------------------+-----------------+-------------+--------------+
2 rows in set (0.00 sec)
root@localhost:(none) 5.7.26-log 11:15:20> select * from performance_schema.replication_group_members;
+---------------------------+--------------------------------------+-----------------+-------------+--------------+
| CHANNEL_NAME | MEMBER_ID | MEMBER_HOST | MEMBER_PORT | MEMBER_STATE |
+---------------------------+--------------------------------------+-----------------+-------------+--------------+
| group_replication_applier | 0632310e-ee31-11e9-a790-52540005ae6e | 192.168.220.129 | 3306 | ONLINE |
| group_replication_applier | 27cc7845-ee30-11e9-90e7-525400564863 | 192.168.220.122 | 3306 | ONLINE |
| group_replication_applier | a5f5299a-ee30-11e9-ae3c-525400bdf400 | 192.168.220.128 | 3306 | ONLINE |
+---------------------------+--------------------------------------+-----------------+-------------+--------------+
3 rows in set (0.01 sec)
-- 查看MySQL版本
root@localhost:(none) 5.7.26-log 11:17:02> select version();
+------------+
| version() |
+------------+
| 5.7.26-log |
+------------+
1 row in set (0.00 sec)
-- 查看主节点
root@localhost:(none) 5.7.26-log 11:20:08> SELECT
-> MASTER.MEMBER_HOST
-> FROM
-> (
-> SELECT
-> MEMBER_ID,
-> CHANNEL_NAME,
-> MEMBER_HOST,
-> MEMBER_PORT,
-> MEMBER_STATE,
-> IF (global_status.VARIABLE_NAME IS NOT NULL,'PRIMARY','SECONDARY') AS MEMBER_ROLE
-> FROM
-> PERFORMANCE_SCHEMA.replication_group_members
-> LEFT JOIN PERFORMANCE_SCHEMA.global_status ON global_status.VARIABLE_NAME = 'group_replication_primary_member'
-> AND global_status.VARIABLE_VALUE = replication_group_members.MEMBER_ID
-> ) MASTER
-> WHERE
-> MASTER.MEMBER_ROLE = 'PRIMARY' AND MASTER.MEMBER_STATE = 'ONLINE' and @@super_read_only=0;
+-----------------+
| MEMBER_HOST |
+-----------------+
| 192.168.220.122 |
+-----------------+
1 row in set (0.00 sec)
root@localhost:(none) 5.7.26-log 11:21:13> create database jiale;
Query OK, 1 row affected (0.00 sec)
root@localhost:(none) 5.7.26-log 11:21:19> use jiale
Database changed
root@localhost:jiale 5.7.26-log 11:21:21> create table a(id int ,name varchar(20));
Query OK, 0 rows affected (0.01 sec)
root@localhost:jiale 5.7.26-log 11:21:33> insert into a values ('1','test');
ERROR 3098 (HY000): The table does not comply with the requirements by an external plugin.
-- MGR架构要求建表必须显示主键
root@localhost:jiale 5.7.26-log 11:21:46> alter table a modify column id int not null auto_increment primary key;
Query OK, 0 rows affected (0.02 sec)
Records: 0 Duplicates: 0 Warnings: 0
root@localhost:jiale 5.7.26-log 11:22:19> insert into a values ('1','test');
Query OK, 1 row affected (0.01 sec)
root@localhost:jiale 5.7.26-log 11:22:21> insert into a values ('2','testrrr');
Query OK, 1 row affected (0.00 sec)
root@localhost:jiale 5.7.26-log 11:22:28> insert into a values ('3','sstrrr');
Query OK, 1 row affected (0.00 sec)
-- 确认主节点
root@localhost:jiale 5.7.26-log 11:22:34> SELECT
-> MASTER.MEMBER_HOST
-> FROM
-> (
-> SELECT
-> MEMBER_ID,
-> CHANNEL_NAME,
-> MEMBER_HOST,
-> MEMBER_PORT,
-> MEMBER_STATE,
-> IF (global_status.VARIABLE_NAME IS NOT NULL,'PRIMARY','SECONDARY') AS MEMBER_ROLE
-> FROM
-> PERFORMANCE_SCHEMA.replication_group_members
-> LEFT JOIN PERFORMANCE_SCHEMA.global_status ON global_status.VARIABLE_NAME = 'group_replication_primary_member'
-> AND global_status.VARIABLE_VALUE = replication_group_members.MEMBER_ID
-> ) MASTER
-> WHERE
-> MASTER.MEMBER_ROLE = 'PRIMARY' AND MASTER.MEMBER_STATE = 'ONLINE' and @@super_read_only=0;
+-----------------+
| MEMBER_HOST |
+-----------------+
| 192.168.220.122 |
+-----------------+
1 row in set (0.00 sec)
-- 手动执行 stop group_replication,模拟主节点脱离集群的状态。
root@localhost:jiale 5.7.26-log 11:22:46> stop group_replication;
Query OK, 0 rows affected (9.31 sec)
-- 验证节点读写情况
root@localhost:jiale 5.7.26-log 11:23:08> select * from a;
+----+---------+
| id | name |
+----+---------+
| 1 | test |
| 2 | testrrr |
| 3 | sstrrr |
+----+---------+
3 rows in set (0.00 sec)
-- 由此可确认节点可读
root@localhost:jiale 5.7.26-log 11:23:47> insert into a values ('4','sstrrr');
ERROR 1290 (HY000): The MySQL server is running with the --super-read-only option so it cannot execute this statement
-- 由此可确认节点不可写(应用程序可实现动态切换数据源)
root@localhost:jiale 5.7.26-log 11:28:26> select version();
+------------+
| version() |
+------------+
| 5.7.26-log |
+------------+
1 row in set (0.00 sec)
为了确认是否是版本导致的。于是去官网查看5.7 Release note。发现5.7.20之前的版本存在此bug。
官网的解释大致是:
在以前的版本中,发行STOP GROUP_REPLICATION停止了插件,但服务器仍接受事务。这意味着交易未传输到该组。
为了STOP GROUP_REPLICATION更安全,现在 super_read_only设置为 ON发行后立即生效 STOP GROUP_REPLICATION,以确保不接受任何交易
总结:MySQL-version<5.7.20 MGR single-primary模式下,主节点停止集群后依然可读写。给应用程序动态数据源切换造成依然可靠的“幻觉”。
本文分享自微信公众号 - topdba,如有侵权,请联系 service001@enmotech.com 删除。




