暂无图片
暂无图片
2
暂无图片
暂无图片
暂无图片

MYSQL MGR 之 Proxy SQL 高可用自动故障转移

原创 大表哥 2022-06-08
3152

image.png
大家好, 这次大表哥分享的是 mysql MGR 高可用故障自动识别方案 Proxy SQL.

首先,我们要清楚 原生的 mysql MGR 是 不提供故障自动识别方案的, 市面上TAF的解决方案大致分为如下:

1.官方的mysql router 的组件: 可以参考大表哥之前的文章 https://www.modb.pro/db/399249
2.开源的路由组件, 像是 Proxy SQL
3.开发自研的自动探测的脚本,可以集成在应用连接池的代码里面,实现VIP的漂移或者是DNS的域名解析切换。

今天我们要分享的是 用开源的组件 Proxy SQL 实现 MGR 的自动故障转移。

个人认为 proxy SQL 是一款十分轻量级的,可以实现快速部署的轻量级的路由组件。 我们先看一下 proxy SQL 的官方文档和下载地址:
https://proxysql.com/

image.png

Proxy SQL 下载
根据自己的OS的版本选择相应的RPM包下载:
https://github.com/sysown/proxysql/releases
Image.png

安装:

INFRA [mysql@wqdcsrv3353 ~]# sudo rpm -ivh proxysql2-2.0.14-1.1.el7.x86_64.rpm warning:proxysql2-2.0.14-1.1.el7.x86_64.rpm: Header V4 DSA/SHA1 Signature, key ID cd2efd2a: NOKEY Preparing... ################################# [100%] Updating / installing... 1:proxysql2-2.0.14-1.1.el7.x86_64 ################################# [100%] uat mysql@wqdcsrv3352[09:53:17]:/data/software $ which proxysql /bin/proxysql

我们接下来要搭建一个 proxy 的集群, 这个集群是为 mysql MGR 提供了路由的服务。 示例图如下:

Image.png

对应的测试机器如下:

image.png

这里注意的是 proxy 的集群是去中心化的设计,没有主从角色状态的概念。

我们先预先准备好一个MGR的mysql 集群:

mysql> select * from replication_group_members; +---------------------------+--------------------------------------+--------------+-------------+--------------+-------------+----------------+----------------------------+ | CHANNEL_NAME | MEMBER_ID | MEMBER_HOST | MEMBER_PORT | MEMBER_STATE | MEMBER_ROLE | MEMBER_VERSION | MEMBER_COMMUNICATION_STACK | +---------------------------+--------------------------------------+--------------+-------------+--------------+-------------+----------------+----------------------------+ | group_replication_applier | 16d296b7-83d1-11ec-8d13-005056aefbb6 | 10.67.39.149 | 3080 | ONLINE | SECONDARY | 8.0.27 | XCom | | group_replication_applier | 30af63d9-7db1-11ec-9d58-f403439dfd00 | 10.67.38.50 | 3080 | ONLINE | PRIMARY | 8.0.27 | XCom | | group_replication_applier | 89866a9b-7db1-11ec-9d58-f403439dfd00 | 10.67.39.49 | 3080 | ONLINE | SECONDARY | 8.0.27 | XCom | +---------------------------+--------------------------------------+--------------+-------------+--------------+-------------+----------------+----------------------------+ 3 rows in set (0.01 sec)

我们需要在MGR中建立 PROXY 的心跳探测账户:

create user proxy_monitor@'10.%' identified by '***********' ; GRANT REPLICATION CLIENT ON *.* TO `proxy_monitor`@`10.%`; GRANT SELECT ON `sys`.* TO `proxy_monitor`@`10.%`

另外PROXY SQL 对 mysql mgr 的支持需要创建一些试图和函数: 这个脚本可以从官方网上拿到:

https://proxysql.com/documentation/main-runtime/#mysql_group_replication_hostgroups

Image.png

USE sys; DELIMITER $$ CREATE FUNCTION IFZERO(a INT, b INT) RETURNS INT DETERMINISTIC RETURN IF(a = 0, b, a)$$ CREATE FUNCTION LOCATE2(needle TEXT(10000), haystack TEXT(10000), offset INT) RETURNS INT DETERMINISTIC RETURN IFZERO(LOCATE(needle, haystack, offset), LENGTH(haystack) + 1)$$ CREATE FUNCTION GTID_NORMALIZE(g TEXT(10000)) RETURNS TEXT(10000) DETERMINISTIC RETURN GTID_SUBTRACT(g, '')$$ CREATE FUNCTION GTID_COUNT(gtid_set TEXT(10000)) RETURNS INT DETERMINISTIC BEGIN DECLARE result BIGINT DEFAULT 0; DECLARE colon_pos INT; DECLARE next_dash_pos INT; DECLARE next_colon_pos INT; DECLARE next_comma_pos INT; SET gtid_set = GTID_NORMALIZE(gtid_set); SET colon_pos = LOCATE2(':', gtid_set, 1); WHILE colon_pos != LENGTH(gtid_set) + 1 DO SET next_dash_pos = LOCATE2('-', gtid_set, colon_pos + 1); SET next_colon_pos = LOCATE2(':', gtid_set, colon_pos + 1); SET next_comma_pos = LOCATE2(',', gtid_set, colon_pos + 1); IF next_dash_pos < next_colon_pos AND next_dash_pos < next_comma_pos THEN SET result = result + SUBSTR(gtid_set, next_dash_pos + 1, LEAST(next_colon_pos, next_comma_pos) - (next_dash_pos + 1)) - SUBSTR(gtid_set, colon_pos + 1, next_dash_pos - (colon_pos + 1)) + 1; ELSE SET result = result + 1; END IF; SET colon_pos = next_colon_pos; END WHILE; RETURN result; END$$ CREATE FUNCTION gr_applier_queue_length() RETURNS INT DETERMINISTIC BEGIN RETURN (SELECT sys.gtid_count( GTID_SUBTRACT( (SELECT Received_transaction_set FROM performance_schema.replication_connection_status WHERE Channel_name = 'group_replication_applier' ), (SELECT @@global.GTID_EXECUTED) ))); END$$ CREATE FUNCTION gr_member_in_primary_partition() RETURNS VARCHAR(3) DETERMINISTIC BEGIN RETURN (SELECT IF( MEMBER_STATE='ONLINE' AND ((SELECT COUNT(*) FROM performance_schema.replication_group_members WHERE MEMBER_STATE != 'ONLINE') >= ((SELECT COUNT(*) FROM performance_schema.replication_group_members)/2) = 0), 'YES', 'NO' ) FROM performance_schema.replication_group_members JOIN performance_schema.replication_group_member_stats USING(member_id)); END$$ CREATE VIEW gr_member_routing_candidate_status AS SELECT sys.gr_member_in_primary_partition() as viable_candidate, IF( (SELECT (SELECT GROUP_CONCAT(variable_value) FROM performance_schema.global_variables WHERE variable_name IN ('read_only', 'super_read_only')) != 'OFF,OFF'), 'YES', 'NO') as read_only, sys.gr_applier_queue_length() as transactions_behind, Count_Transactions_in_queue as 'transactions_to_cert' from performance_schema.replication_group_member_stats;$$ DELIMITER ;

执行完脚本之后,我们可以验证一下创建的对象:

mysql> use sys; Reading table information for completion of table and column names You can turn off this feature to get a quicker startup with -A Database changed mysql> select gr_member_in_primary_partition(); +----------------------------------+ | gr_member_in_primary_partition() | +----------------------------------+ | YES | +----------------------------------+ 1 row in set (0.00 sec) mysql> select * from gr_member_routing_candidate_status; +------------------+-----------+---------------------+----------------------+ | viable_candidate | read_only | transactions_behind | transactions_to_cert | +------------------+-----------+---------------------+----------------------+ | YES | NO | 0 | 0 | +------------------+-----------+---------------------+----------------------+ 1 row in set (0.01 sec)

下面我们搭建 proxy SQL 的 集群:

1)配置文件参数
完整的参数文件配置: 每个 proxy SQL 的节点的配置文件是一致的, 对于proxy sql 的集群来说,没有主从的角色之分

datadir="/data/mysql3080/proxySQL" admin_variables= { admin_credentials="admin:admin;proxy_cluster:*****" mysql_ifaces="0.0.0.0:3081" cluster_username="proxy_cluster" cluster_password="******" cluster_check_interval_ms=200 cluster_check_status_frequency=100 cluster_mysql_query_rules_save_to_disk=true cluster_mysql_servers_save_to_disk=true cluster_mysql_users_save_to_disk=true cluster_proxysql_servers_save_to_disk=true cluster_mysql_query_rules_diffs_before_sync=10 cluster_mysql_servers_diffs_before_sync=10 cluster_mysql_users_diffs_before_sync=10 cluster_proxysql_servers_diffs_before_sync=10 } proxysql_servers = ({hostname="10.67.38.50",port=3081,weight=0},{hostname="10.67.39.49",port=3081,weight=0}) mysql_variables= { threads=8 max_connections=5000 default_query_delay=0 have_compress=true poll_timeout=2000 interfaces="0.0.0.0:3082" default_schema="information_schema" stacksize=1048576 server_version="8.0.20" default_query_timeout=7200000 connect_timeout_server=3000 monitor_username="proxy_monitor" monitor_password="admin123" monitor_history=600000 monitor_connect_interval=2000 monitor_ping_interval=2000 monitor_read_only_interval=2000 monitor_read_only_timeout=500 ping_interval_server_msec=120000 ping_timeout_server=500 commands_stats=true sessions_sort=true connect_retries_on_failure=10 default_charset="utf8mb4" threshold_query_length=524288 threshold_resultset_size=524288 max_allowed_packet=67108864 max_transaction_time=14400000 monitor_replication_lag_interval=10000 eventslog_filename="/data/mysql3080/log/proxysql.log" eventslog_format=2 auditlog_filename="/data/mysql3080/log/audit.log" } mysql_servers = ({address="10.67.38.50",port=3080,hostgroup=20, max_connections=5000,max_replication_lag=2000},{address="10.67.39.49",port=3080,hostgroup=40, max_connections=5000,max_replication_lag=2000},{address="10.67.39.149",port=3080,hostgroup=40, max_connections=5000,max_replication_lag=2000}) mysql_users: ( {username = "db_monitor",password = "admin123",default_hostgroup = 20,max_connections=50,default_schema="information_schema",active = 1}) mysql_query_rules: ({rule_id=1,active=1,match_digest=".",apply=0,log=1} ) scheduler= ( ) mysql_replication_hostgroups= ( ) mysql_group_replication_hostgroups: ( {writer_hostgroup=20, backup_writer_hostgroup=30, reader_hostgroup=40, offline_hostgroup=10,active=1,max_writers=1,writer_is_also_reader=0,max_transactions_behind=4000} )

配置文件中重要的参数含义如下:

a)proxy SQL 集群的配置

admin_credentials="admin:admin;proxy_cluster:*****" ### 1)amdin/admin 是proxy SQL 后台管理员的账户和密码, 2)proxy SQL 集群是通过 proxy_cluster这个账户来实现配置文件的定期同步的 mysql_ifaces="0.0.0.0:3081" --proxy SQL 管理员后台登录的端口 cluster_username="proxy_cluster" --proxy SQL 集群的账户 cluster_password="*****" --proxy SQL 集群的密码 --下面是proxy SQL 集群同步检查相关的一些参数 cluster_check_interval_ms=200 cluster_check_status_frequency=100 cluster_mysql_query_rules_save_to_disk=true cluster_mysql_servers_save_to_disk=true cluster_mysql_users_save_to_disk=true cluster_proxysql_servers_save_to_disk=true cluster_mysql_query_rules_diffs_before_sync=10 cluster_mysql_servers_diffs_before_sync=10 cluster_mysql_users_diffs_before_sync=10 cluster_proxysql_servers_diffs_before_sync=10 --proxy 集群成员的配置 proxysql_servers = ({hostname="10.67.38.50",port=3081,weight=0},{hostname="10.67.39.49",port=3081,weight=0})

b)Mysql server 的配置

mysql_variables= { max_connections=5000 --最大连接数 interfaces="0.0.0.0:3082" -- 对应用系统暴露的端口 monitor_username="proxy_monitor" -- 连接mysql mgr 集群的探测心跳账号 monitor_password="********" ... ... } -- MGR 集群的信息,其中 hostgroup = 20 是主库, hostgroup = 40 表示从库 mysql_servers = ({address="10.67.38.50",port=3080,hostgroup=20, max_connections=5000,max_replication_lag=2000}, {address="10.67.39.49",port=3080,hostgroup=40, max_connections=5000,max_replication_lag=2000}, {address="10.67.39.149",port=3080,hostgroup=40, max_connections=5000,max_replication_lag=2000} ) -- 定义 MGR 的 hostgroup: 20:写组, 30:备份写组, 40:读组,10:离线组 mysql_group_replication_hostgroups: ( {writer_hostgroup=20, backup_writer_hostgroup=30, reader_hostgroup=40, offline_hostgroup=10,active=1,max_writers=1,writer_is_also_reader=0,max_transactions_behind=4000} )

c)其他的一些基本的配置

datadir="/data/mysql3080/proxySQL" -- proxy SQL 的数据目录,首次启动的时候,会在这个目录下生成相应的数据文件,日志文件
uat mysql@wqdcsrv3352[13:52:26]:/data/mysql3080/proxySQL $ ls -lhtr total 392K -rw------- 1 mysql mysql 1.7K Jun 6 13:39 proxysql-key.pem -rw------- 1 mysql mysql 1.1K Jun 6 13:39 proxysql-ca.pem -rw------- 1 mysql mysql 1.1K Jun 6 13:39 proxysql-cert.pem -rw-r--r-- 1 mysql mysql 7 Jun 6 13:39 proxysql.pid -rw------- 1 mysql mysql 192K Jun 6 13:39 proxysql.db -rw------- 1 mysql mysql 19K Jun 6 13:49 proxysql.log -rwxr-xr-x 1 mysql mysql 2.6K Jun 6 13:52 proxy_mysql3080.cnf -rw------- 1 mysql mysql 160K Jun 6 14:21 proxysql_stats.db

2)创建proxy SQL 的路径并启动 (2个HA的节点都需要执行同样的命令)

mkdir -p /data/mysql3080/proxySQL

启动的时候需要制定一下: proxy SQL的data 目录以及配置文件

/usr/bin/proxysql -c /data/mysql3080/proxySQL/proxy_mysql3080.cnf -D /data/mysql3080/proxySQL

3)登录并验证proxy SQL 的集群 默认的管理员账户是 admin/admin, 管理员的端口是 3081

uat mysql@wqdcsrv3353[14:40:47]:/data/mysql3080/proxySQL $ /opt/mysql/product/percona8.0/bin/mysql -h 127.0.0.1 -P 3081 -u admin -padmin

查看2个 proxy SQL 集群的信息:

mysql> select * from proxysql_servers; +-------------+------+--------+---------+ | hostname | port | weight | comment | +-------------+------+--------+---------+ | 10.67.38.50 | 3081 | 0 | | | 10.67.39.49 | 3081 | 0 | | +-------------+------+--------+---------+ 2 rows in set (0.00 sec)

查看 mysql MGR 的server 信息 :

mysql> select * from mysql_servers; +--------------+--------------+------+-----------+--------+--------+-------------+-----------------+---------------------+---------+----------------+---------+ | hostgroup_id | hostname | port | gtid_port | status | weight | compression | max_connections | max_replication_lag | use_ssl | max_latency_ms | comment | +--------------+--------------+------+-----------+--------+--------+-------------+-----------------+---------------------+---------+----------------+---------+ | 20 | 10.67.38.50 | 3080 | 0 | ONLINE | 1 | 0 | 5000 | 2000 | 0 | 0 | | | 40 | 10.67.39.49 | 3080 | 0 | ONLINE | 1 | 0 | 5000 | 2000 | 0 | 0 | | | 40 | 10.67.39.149 | 3080 | 0 | ONLINE | 1 | 0 | 5000 | 2000 | 0 | 0 | | +--------------+--------------+------+-----------+--------+--------+-------------+-----------------+---------------------+---------+----------------+---------+ 3 rows in set (0.00 sec)

如果proxy SQL 启动失败, 请查看集群的日志信息:

uat mysql@wqdcsrv3352[14:47:42]:/data/mysql3080/proxySQL $ view proxysql.log

如果有proxy SQL 节点挂掉或者网络问题的话,会报错:

2022-06-06 14:50:58 ProxySQL_Cluster.cpp:217:ProxySQL_Cluster_Monitor_thread(): [WARNING] Cluster: unable to connect to peer 10.67.39.49:3081 . Error: Can't connect to MySQL server on '10.67.39.49' (115)
2022-06-06 14:50:58 ProxySQL_Cluster.cpp:217:ProxySQL_Cluster_Monitor_thread(): [WARNING] Cluster: unable to connect to peer 10.67.39.49:3081 . Error: Can't connect to MySQL server on '10.67.39.49' (115)

接下来,我们用 java (spring boot) 写一个简单的小代码片段,模拟一直在读写数据库:

在数据库的读写过程中,我们模拟2个的故障自动转移的场景:

1)MGR 主节点down掉
2)Proxy SQL 集群挂掉一个节点

我们来准备一下,我们的小程序片段:

首先我们需要同时在 proxy SQL 和 mysql 中创建我们的账户:

登录 Mysql MGR 的主节点 :

创建数据库 :testdb
用户名/密码: app_test/app_test

mysql> create database testdb; Query OK, 1 row affected (0.01 sec) mysql> create user app_test@'10.%' identified with mysql_native_password by 'app_test'; Query OK, 0 rows affected (0.01 sec) mysql> grant all on testdb.* to app_test@'10.%'; Query OK, 0 rows affected (0.01 sec) -- 这个加密的密码字符串是给下面 proxy SQL 手动插入记录时 准备的值 mysql> select authentication_string from mysql.user where user = 'app_test'; +-------------------------------------------+ | authentication_string | +-------------------------------------------+ | *8832AB64049091E6C9A1B921E55CC2A8FAF2E31C | +-------------------------------------------+ 1 row in set (0.00 sec)

登录proxy SQL 客户端, 创建对应的 proxy SQL的 user :

mysql> insert into mysql_users (username,password,default_schema,default_hostgroup,max_connections) values ('app_test','*8832AB64049091E6C9A1B921E55CC2A8FAF2E31C','testdb',20,200) -> ; Query OK, 1 row affected (0.00 sec) mysql> load mysql users to runtime; Query OK, 0 rows affected (0.00 sec) mysql> save mysql users to disk; Query OK, 0 rows affected (0.01 sec)

接下来,我们需要在 application.properties 配置数据库的连接信息:
注意这里我们要写的是 proxy SQL 的连接地址:

spring.datasource.driver-class-name=com.mysql.cj.jdbc.Driver spring.datasource.url=jdbc:mysql://10.67.38.50:3082,10.67.39.49:3082/testdb?serverTimezone=UTC&useUnicode=true&characterEncoding=utf8 spring.datasource.username=app_test spring.datasource.password=app_test spring.datasource.maximum-pool-size=200 -- 设置最大连接200 spring.datasource.min-idle=10 -- 最小空闲连接

我们启动 springboot 的程序之后:

Image.png

由于我们设置了数据库连接池的最小连接数是 10, 我们可以观察到 mysql mgr 的主节点 和 proxy SQL 中间件中均建立了10个连接:

Proxy SQL:

mysql> show processlist; +-----------+----------+--------+-----------+---------+---------+------+ | SessionID | user | db | hostgroup | command | time_ms | info | +-----------+----------+--------+-----------+---------+---------+------+ | 5 | app_test | testdb | 20 | Sleep | 1228988 | NULL | | 6 | app_test | testdb | 20 | Sleep | 1229159 | NULL | | 7 | app_test | testdb | 20 | Sleep | 1229090 | NULL | | 8 | app_test | testdb | 20 | Sleep | 1229068 | NULL | | 9 | app_test | testdb | 20 | Sleep | 1229098 | NULL | | 10 | app_test | testdb | 20 | Sleep | 1229073 | NULL | | 11 | app_test | testdb | 20 | Sleep | 1229004 | NULL | | 12 | app_test | testdb | 20 | Sleep | 1228984 | NULL | | 13 | app_test | testdb | 20 | Sleep | 1229013 | NULL | | 14 | app_test | testdb | 20 | Sleep | 1228997 | NULL | +-----------+----------+--------+-----------+---------+---------+------+ 10 rows in set (0.01 sec)

Mysql MGR的主节点:

mysql> select * from PROCESSLIST where user = 'app_test'; +------+----------+-------------------+--------+---------+------+-------+------+---------+-----------+---------------+ | ID | USER | HOST | DB | COMMAND | TIME | STATE | INFO | TIME_MS | ROWS_SENT | ROWS_EXAMINED | +------+----------+-------------------+--------+---------+------+-------+------+---------+-----------+---------------+ | 7832 | app_test | 10.67.38.50:43890 | testdb | Sleep | 1617 | | NULL | 1616551 | 1 | 1 | | 7833 | app_test | 10.67.38.50:43892 | testdb | Sleep | 1617 | | NULL | 1616530 | 1 | 1 | | 7825 | app_test | 10.67.38.50:43874 | testdb | Sleep | 1617 | | NULL | 1616804 | 1 | 1 | | 7834 | app_test | 10.67.38.50:43894 | testdb | Sleep | 1617 | | NULL | 1616511 | 1 | 1 | | 7826 | app_test | 10.67.38.50:43876 | testdb | Sleep | 1617 | | NULL | 1616676 | 1 | 1 | | 7827 | app_test | 10.67.38.50:43878 | testdb | Sleep | 1617 | | NULL | 1616656 | 1 | 1 | | 7828 | app_test | 10.67.38.50:43880 | testdb | Sleep | 1617 | | NULL | 1616635 | 1 | 1 | | 7829 | app_test | 10.67.38.50:43882 | testdb | Sleep | 1617 | | NULL | 1616615 | 1 | 1 | | 7830 | app_test | 10.67.38.50:43886 | testdb | Sleep | 1617 | | NULL | 1616591 | 1 | 1 | | 7831 | app_test | 10.67.38.50:43888 | testdb | Sleep | 1617 | | NULL | 1616570 | 1 | 1 | +------+----------+-------------------+--------+---------+------+-------+------+---------+-----------+---------------+ 10 rows in set (0.00 sec)

我们写几行简单的代码,输出一下 mysql 的 report_host 变量,看一下 当前连接的mysql 的实例的 IP 地址:

Image.png

浏览器的地址是输入测试 : http://127.0.0.1:8066/
Image.png

我们模拟一下主库 10.67.38.50 down 机, 我们去shutdown 一下 mysql :

mysql> shutdown; Query OK, 0 rows affected (0.00 sec)

我们再次刷新浏览器的地址: http://127.0.0.1:8066/

Image.png

这个时候,我们观察应用的日志,会有连接失效的警告:正好是连接池里面10个失效的连接,符合我们的预期

2022-06-07 09:41:37,691 WARN (PoolBase.java:184)- HikariPool-1 - Failed to validate connection com.mysql.cj.jdbc.ha.MultiHostMySQLConnection@3a9211b1 (No operations allowed after connection closed.). Possibly consider using a shorter maxLifetime value.
2022-06-07 09:41:37,713 WARN (PoolBase.java:184)- HikariPool-1 - Failed to validate connection com.mysql.cj.jdbc.ha.MultiHostMySQLConnection@72ab37ad (No operations allowed after connection closed.). Possibly consider using a shorter maxLifetime value.
2022-06-07 09:41:37,731 WARN (PoolBase.java:184)- HikariPool-1 - Failed to validate connection com.mysql.cj.jdbc.ha.MultiHostMySQLConnection@e952d6f (No operations allowed after connection closed.). Possibly consider using a shorter maxLifetime value.
2022-06-07 09:41:37,750 WARN (PoolBase.java:184)- HikariPool-1 - Failed to validate connection com.mysql.cj.jdbc.ha.MultiHostMySQLConnection@1ca54772 (No operations allowed after connection closed.). Possibly consider using a shorter maxLifetime value.
2022-06-07 09:41:37,768 WARN (PoolBase.java:184)- HikariPool-1 - Failed to validate connection com.mysql.cj.jdbc.ha.MultiHostMySQLConnection@f5b9fe6 (No operations allowed after connection closed.). Possibly consider using a shorter maxLifetime value.
2022-06-07 09:41:37,788 WARN (PoolBase.java:184)- HikariPool-1 - Failed to validate connection com.mysql.cj.jdbc.ha.MultiHostMySQLConnection@4c444391 (No operations allowed after connection closed.). Possibly consider using a shorter maxLifetime value.
2022-06-07 09:41:37,808 WARN (PoolBase.java:184)- HikariPool-1 - Failed to validate connection com.mysql.cj.jdbc.ha.MultiHostMySQLConnection@436dde80 (No operations allowed after connection closed.). Possibly consider using a shorter maxLifetime value.
2022-06-07 09:41:37,827 WARN (PoolBase.java:184)- HikariPool-1 - Failed to validate connection com.mysql.cj.jdbc.ha.MultiHostMySQLConnection@37064435 (No operations allowed after connection closed.). Possibly consider using a shorter maxLifetime value.
2022-06-07 09:41:37,846 WARN (PoolBase.java:184)- HikariPool-1 - Failed to validate connection com.mysql.cj.jdbc.ha.MultiHostMySQLConnection@5c50386a (No operations allowed after connection closed.). Possibly consider using a shorter maxLifetime value.
2022-06-07 09:41:37,864 WARN (PoolBase.java:184)- HikariPool-1 - Failed to validate connection com.mysql.cj.jdbc.ha.MultiHostMySQLConnection@6ff2e9f (No operations allowed after connection closed.). Possibly consider using a shorter maxLifetime value.

我们登陆数据库查看: 果然 10.67.39.149 是现在的主库。

mysql> select * from replication_group_members; +---------------------------+--------------------------------------+--------------+-------------+--------------+-------------+----------------+----------------------------+ | CHANNEL_NAME | MEMBER_ID | MEMBER_HOST | MEMBER_PORT | MEMBER_STATE | MEMBER_ROLE | MEMBER_VERSION | MEMBER_COMMUNICATION_STACK | +---------------------------+--------------------------------------+--------------+-------------+--------------+-------------+----------------+----------------------------+ | group_replication_applier | 16d296b7-83d1-11ec-8d13-005056aefbb6 | 10.67.39.149 | 3080 | ONLINE | PRIMARY | 8.0.27 | XCom | | group_replication_applier | 89866a9b-7db1-11ec-9d58-f403439dfd00 | 10.67.39.49 | 3080 | ONLINE | SECONDARY | 8.0.27 | XCom | +---------------------------+--------------------------------------+--------------+-------------+--------------+-------------+----------------+----------------------------+ 2 rows in set (0.00 sec)

接下来我们在测试一下 proxy SQL 的HA, 我们需要手动关闭 proxy SQL 的节点 :

spring.datasource.url=jdbc:mysql://10.67.38.50:3082,10.67.39.49:3082/testdb?serverTimezone=UTC&useUnicode=true&characterEncoding=utf8

我们来关闭一下连接串中 第一个 proxySQL 的节点: 10.67.38.50

uat mysql@wqdcsrv3352[09:51:19]:~ $ ps -ef|grep proxysql | grep 3080 mysql 113953 1 0 Jun06 ? 00:00:00 /usr/bin/proxysql -c /data/mysql3080/proxySQL/proxy_mysql3080.cnf -D /data/mysql3080/proxySQL mysql 113954 113953 0 Jun06 ? 00:09:42 /usr/bin/proxysql -c /data/mysql3080/proxySQL/proxy_mysql3080.cnf -D /data/mysql3080/proxySQL uat mysql@wqdcsrv3352[09:51:27]:~ $ kill 113954 113953 uat mysql@wqdcsrv3352[09:51:35]:~ $ ps -ef|grep proxysql | grep 3080

我们尝试刷新一下浏览器: http://127.0.0.1:8066/ 依然可以访问数据库。

Image.png

这个时候,我们再次观察应用的日志,又会有连接失效的警告:正好是连接池里面10个失效的连接,再次符合我们的预期

2022-06-07 09:54:04,845 WARN (PoolBase.java:184)- HikariPool-1 - Failed to validate connection com.mysql.cj.jdbc.ha.MultiHostMySQLConnection@39d142aa (No operations allowed after connection closed.). Possibly consider using a shorter maxLifetime value.
2022-06-07 09:54:04,867 WARN (PoolBase.java:184)- HikariPool-1 - Failed to validate connection com.mysql.cj.jdbc.ha.MultiHostMySQLConnection@7aa9eace (No operations allowed after connection closed.). Possibly consider using a shorter maxLifetime value.
2022-06-07 09:54:04,886 WARN (PoolBase.java:184)- HikariPool-1 - Failed to validate connection com.mysql.cj.jdbc.ha.MultiHostMySQLConnection@18c3b7de (No operations allowed after connection closed.). Possibly consider using a shorter maxLifetime value.
2022-06-07 09:54:04,904 WARN (PoolBase.java:184)- HikariPool-1 - Failed to validate connection com.mysql.cj.jdbc.ha.MultiHostMySQLConnection@1576383b (No operations allowed after connection closed.). Possibly consider using a shorter maxLifetime value.
2022-06-07 09:54:04,921 WARN (PoolBase.java:184)- HikariPool-1 - Failed to validate connection com.mysql.cj.jdbc.ha.MultiHostMySQLConnection@1c8dcba4 (No operations allowed after connection closed.). Possibly consider using a shorter maxLifetime value.
2022-06-07 09:54:04,940 WARN (PoolBase.java:184)- HikariPool-1 - Failed to validate connection com.mysql.cj.jdbc.ha.MultiHostMySQLConnection@3822d597 (No operations allowed after connection closed.). Possibly consider using a shorter maxLifetime value.
2022-06-07 09:54:04,960 WARN (PoolBase.java:184)- HikariPool-1 - Failed to validate connection com.mysql.cj.jdbc.ha.MultiHostMySQLConnection@167b150d (No operations allowed after connection closed.). Possibly consider using a shorter maxLifetime value.
2022-06-07 09:54:04,978 WARN (PoolBase.java:184)- HikariPool-1 - Failed to validate connection com.mysql.cj.jdbc.ha.MultiHostMySQLConnection@6cb3d640 (No operations allowed after connection closed.). Possibly consider using a shorter maxLifetime value.
2022-06-07 09:54:04,995 WARN (PoolBase.java:184)- HikariPool-1 - Failed to validate connection com.mysql.cj.jdbc.ha.MultiHostMySQLConnection@2904b822 (No operations allowed after connection closed.). Possibly consider using a shorter maxLifetime value.
2022-06-07 09:54:05,012 WARN (PoolBase.java:184)- HikariPool-1 - Failed to validate connection com.mysql.cj.jdbc.ha.MultiHostMySQLConnection@d07555 (No operations allowed after connection closed.). Possibly consider using a shorter maxLifetime value.

我们查询一下现在存活的proxy SQL 的节点上的连接数:符合我们的预期,连接已经漂移到了存活的 proxy SQL 的节点

mysql> show processlist; +-----------+----------+--------+-----------+---------+---------+------+ | SessionID | user | db | hostgroup | command | time_ms | info | +-----------+----------+--------+-----------+---------+---------+------+ | 27 | app_test | testdb | 20 | Sleep | 182321 | NULL | | 28 | app_test | testdb | 20 | Sleep | 180263 | NULL | | 29 | app_test | testdb | 20 | Sleep | 178193 | NULL | | 30 | app_test | testdb | 20 | Sleep | 176140 | NULL | | 31 | app_test | testdb | 20 | Sleep | 174718 | NULL | | 32 | app_test | testdb | 20 | Sleep | 172338 | NULL | | 33 | app_test | testdb | 20 | Sleep | 170321 | NULL | | 34 | app_test | testdb | 20 | Sleep | 168121 | NULL | | 35 | app_test | testdb | 20 | Sleep | 166129 | NULL | | 36 | app_test | testdb | 20 | Sleep | 164108 | NULL | +-----------+----------+--------+-----------+---------+---------+------+ 10 rows in set (0.00 sec)

我们总结一下,我们完成了
1) Proxy SQL 集群的搭建
2) HA 故障自动转移之 MGR 主节点 down 机
3) HA 故障自动转移之 PROXY SQL 节点 down 机

最后我想说的是,一定要结合自己公司的实际应用系统来测试 mysql HA的自动故障转移。 是 Java 还是 python, 还是GO?

Java 的话, 用的是什么持久层的协议框架?用的什么连接池(C3PO,Hikari)?

作为DBA必须要了解你的系统的持久层框架和连接池的工作机制。

「喜欢这篇文章,您的关注和赞赏是给作者最好的鼓励」
关注作者
1人已赞赏
【版权声明】本文为墨天轮用户原创内容,转载时必须标注文章的来源(墨天轮),文章链接,文章作者等基本信息,否则作者和墨天轮有权追究责任。如果您发现墨天轮中有涉嫌抄袭或者侵权的内容,欢迎发送邮件至:contact@modb.pro进行举报,并提供相关证据,一经查实,墨天轮将立刻删除相关内容。

评论