方案背景
涉及项目:XXXXXX银行项目
涉及功能:自动化运维平台
优化项:mysql数据库高可用架构
规划vs目标
主机:192.168.31.113、192.168.31.114
操作系统:CentOS7.6
中间件及其版本:mysql5.7、keepalived-2.1.2
操作用户:root、shsnc
2.2 实现目标
实施过程
3.1 建立mysql数据库双主并验证
主机:192.168.31.113、192.168.31.114 操作用户:shsnc
1)192.168.31.113 mysql配置
修改mysql配置文件添加如下配置:
vim my.cnf
server-id=113
log-bin=mysql-bin
enforce-gtid-consistency=true
gtid-mode=on
重启mysql。
连接mysql并配置mysql双主:
mysql -h192.168.31.113 -uroot -P3306 -p'shsnc!@#'
stop slave;
change master to master_host='135.10.110.114',master_user='root',master_password='shsnc!@#',master_port=3306,master_auto_position=1;
start slave;
2)192.168.31.114 mysql配置
修改mysql配置文件添加如下配置:
vim my.cnf
server-id=114
log-bin=mysql-bin
enforce-gtid-consistency=true
gtid-mode=on
重启Mysql。
连接mysql并配置mysql双主:
mysql -h192.168.31.114 -uroot -P3306 -p'shsnc!@#'
stop slave;
change master to master_host='135.10.110.113',master_user='root',master_password='shsnc!@#',master_port=3306,master_auto_position=1;
start slave;
3)验证mysql双主
配置完成后在分别连接两台mysql,查看双主状态:
show slave status\G;


Slave_IO_Running: Yes
Slave_SQL_Running: Yes
确认双方mysql的slave状态均为双yes,主主搭建成功。
3.2 搭建keepalived(两台)
主机:192.168.31.113、192.168.31.114
操作用户:root
--安装keepalived:
安装编译依赖包
yum install -y libnl;
yum install -y libnfnetlink-devel zlib zlib-devel gcc gcc-c++ openssl openssl-devel openssh;
yum install -y bc;
解压安装包
tar zxvf keepalived-2.1.2.tar.gz
编译安装
cd keepalived-2.1.2
mkdir /home/shsnc/keepalived
./configure --prefix=/home/shsnc/keepalived
make
makeinstall
复制配置文件
复制keepalived配置文件目录到etc下,用于systemctl的keepalived服务启动。
cp -r home/shsnc/keepalived/etc/keepalived etc/keepalived
启动keepalived服务
systemctl启动keepalived服务
systemctl start keepalived
3.3 编写keepalived脚本
1)编写check脚本
作用:检查mysql是否可用,不可用则停止本机keepalived,使虚拟ip转移至可用mysql。
vim etc/keepalived/script/check_mysql.sh
#!/bin/bash
##检测mysql实例端口是否通
RETVAL=$?
# failover
check_port=$(ss -nlpt | grep "mysqld" | awk -F'[: ]+' '/3306/{print $6}')
if [ ! -n "${check_port}" ]
then
/bin/systemctl stop keepalived
else
#exit $RETVAL
echo ' mysql is alive'
fi
cpuidle=$(vmstat 1 3 |tail -1 |awk '$0~/[[:digit:]]+/{print $(NF-2)}')
cpu_used=$(awk -v x=$cpuidle 'BEGIN{printf "%.2f\n",100-x}')
failpoint=93.0
if [ $(echo "${cpu_used} >= $failpoint" | bc ) -eq 1 ]
then
/bin/systemctl stop keepalived
else
echo "CPU utilization does not exceed 93"
exit $RETVAL
fi
exit 0
2)编写notify脚本
作用:设置切换后的mysql主库可读写,从库只读,确保数据写入的唯一入口。
vim etc/keepalived/script/keepalived_notify.sh
#!/bin/bash
. ~/.bash_profile
#mysql user
DB_USER="root"
#mysql user password
DB_PASSWORD='!QAZ3wsx@gzyd'
MYSQL_SOCK=""
#mysql_bin
MYSQL_BIN="/data/mysql/mysql_5737/bin/mysql"
#mysql client command
MYSQL_CMD="${MYSQL_BIN} -u${DB_USER} -p${DB_PASSWORD}"
#query the killed seesion's id sql
MYSQL_SQL="select concat('kill ',id,';') from information_schema.processlist where user not in ('system user','repl','replic','backup','bkpuser','bomcjk','root','myrobot')"
#define function: get mysql service information
function get_mysql_infor()
{
for sock in `ps -ef | grep mysqld | grep '\-\-socket=' | awk -F'--socket=' '{print $2}' | awk '{print $1}'`
do
MYSQL_SOCK="${MYSQL_SOCK}$(echo $sock)"
done
}
#define function: mysql kill sessions
function kill_sessions()
{
#receive a mysql socket file parameter
#my_sock=$(get_mysql_infor)
unset MYSQL_SOCK
get_mysql_infor
ln -sv ${MYSQL_SOCK} /tmp/mysql.sock
${MYSQL_CMD} -S ${MYSQL_SOCK} -Ne "${MYSQL_SQL}" 2>/dev/null | ${MYSQL_CMD} > /dev/null 2>&1
}
#define function: set mysql read_only mode
function set_readonly()
{
#receive mysql socket file && read_only sign parameter
#my_sock=$(get_mysql_infor)
unset MYSQL_SOCK
get_mysql_infor
my_sign=$1
#begin to set mysql read_only mode
#${MYSQL_CMD} -S ${MYSQL_SOCK} -Ne "set global read_only=${my_sign}" 2>/dev/null
${MYSQL_CMD} -S ${MYSQL_SOCK} -Ne "SET GLOBAL sync_binlog=1;SET GLOBAL innodb_flush_log_at_trx_commit=1;SET GLOBAL read_only=${my_sign};SET GLOBAL super_read_only=${my_sign};" 2>/dev/null
}
#define function: keepalived state changed to master
function Keepalived_changed_to_master()
{
#my_sock=$(get_mysql_infor)
unset MYSQL_SOCK
get_mysql_infor
Seconds_Behind_Master=$(${MYSQL_CMD} -S ${MYSQL_SOCK} -e "show slave status\G" | grep -w Seconds_Behind_Master | awk -F": " '{print $2}')
Slave_IO_Running=$(${MYSQL_CMD} -S ${MYSQL_SOCK} -e "show slave status\G" | grep -w Slave_IO_Running | awk -F": " '{print $2}')
Slave_SQL_Running=$(${MYSQL_CMD} -S ${MYSQL_SOCK} -e "show slave status\G" | grep -w Slave_SQL_Running | awk -F": " '{print $2}')
Master_Log_File=$(${MYSQL_CMD} -S ${MYSQL_SOCK} -e "show slave status\G" | grep -w Master_Log_File | awk -F": " '{print $2}')
Relay_Master_Log_File=$({MYSQL_CMD} -S ${MYSQL_SOCK} -e "show slave status\G" | grep -w Relay_Master_Log_File | awk -F": " '{print $2}')
Read_Master_Log_Pos=$({MYSQL_CMD} -S ${MYSQL_SOCK} -e "show slave status\G" | grep -w Read_Master_Log_Pos | awk -F": " '{print $2}')
Exec_Master_Log_Pos=$({MYSQL_CMD} -S ${MYSQL_SOCK} -e "show slave status\G" | grep -w Exec_Master_Log_Pos | awk -F": " '{print $2}')
echo "正常双YES的情况下切换"
if [ "${Slave_IO_Running}" = "Yes" -a "${Slave_SQL_Running}" = "Yes" ]
then
if [ $Seconds_Behind_Master -eq 0 ]
then
set_readonly 0
exit 0
fi
fi
#if [ ${Slave_IO_Running} = 'Connecting' -a ${Slave_SQL_Running} = 'Yes' -a ${Seconds_Behind_Master} == 'NULL' ];
if [ "${Slave_IO_Running}" = "Connecting" -a "${Slave_SQL_Running}" = "Yes" ]
then
if [ "${Master_Log_File}" = "${Relay_Master_Log_File}" -a "${Read_Master_Log_Pos}" = "${Exec_Master_Log_Pos}" ]
then
set_readonly 0
exit 0
fi
fi
}
#define function: keepalived state changed to backup
function Keepalived_changed_to_backup()
{
#set mysql read_only mode
set_readonly 1
#kill mysql sessions
kill_sessions
}
#start this shell
case $1 in
master)
Keepalived_changed_to_master
;;
backup)
Keepalived_changed_to_backup
;;
*)
;;
esac
exit 0
3.4 配置两台keepalived
1)配置192.168.31.113的keepalived
vim /etc/keepalived/keepalived.conf
! Configuration File for keepalived
global_defs {
notification_email {
shsnc@shsnc.com
}
notification_email_from smtp.163.com
smtp_server 127.0.0.1
smtp_connect_timeout 30
router_id mysql-ha
}
vrrp_script check_mysql {
script "/etc/keepalived/script/check_mysql.sh"
interval 5
fall 3
rise 2
#timeout 60
}
vrrp_instance VI_3306 {
state BACKUP
interface ens192
virtual_router_id 188
priority 100
nopreempt
advert_int 1
authentication {
auth_type PASS
auth_pass 110120
}
virtual_ipaddress {
114.168.1.188/24 dev ens192
}
track_script {
check_mysql
}
notify_master "/etc/keepalived/script/keepalived_notify.sh master"
notify_backup "/etc/keepalived/script/keepalived_notify.sh backup"
}
2)配置192.168.31.114的keepalived
vim /etc/keepalived/keepalived.conf
! Configuration File for keepalived
global_defs {
notification_email {
shsnc@shsnc.com
}
notification_email_from smtp.163.com
smtp_server 127.0.0.1
smtp_connect_timeout 30
router_id mysql-ha
}
vrrp_script check_mysql {
script "/etc/keepalived/script/check_mysql.sh"
interval 5
fall 3
rise 2
#timeout 60
}
vrrp_instance VI_3306 {
state BACKUP
interface ens192
virtual_router_id 188
priority 90
#nopreempt
advert_int 1
authentication {
auth_type PASS
auth_pass 110120
}
virtual_ipaddress {
114.168.1.188/24 dev ens192
}
track_script {
check_mysql
}
notify_master "/etc/keepalived/script/keepalived_notify.sh master"
notify_backup "/etc/keepalived/script/keepalived_notify.sh backup"
}
重启192.168.31.113、192.168.31.114的keepalived服务。
systemctl stop keepalived;
systemctl start keepalived;
测试
4.1 mysql故障自动切换测试
1)测试方法
模拟mysql故障,查看vip是否自动切换至可用mysql,查看mysql读写权限是否自动切换。模拟故障数据库修复后启动,查看mysql双主同步是否正常,数据是否同步。
2)测试过程
停用192.168.31.114的mysql服务,此时vip已漂移至192.168.31.113(说明keepalived的check脚本执行成功)。
查看mysql读写权限,此时192.168.31.113的mysql数据库为可读可写,192.168.31.114的mysql数据库为只读(说明keepalived的notify脚本执行成功)。


通过虚拟ip连接数据库成功,说明自动切换可用mysql成功。

故障自动切换恢复后,再次启动192.168.31.114的mysql数据库,可以看到两台数据库数据一致,数据已经同步,可继续提供高可用的mysql双主架构。


4.2 Keepalived脑裂测试
测试方法:keepalived配置不同VRRP组播通讯时间进行测试,通过禁止192.168.31.114上iptables的vrrp协议访问,触发keepalived脑裂,再分别查看脑裂触发时间。
配置策略
配置组播时间为1秒

执行命令
iptables禁止vrrp协议访问并记录时间。

查看日志
查看keepalived日志,得到脑裂触发时间。


测试结论
当advert_int配置为1的时候,脑裂触发时间2~3s。
配置策略
配置组播时间为3秒

执行命令
iptables禁止vrrp协议访问并记录时间。

查看日志
查看keepalived日志,得到脑裂触发时间。


测试结论
当advert_int配置为3的时候,脑裂触发时间2~3s。
配置策略
配置组播时间5秒。

执行命令
iptables禁止vrrp协议访问并记录时间。

查看日志
查看keepalived日志,得到脑裂触发时间。


测试结论
当advert_int配置为5的时候,脑裂触发时间10~12s。
配置策略
配置组播时间1秒,配置vrrp_garp_master_refresh为10秒。

执行命令
iptables禁止vrrp协议访问并记录时间。

查看日志分:

测试结论
当vrrp_garp_master_refresh设置为10s的时候,脑裂恢复之后主节点每间隔10S发一次包到网关。
最终测试结果
通过以上测试验证,Keepalived + MySQL双主热备方案可实现mysql数据库双主实时备份,m故障后秒级自动恢复可用,且数据不丢失,提供了数据完整、功能高可用的故障处理能力,方案可行。

本文作者:张 帅(上海新炬中北团队)
本文来源:“IT那活儿”公众号





