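For a project, I needed to restore master-slave replication on a MySQL slave. The database version is 5.7.24. After logging in and checking the environment, the master was running normally, but replication on the slave had stopped.
Checking the slave status shows Slave_IO_Running: No, and the binlog file the slave was reading no longer exists on the master. Details: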
    mysql> show slave status\G
    *************************** 1. row ***************************
    Slave_IO_State:
    Master_Host: xxx.xx.xxx.xxx
    Master_User: replication
    Master_Port: 3306
    Connect_Retry: 60
    Master_Log_File: master-bin.030596
    Read_Master_Log_Pos: 154
    Relay_Log_File: slave-relay-bin.018035
    Relay_Log_Pos: 369
    Relay_Master_Log_File: master-bin.030596
    Slave_IO_Running: No
    Slave_SQL_Running: Yes
    Replicate_Do_DB:
    (part of the output omitted)
    Last_IO_Errno: 1595
    Last_IO_Error: Relay log write failure: could not queue event from master
    Last_SQL_Errno: 0
    Last_SQL_Error:
    Replicate_Ignore_Server_Ids:
    Master_Server_Id: 1
    Master_UUID: b88f562f-14ab-11e9-9191-0242ac110002
    Master_Info_File: /var/lib/mysql/master.info
    SQL_Delay: 0
    SQL_Remaining_Delay: NULL
    Slave_SQL_Running_State: Slave has read all relay log; waiting for more updates
    Master_Retry_Count: 86400
    Master_Bind:
    Last_IO_Error_Timestamp: 200616 16:45:39
    Last_SQL_Error_Timestamp:
    (part of the output omitted)
    1 row in set (0.00 sec)
Log in to the master and check. The current binlog file is master-bin.069047:
    mysql> show master status\G
    *************************** 1. row ***************************
    File: master-bin.069047
    Position: 98055615
    Binlog_Do_DB:
    Binlog_Ignore_DB:
    Executed_Gtid_Set: b88f562f-14ab-11e9-9191-0242ac110002:1-1131232768
    1 row in set (0.00 sec)
The master keeps its binlog files for 15 days:
    mysql> show variables like 'expire_logs_days';
    +------------------+-------+
    | Variable_name    | Value |
    +------------------+-------+
    | expire_logs_days | 15    |
    +------------------+-------+
The oldest binlog file currently on the master is master-bin.067546, dated May 29:
    mysql> show variables like 'log_bin%';
    +---------------------------------+---------------------------------+
    | Variable_name                   | Value                           |
    +---------------------------------+---------------------------------+
    | log_bin                         | ON                              |
    | log_bin_basename                | /var/lib/mysql/master-bin       |
    | log_bin_index                   | /var/lib/mysql/master-bin.index |
    | log_bin_trust_function_creators | ON                              |
    | log_bin_use_v1_row_events       | OFF                             |
    +---------------------------------+---------------------------------+

    mysql@41341d8089d3:/var/lib/mysql$ ls -ltr master-bin.* |more
    -rw-r----- 1 mysql mysql 1073742835 May 29 13:38 master-bin.067546
    -rw-r----- 1 mysql mysql 1178992991 May 29 13:49 master-bin.067547
    -rw-r----- 1 mysql mysql 1073778127 May 29 14:03 master-bin.067548
    -rw-r----- 1 mysql mysql 1073982895 May 29 14:05 master-bin.067549
    -rw-r----- 1 mysql mysql 1074024190 May 29 14:07 master-bin.067550
    -rw-r----- 1 mysql mysql 1480555563 May 29 14:08 master-bin.067551
    -rw-r----- 1 mysql mysql 1073811010 May 29 14:13 master-bin.067552
    -rw-r----- 1 mysql mysql 1073757290 May 29 14:17 master-bin.067553
    -rw-r----- 1 mysql mysql 1073901502 May 29 14:20 master-bin.067554
    -rw-r----- 1 mysql mysql 1073801430 May 29 14:23 master-bin.067555
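The output from the master and the slave shows that the binlog the slave needs (master-bin.030596) is older than the oldest file still on the master (master-bin.067546), so it has already been purged. To restore replication, the slave's data has to be re-initialized from the master.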
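The comparison above can be sketched as a small shell check. This is only an illustration: the helper names binlog_seq and binlog_available are made up here, and the sketch assumes both files share the same base name and that the numeric suffix only ever increases.

```shell
# Print the numeric suffix of a binlog file name, e.g. master-bin.030596 -> 30596.
binlog_seq() {
    local suffix stripped
    suffix="${1##*.}"
    # Strip leading zeros so the shell never treats the number as octal.
    stripped="$(printf '%s' "$suffix" | sed 's/^0*//')"
    printf '%s\n' "${stripped:-0}"
}

# Succeed (exit 0) if the binlog the slave needs is not older than
# the oldest binlog still present on the master.
binlog_available() {
    local needed="$1" oldest="$2"
    [ "$(binlog_seq "$needed")" -ge "$(binlog_seq "$oldest")" ]
}

# File names taken from the output shown above.
if binlog_available master-bin.030596 master-bin.067546; then
    echo "binlog still on master: replication can resume"
else
    echo "binlog purged on master: slave must be re-initialized"
fi
```

With the values from this environment, the check takes the second branch.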
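1. Mount the NAS volume on the master and slave servers
The master holds a large amount of data, and the backup produced by innobackupex does not fit on local disk. The requested NAS volume is therefore mounted on both the master server and the slave server, which removes the need to copy the backup files over to the slave.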
2. Extract the xtrabackup tool on the NAS volume
Upload the XtraBackup package to the NAS volume, verify its MD5 checksum, then extract it:
-- MD5 check
    xxxxxx:/mnt/database-nas/slave_test # md5sum percona-xtrabackup-2.4.23-Linux-x86_64.glibc2.12.tar.gz
    990fca4d309cf6854253f9983499ce22  percona-xtrabackup-2.4.23-Linux-x86_64.glibc2.12.tar.gz
-- Extract
    xxxxxx:/mnt/database-nas/slave_test # tar zxvf percona-xtrabackup-2.4.23-Linux-x86_64.glibc2.12.tar.gz
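The check-then-extract step can be wrapped so that extraction only happens when the checksum matches. A minimal sketch follows; verify_md5 is a hypothetical helper, and the expected hash should come from a trusted source such as the download page (the value printed above is simply what md5sum reported in this walkthrough).

```shell
# Compare the MD5 checksum of a file against an expected value.
# Returns 0 on match, 1 on mismatch.
verify_md5() {
    local file="$1" expected="$2" actual
    actual="$(md5sum "$file" | awk '{print $1}')"
    if [ "$actual" = "$expected" ]; then
        echo "OK: $file"
    else
        echo "MISMATCH: $file (got $actual)" >&2
        return 1
    fi
}

# Example usage (not executed here):
#   pkg=percona-xtrabackup-2.4.23-Linux-x86_64.glibc2.12.tar.gz
#   verify_md5 "$pkg" 990fca4d309cf6854253f9983499ce22 && tar zxvf "$pkg"
```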
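3. Verify that innobackupex works on the master and slave hosts
Add the path of the innobackupex tool to the PATH environment variable:
-- Add a new variable
    XTRABACKUP_HOME=/mnt/database-nas/slave_test/percona-xtrabackup-2.4.23-Linux-x86_64.glibc2.12
-- Update PATH
    PATH=$JAVA_HOME/bin:$PATH:$XTRABACKUP_HOME/bin
-- Test the innobackupex command
    innobackupex -v    ## should print the version number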
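4. Back up the master to the NAS volume
Back up the master's data to the NAS volume, preferably during off-peak hours. The master holds 2.5 TB of data; the backup ran for 3 hours and produced 1.6 TB of backup files.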
-- Back up
    innobackupex --defaults-file=/data/mysqlconfig/mysqld.conf --user=root --password=xxxxxx --port=3306 --host=xxx.xxx.xxx.xxx --datadir=/data/mysqldata /mnt/database-nas/mysql_full_backup
-- Apply the redo log so that the backed-up data files are in a consistent state
    innobackupex --defaults-file=/data/mysqlconfig/mysqld.conf --apply-log /mnt/database-nas/mysql_full_backup/2021-07-09_20-24-22/
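Since a backup of this size runs for hours, it helps to confirm that each innobackupex run actually finished before moving on. innobackupex ends a successful run with a line containing "completed OK!" on stderr; the sketch below checks for it. The function name backup_completed_ok and the log file name backup.log are assumptions for illustration.

```shell
# Succeed only if the last line of an innobackupex log reports success.
backup_completed_ok() {
    local logfile="$1"
    tail -n 1 "$logfile" | grep -q 'completed OK!'
}

# Example usage (not executed here; backup.log is a hypothetical file name):
#   innobackupex --defaults-file=/data/mysqlconfig/mysqld.conf ... \
#       /mnt/database-nas/mysql_full_backup 2> backup.log
#   backup_completed_ok backup.log || echo "backup did not finish cleanly" >&2
```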
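5. Restore the data on the slave
The slave mounts the same NAS volume as the master, so the backup is directly visible on the slave server. Restore the database on the slave with the following commands: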
-- List the running MySQL slave container
    # docker ps
    CONTAINER ID   IMAGE                                COMMAND                  CREATED       STATUS      PORTS                               NAMES
    e12538fa0c7a   registry.paas/cgos/mysql:5.7.24sec   "docker-entrypoint..."   2 years ago   Up 3 days   0.0.0.0:3306->3306/tcp, 33060/tcp   mysql-slave
-- Stop the slave container
    # docker stop mysql-slave
-- Delete the files in the slave's data directory
    # cd /data/mysqldata/
    # rm -rf *
-- Run the restore command
    innobackupex --defaults-file=/data/mysqlconfig/mysqld.conf --datadir=/data/mysqldata --copy-back /mnt/database-nas/mysql_full_backup/2021-07-09_20-24-22
-- Start the slave container once the restore has finished
    docker start mysql-slave
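innobackupex --copy-back expects an empty target data directory, and rm -rf * leaves hidden files (names starting with a dot) in place. A quick guard before the restore could look like the sketch below; dir_is_empty is a hypothetical helper name.

```shell
# Succeed only if the given path is a directory with no entries at all,
# including hidden files that "rm -rf *" would have skipped.
dir_is_empty() {
    local dir="$1"
    [ -d "$dir" ] && [ -z "$(ls -A "$dir")" ]
}

# Example usage (not executed here):
#   dir_is_empty /data/mysqldata || { echo "datadir is not empty" >&2; exit 1; }
#   innobackupex --defaults-file=/data/mysqlconfig/mysqld.conf \
#       --datadir=/data/mysqldata --copy-back \
#       /mnt/database-nas/mysql_full_backup/2021-07-09_20-24-22
```

The copied files belong to the user who ran innobackupex, so depending on how the container maps its MySQL user, the ownership of the data directory may need to be adjusted before the container is started.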
6. Re-establish replication on the slave
-- Check the binlog position recorded by the backup
    xxxxxxx:/mnt/database-nas/mysql_full_backup/2021-07-09_20-24-22 # cat xtrabackup_binlog_pos_innodb
    master-bin.071809    578330673
-- Clear the relay log information
    reset slave;
-- Run the statement that re-points the slave at the master
    CHANGE MASTER TO
      MASTER_HOST='xxx.xx.xxx.xxx',
      MASTER_USER='replication',
      MASTER_PASSWORD='********',
      MASTER_PORT=3306,
      MASTER_LOG_FILE='master-bin.071809',
      MASTER_LOG_POS=578330673,
      MASTER_CONNECT_RETRY=10;
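To avoid copying the file name and position by hand, the statement can be generated from the position file. The sketch below is one possible way to do it; change_master_sql and its argument order are made up for illustration. It reads the first two whitespace-separated fields, so it works with xtrabackup_binlog_pos_innodb as used here and also with xtrabackup_binlog_info, which is the file Percona's replication setup guide reads the coordinates from.

```shell
# Print a CHANGE MASTER TO statement using the coordinates stored in a
# position file whose first line is "<binlog file> <position> [...]".
change_master_sql() {
    local posfile="$1" host="$2" user="$3" password="$4"
    local logfile logpos rest
    # Any extra columns (e.g. a GTID set) end up in "rest" and are ignored.
    read -r logfile logpos rest < "$posfile"
    cat <<EOF
CHANGE MASTER TO
  MASTER_HOST='${host}',
  MASTER_USER='${user}',
  MASTER_PASSWORD='${password}',
  MASTER_PORT=3306,
  MASTER_LOG_FILE='${logfile}',
  MASTER_LOG_POS=${logpos},
  MASTER_CONNECT_RETRY=10;
EOF
}

# Example usage (not executed here):
#   change_master_sql xtrabackup_binlog_pos_innodb xxx.xx.xxx.xxx replication '********'
```

The port and retry interval are hard-coded to the values used in this environment.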
-- Check the slave status
    show slave status\G
    Slave_IO_Running: Yes
    Slave_SQL_Running: Yes
Replication is restored, and the slave starts catching up on the master's binlog.
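While the slave catches up, the same two fields can be checked from a script. The sketch below parses the text of SHOW SLAVE STATUS\G from standard input; slave_threads_running is a hypothetical helper name, and how the mysql client is invoked (credentials, container exec) is left to the environment.

```shell
# Read "SHOW SLAVE STATUS\G" output on stdin and succeed only if both
# replication threads report "Yes".
slave_threads_running() {
    awk -F': ' '
        # The trailing colon keeps Slave_SQL_Running_State from matching.
        /Slave_IO_Running:/  { io  = $2; gsub(/[ \t\r]+$/, "", io) }
        /Slave_SQL_Running:/ { sql = $2; gsub(/[ \t\r]+$/, "", sql) }
        END { exit !(io == "Yes" && sql == "Yes") }
    '
}

# Example usage (not executed here):
#   mysql -e 'SHOW SLAVE STATUS\G' | slave_threads_running \
#       && echo "replication threads are running"
```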
Issue 1: innobackupex fails with InnoDB: Error number 24 means 'Too many open files'
The full error message:
    InnoDB: Operating system error number 24 in a file operation.
    InnoDB: Error number 24 means 'Too many open files'
    InnoDB: Some operating system error numbers are described at http://dev.mysql.com/doc/refman/5.7/en/operating-system-error-codes.html
    InnoDB: File ./wf_public_layout/pub_application_introduce_t.ibd: 'open' returned OS error 124. Cannot continue operation
    InnoDB: Cannot continue operation.
Resource limits of the current shell (ulimit -a):
    core file size          (blocks, -c) 0
    data seg size           (kbytes, -d) unlimited
    scheduling priority             (-e) 0
    file size               (blocks, -f) unlimited
    pending signals                 (-i) 514100
    max locked memory       (kbytes, -l) 64
    max memory size         (kbytes, -m) unlimited
    open files                      (-n) 1024
    pipe size            (512 bytes, -p) 8
    POSIX message queues     (bytes, -q) 819200
    real-time priority              (-r) 0
    stack size              (kbytes, -s) 8192
    cpu time               (seconds, -t) unlimited
    max user processes              (-u) 514100
    virtual memory          (kbytes, -v) unlimited
    file locks                      (-x) unlimited
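The open files limit is 1024, which is the system default.
Solution:
    ulimit -n 16384    ## raise the open files limit
Note: this change is temporary and only applies to the current shell session. Throughout this recovery, every setting other than the strictly necessary ones was made temporarily.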
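The same temporary change can be made conditional, so the limit is only touched when it is actually too low. A minimal sketch, with raise_nofile as a hypothetical helper name; like the plain ulimit call, it affects only the current shell and the processes started from it.

```shell
# Raise the soft open-files limit of the current shell to at least $1.
# Returns non-zero if the limit cannot be raised (for example because the
# requested value exceeds the hard limit).
raise_nofile() {
    local want="$1" cur
    cur="$(ulimit -S -n)"
    if [ "$cur" = "unlimited" ] || [ "$cur" -ge "$want" ]; then
        return 0    # already high enough, nothing to do
    fi
    ulimit -S -n "$want"
}

# Example usage (not executed here):
#   raise_nofile 16384 && innobackupex --defaults-file=/data/mysqlconfig/mysqld.conf ...
```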
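Issue 2: Wrong password for the replication user
The business team had forgotten the password of replication, the MySQL user used for master-slave replication. With their approval, the password was changed as follows: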
-- Check the privileges of the replication user
    mysql> show grants for 'replication'@'%';
    +-----------------------------------------------------+
    | Grants for replication@%                            |
    +-----------------------------------------------------+
    | GRANT REPLICATION SLAVE ON *.* TO 'replication'@'%' |
    +-----------------------------------------------------+
-- Change the password of the replication user
    GRANT REPLICATION SLAVE ON *.* TO 'replication'@'%' identified by '*********';






