问题分析解决
昨天突然服务器重启了,最后导致的就是Zabbix的数据库MYSQL库表坏了,然后MYSQL就启动不了了。启动不了咋整,看log呗,报什么异常情况,查看error如下:
2017-09-21 14:41:18 4255 [Note] InnoDB: The InnoDB memory heap is disabled 2017-09-21 14:41:18 4255 [Note] InnoDB: Mutexes and rw_locks use GCC atomic builtins 2017-09-21 14:41:18 4255 [Note] InnoDB: Compressed tables use zlib 1.2.3 2017-09-21 14:41:18 4255 [Note] InnoDB: CPU does not support crc32 instructions 2017-09-21 14:41:18 4255 [Note] InnoDB: Using Linux native AIO 2017-09-21 14:41:18 4255 [Note] InnoDB: Initializing buffer pool, size = 256.0M 2017-09-21 14:41:18 4255 [Note] InnoDB: Completed initialization of buffer pool 2017-09-21 14:41:18 4255 [Note] InnoDB: Highest supported file format is Barracuda. 2017-09-21 14:41:18 4255 [Note] InnoDB: Log scan progressed past the checkpoint lsn 491181006779 2017-09-21 14:41:18 4255 [Note] InnoDB: Database was not shutdown normally! 2017-09-21 14:41:18 4255 [Note] InnoDB: Starting crash recovery. 2017-09-21 14:41:18 4255 [Note] InnoDB: Reading tablespace information from the .ibd files... 2017-09-21 14:41:18 4255 [ERROR] InnoDB: Attempted to open a previously opened tablespace. Previous tablespace mysql/slave_master_info uses space ID: 4 at filepath: ./mysql/slave_master_info.ibd. Cannot open tablespace zabbix/groups which uses space ID: 4 at filepath: ./zabbix/groups.ibd 2017-09-21 14:41:18 7fa123271720 InnoDB: Operating system error number 2 in a file operation. InnoDB: The error means the system cannot find the path specified. InnoDB: If you are installing InnoDB, remember that you must create InnoDB: directories yourself, InnoDB does not create them. InnoDB: Error: could not open single-table tablespace file ./zabbix/groups.ibd InnoDB: We do not continue the crash recovery, because the table may become InnoDB: corrupt if we cannot apply the log records in the InnoDB log to it. InnoDB: To fix the problem and start mysqld: InnoDB: 1) If there is a permission problem in the file and mysqld cannot InnoDB: open the file, you should modify the permissions. InnoDB: 2) If the table is not needed, or you can restore it from a backup, InnoDB: then you can remove the .ibd file, and InnoDB will do a normal InnoDB: crash recovery and ignore that table. InnoDB: 3) If the file system or the disk is broken, and you cannot remove InnoDB: the .ibd file, you can set innodb_force_recovery > 0 in my.cnf InnoDB: and force InnoDB to continue crash recovery here. 170921 14:41:18 mysqld_safe mysqld from pid file data/appData/mysql/zabbix_server.pid ended 170921 14:43:16 mysqld_safe Starting mysqld daemon with databases from data/appData/mysql 2017-09-21 14:43:16 0 [Warning] TIMESTAMP with implicit DEFAULT value is deprecated. Please use --explicit_defaults_for_timestamp server option (see documentation for more details). 2017-09-21 14:43:16 4985 [Note] Plugin 'FEDERATED' is disabled. 2017-09-21 14:43:16 7fe0b7d51720 InnoDB: Warning: Using innodb_additional_mem_pool_size is DEPRECATED. This option may be removed in future releases, together with the option innodb_use_sys_malloc and with the InnoDB's internal memory allocator.
从log中可以看出来,一些InnoDB 表的表空间都有问题,log还明确的告诉你了you can set innodb_force_recovery > 0 in my.cnf
and force InnoDB to continue crash recovery here.
那就试试呗,在/etc/my.cnf
的[mysqld]
下增加 innodb_force_recovery = 1
试试,添加完成后,果然MYSQL启动ok了。
启动ok了,但是凭经验应该会发现好多表,是非ok状态的,具体情况如下:
mysql> check table groups; +---------------+-------+----------+---------------------------------------------------------+ | Table | Op | Msg_type | Msg_text | +---------------+-------+----------+---------------------------------------------------------+ | zabbix.groups | check | Warning | InnoDB: Tablespace is missing for table 'zabbix/groups' | | zabbix.groups | check | Error | Table 'zabbix.groups' doesn't exist | | zabbix.groups | check | status | Operation failed | +---------------+-------+----------+---------------------------------------------------------+ 3 rows in set (0.00 sec)
果不其然,就拿上面的groups表状态来说,就是缺少表空间,那咋办,看看repair能够修复不:
mysql> repair table groups; +---------------+--------+----------+---------------------------------------------------------+ | Table | Op | Msg_type | Msg_text | +---------------+--------+----------+---------------------------------------------------------+ | zabbix.groups | repair | Warning | InnoDB: Tablespace is missing for table 'zabbix/groups' | | zabbix.groups | repair | Error | Table 'zabbix.groups' doesn't exist | | zabbix.groups | repair | status | Operation failed | +---------------+--------+----------+---------------------------------------------------------+ 3 rows in set (0.00 sec)
但是不起作用啊,只能再次看看log然后分析问题了,log内容如下:
InnoDB: Error: could not open single-table tablespace file ./zabbix/groups.ibd InnoDB: We do not continue the crash recovery, because the table may become InnoDB: corrupt if we cannot apply the log records in the InnoDB log to it.
还是跟之前一样,没辙只能上Google找找资料看看了;从网上看到差不多的文章说需要设置如下:
innodb_force_recovery = 6 innodb_purge_thread = 1
先不管什么意思了,先配置上看看能不能修复表了,配置之后果然服务也启动起来了,表也修复好了,具体check table
结果如下:
zabbix.acknowledges check status OK zabbix.actions check status OK zabbix.alerts check status OK zabbix.application_discovery check status OK zabbix.application_prototype check status OK zabbix.application_template check status OK zabbix.applications check status OK zabbix.auditlog check status OK zabbix.auditlog_details check status OK zabbix.autoreg_host check status OK zabbix.conditions check status OK zabbix.config check status OK zabbix.corr_condition check status OK zabbix.corr_condition_group check status OK zabbix.corr_condition_tag check status OK zabbix.corr_condition_tagpair check status OK zabbix.corr_condition_tagvalue check status OK zabbix.corr_operation check status OK zabbix.correlation check status OK zabbix.dbversion check status OK zabbix.dchecks check status OK zabbix.dhosts check status OK zabbix.drules check status OK zabbix.dservices check status OK zabbix.escalations check status OK zabbix.event_recovery check status OK zabbix.event_tag check status OK zabbix.events check status OK zabbix.expressions check status OK zabbix.functions check status OK zabbix.globalmacro check status OK zabbix.globalvars check status OK zabbix.graph_discovery check status OK zabbix.graph_theme check status OK zabbix.graphs check status OK zabbix.graphs_items check status OK zabbix.group_discovery check status OK zabbix.group_prototype check status OK zabbix.groups check status OK zabbix.history check status OK zabbix.history_log check status OK zabbix.history_str check status OK zabbix.history_text check status OK zabbix.history_uint check status OK zabbix.host_discovery check status OK zabbix.host_inventory check status OK zabbix.hostmacro check status OK zabbix.hosts check status OK zabbix.hosts_groups check status OK zabbix.hosts_templates check status OK zabbix.housekeeper check status OK zabbix.httpstep check status OK zabbix.httpstepitem check status OK zabbix.httptest check status OK zabbix.httptestitem check status OK zabbix.icon_map check status OK zabbix.icon_mapping check status OK zabbix.ids check status OK zabbix.images check status OK zabbix.interface check status OK zabbix.interface_discovery check status OK zabbix.item_application_prototype check status OK zabbix.item_condition check status OK zabbix.item_discovery check status OK zabbix.items check status OK zabbix.items_applications check status OK zabbix.maintenances check status OK zabbix.maintenances_groups check status OK zabbix.maintenances_hosts check status OK zabbix.maintenances_windows check status OK zabbix.mappings check status OK zabbix.media check status OK zabbix.media_type check status OK zabbix.opcommand check status OK zabbix.opcommand_grp check status OK zabbix.opcommand_hst check status OK zabbix.opconditions check status OK zabbix.operations check status OK zabbix.opgroup check status OK zabbix.opinventory check status OK zabbix.opmessage check status OK zabbix.opmessage_grp check status OK zabbix.opmessage_usr check status OK zabbix.optemplate check status OK zabbix.problem check status OK zabbix.problem_tag check status OK zabbix.profiles check status OK zabbix.proxy_autoreg_host check status OK zabbix.proxy_dhistory check status OK zabbix.proxy_history check status OK zabbix.regexps check status OK zabbix.rights check status OK zabbix.screen_user check status OK zabbix.screen_usrgrp check status OK zabbix.screens check status OK zabbix.screens_items check status OK zabbix.scripts check status OK zabbix.service_alarms check status OK zabbix.services check status OK zabbix.services_links check status OK zabbix.services_times check status OK zabbix.sessions check status OK zabbix.slides check status OK zabbix.slideshow_user check status OK zabbix.slideshow_usrgrp check status OK zabbix.slideshows check status OK zabbix.sysmap_element_url check status OK zabbix.sysmap_url check status OK zabbix.sysmap_user check status OK zabbix.sysmap_usrgrp check status OK zabbix.sysmaps check status OK zabbix.sysmaps_elements check status OK zabbix.sysmaps_link_triggers check status OK zabbix.sysmaps_links check status OK zabbix.task check status OK zabbix.task_close_problem check status OK zabbix.timeperiods check status OK zabbix.trends check status OK zabbix.trends_uint check status OK zabbix.trigger_depends check status OK zabbix.trigger_discovery check status OK zabbix.trigger_tag check status OK zabbix.triggers check status OK zabbix.users check status OK zabbix.users_groups check status OK zabbix.usrgrp check status OK zabbix.valuemaps check status OK
统计了一下Zabbix库下的127张表,状态都是OK的,那目前来看是没有问题了,手动查询了groups、history_text等几个表查询都是没有问题的。
注: 在check table的时候,遇到大数据的表会比较慢,耐心等待即可!
看起来基本没有问题了,那咱就先把Zabbix Server启动起来看看吧,然后用tail命令MYSQL的error日志中还会有什么异常情况。
麻蛋,发现还有错误,日记如下:
2017-09-21 15:59:21 7f31b1d29700 InnoDB: Error: Table "mysql"."innodb_table_stats" not found. 2017-09-21 15:59:21 7f31b1d29700 InnoDB: Recalculation of persistent statistics requested for table "zabbix"."escalations" but the required persistent statistics storage is not present or is corrupted. Using transient stats instead. 2017-09-21 15:59:31 7f31b1d29700 InnoDB: Error: Table "mysql"."innodb_table_stats" not found. 2017-09-21 15:59:31 7f31b1d29700 InnoDB: Recalculation of persistent statistics requested for table "zabbix"."housekeeper" but the required persistent statistics storage is not present or is corrupted. Using transient stats instead. 2017-09-21 15:59:49 7f31b6933700 InnoDB: Error: table `mysql`.`innodb_table_stats` does not exist in the InnoDB internal InnoDB: data dictionary though MySQL is trying to drop it. InnoDB: Have you copied the .frm file of the table to the InnoDB: MySQL database directory from another database? InnoDB: You can look for further help from InnoDB: http://dev.mysql.com/doc/refman/5.6/en/innodb-troubleshooting.html 2017-09-21 15:31:05 7f31b68b1700 InnoDB: Error: Table "mysql"."innodb_index_stats" not found. 2017-09-21 15:31:05 7f31b68b1700 InnoDB: Error: Fetch of persistent statistics requested for table "zabbix"."problem_tag" but the required system tables mysql.innodb_table_stats and mysql.innodb_index_stats are not present or have unexpected structure. Using transient stats instead. 2017-09-21 16:00:04 8996 [Warning] InnoDB: Cannot open table mysql/slave_relay_log_info from the internal data dictionary of InnoDB though the .frm file for the table exists. See http://dev.mysql.com/doc/refman/5.6/en/innodb-troubleshooting.html for how you can resolve the problem. 2017-09-21 16:00:04 8996 [Warning] InnoDB: Cannot open table mysql/slave_master_info from the internal data dictionary of InnoDB though the .frm file for the table exists. See http://dev.mysql.com/doc/refman/5.6/en/innodb-troubleshooting.html for how you can resolve the problem. 2017-09-21 16:00:04 8996 [Warning] InnoDB: Cannot open table mysql/slave_worker_info from the internal data dictionary of InnoDB though the .frm file for the table exists. See http://dev.mysql.com/doc/refman/5.6/en/innodb-troubleshooting.html for how you can resolve the problem.
仔细看看都是一些系统表的报错innodb_table_stats
,innodb_index_stats
,slave_master_info
,slave_worker_info
等表的报错。
这种情况之前遇到过,需要清理系统表数据和删除表,然后导入mysql_system_tables.sql
即可,操作如下:
1、登录数据库,进入mysql库,执行如下SQL删除5张表
mysql> use mysql; mysql> drop table if exists innodb_index_stats; mysql> drop table if exists innodb_table_stats; mysql> drop table if exists slave_master_info; mysql> drop table if exists slave_relay_log_info; mysql> drop table if exists slave_worker_info;
执行完后,可以用show tables查看一下,看表的数据是否已经比删除之前减少了,如果减少了,说明你成功了!
记住,一定要是drop table if exists
2、停止数据库,进入到数据库数据文件所在目录,删除上面5个表所对应的idb文件
/etc/init.d/mysqld stop cd data/appData/mysql/mysql rm -rf innodb_index_stats.ibd innodb_table_stats.ibd slave_master_info.ibd slave_relay_log_info.ibd slave_worker_info.ibd
3、重新启动数据库,进入到mysql库,重建上面被删除的表结构
数据库的建表脚本在mysql软件的安装目录的share目录下或者mysql的安装包的script目录下,我的mysql软件的安装路径为/data/app/mysql-3307/
# /etc/init.d/mysqld start # mysql -uxx -poo -P 3307 mysql> use mysql; mysql> source /data/app/mysql-3307/share/mysql/mysql_system_tables.sql; mysql> show tables; +---------------------------+ | Tables_in_mysql | +---------------------------+ | columns_priv | | db | | event | | func | | general_log | | help_category | | help_keyword | | help_relation | | help_topic | | innodb_index_stats | | innodb_table_stats | | ndb_binlog_index | | plugin | | proc | | procs_priv | | proxies_priv | | servers | | slave_master_info | | slave_relay_log_info | | slave_worker_info | | slow_log | | tables_priv | | time_zone | | time_zone_leap_second | | time_zone_name | | time_zone_transition | | time_zone_transition_type | | user | +---------------------------+ 28 rows in set (0.00 sec) mysql> desc innodb_table_stats; +--------------------------+---------------------+------+-----+-------------------+-----------------------------+ | Field | Type | Null | Key | Default | Extra | +--------------------------+---------------------+------+-----+-------------------+-----------------------------+ | database_name | varchar(64) | NO | PRI | NULL | | | table_name | varchar(64) | NO | PRI | NULL | | | last_update | timestamp | NO | | CURRENT_TIMESTAMP | on update CURRENT_TIMESTAMP | | n_rows | bigint(20) unsigned | NO | | NULL | | | clustered_index_size | bigint(20) unsigned | NO | | NULL | | | sum_of_other_index_sizes | bigint(20) unsigned | NO | | NULL | | +--------------------------+---------------------+------+-----+-------------------+-----------------------------+ 6 rows in set (0.00 sec)
说明表都正常了,再次查看mysql报错日志,就会发现没有了关于这系统表的报错日志,到这里就所有的故障和错误都处理好了。
分析总结
MYSQL参数之innodb_force_recovery
innodb_force_recovery影响整个InnoDB存储引擎的恢复状况。默认为0,表示当需要恢复时执行所有的.
innodb_force_recovery可以设置为1-6,大的数字包含前面所有数字的影响。当设置参数值大于0后,可以对表进行select,create,drop操作,但insert,update或者delete这类操作是不允许的。
(SRV_FORCE_IGNORE_CORRUPT):忽略检查到的corrupt页。
(SRV_FORCE_NO_BACKGROUND):阻止主线程的运行,如主线程需要执行full purge操作,会导致crash。
(SRV_FORCE_NO_TRX_UNDO):不执行事务回滚操作。
(SRV_FORCE_NO_IBUF_MERGE):不执行插入缓冲的合并操作。
(SRV_FORCE_NO_UNDO_LOG_SCAN):不查看重做日志,InnoDB存储引擎会将未提交的事务视为已提交。
(SRV_FORCE_NO_LOG_REDO):不执行前滚的操作。
官网介绍:https://dev.mysql.com/doc/refman/5.6/en/forcing-innodb-recovery.html
MYSQL参数之innodb_purge_threads
innodb_purge_threads 将purge线程从master线程分离出来,提高cpu使用率提升存储引擎性能,innodb1.2之后可以设置多个purge线程。
这里的一个重要知识点就是 对 innodb_force_recovery 参数的理解了,要是遇到数据损坏甚至是其他的损坏。可能上面的方法不行了,需要尝试另一个方法:insert into tb select * from ta limit X;甚至是dump出去,再load回来。






