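Symptoms
On a platform's standby database server, a large number of system configuration files began reporting errors; the operating system then crashed and the server went down.
Analysis summary
The messages shown at system startup, together with the errors logged before the crash, indicate that many configuration files under /etc had become read-only and could no longer be accessed, read, or written normally.
Because the system files were corrupted and the system could not be recovered, we decided to reinstall the operating system and reconfigure the Data Guard standby database, after which business service returned to normal.
Diagnosis and resolution
1. Diagnosis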
Mon Aug 17 11:00:02 CST 2015
Errors in file /opt/oracle/admin/oric/udump/oric_ora_13154.trc:
ORA-00206: error in writing (block 814, # blocks 1) of control file
ORA-00202: control file: '/emcoradata/oric/controlfile/control02.ctl'
ORA-27041: unable to open file
Linux-x86_64 Error: 30: Read-only file system
Additional information: 3
ORA-00206: error in writing (block 814, # blocks 1) of control file
ORA-00202: control file: '/emcoradata/oric/controlfile/control01.ctl'
ORA-27041: unable to open file
Linux-x86_64 Error: 30: Read-only file system
Additional information: 3
dbbak:/emcoradata/oric/controlfile/ # vi init.ora
vi: cannot vi `init.ora': Read-only file system
dbbak:/etc/init.d/ # chmod 640 init.crs
chmod: cannot chmod `init.crs': Read-only file system
At this point the system crashed and the server went down. On the next boot, the console showed the following:
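The "Read-only file system" errors above (Linux errno 30) are consistent with the affected file systems being mounted read-only when the writes were attempted. The following is a minimal sketch, not part of the original troubleshooting, of how read-only mounts could be listed from text in /proc/mounts format. The function name read_only_mounts and the device names in the sample are hypothetical; only the mount points / and /emcoradata come from the paths in the logs.

```python
from typing import List, Tuple


def read_only_mounts(mounts_text: str) -> List[Tuple[str, str]]:
    """Return (device, mount_point) pairs whose mount options include 'ro'.

    Expects text in /proc/mounts format:
    device mount_point fstype options dump pass
    """
    result = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        device, mount_point, _fstype, options = fields[:4]
        # Options are comma-separated; match 'ro' as a whole option only.
        if "ro" in options.split(","):
            result.append((device, mount_point))
    return result


# Hypothetical sample: the device names are assumptions for illustration.
SAMPLE_MOUNTS = """\
/dev/cciss/c0d0p3 / ext3 ro,data=ordered 0 0
proc /proc proc rw 0 0
/dev/emcpowera1 /emcoradata ext3 ro,data=ordered 0 0
/dev/cciss/c0d0p1 /boot ext3 rw 0 0
"""

for device, mount_point in read_only_mounts(SAMPLE_MOUNTS):
    print(f"{device} is mounted read-only on {mount_point}")
```

On a live system the same function could be fed the contents of /proc/mounts. A check like this only reports the symptom; it does not explain why a file system became read-only.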
Waiting for device /dev/disk/by-id/cciss-3600508b100184d3953594331564a0006-part3 to appear: ok
Rootfs: major=104 minor=3 devn=26627
fsck 1.38 (30-Jun-2005)
[/bin/fsck.ext3 (1) -- /] fsck.ext3 -a /dev/disk/by-id/cciss-3600508b100184d3953594331564a0006-part3
/dev/disk/by-id/cciss-3600508b100184d3953594331564a0006-part3 recovering journal
/dev/disk/by-id/cciss-3600508b100184d3953594331564a0006-part3 contains a file system with errors, check forced.
/dev/disk/by-id/cciss-3600508b100184d3953594331564a0006-part3: duplicate or bad block in use!
/dev/disk/by-id/cciss-3600508b100184d3953594331564a0006-part3: multiply-claimed block(s) in inode 32921: 86016
/dev/disk/by-id/cciss-3600508b100184d3953594331564a0006-part3: multiply-claimed block(s) in inode 33152: 86017
/dev/disk/by-id/cciss-3600508b100184d3953594331564a0006-part3: multiply-claimed block(s) in inode 33189: 86018
/dev/disk/by-id/cciss-3600508b100184d3953594331564a0006-part3: multiply-claimed block(s) in inode 37220: 86016 86017 86018
Illegal block number passed to ext2fs_test_block_bitmap #33554432 for multiply claimed block map
INIT: execute "/sbin/mingetty"
INIT: cannot execute "/sbin/mingetty"
Id "5" respawning too fast: disabled for 5 minutes
Id "2" respawning too fast: disabled for 5 minutes
Id "3" respawning too fast: disabled for 5 minutes
Id "1" respawning too fast: disabled for 5 minutes
Id "6" respawning too fast: disabled for 5 minutes
Id "4" respawning too fast: disabled for 5 minutes
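The fsck messages above show several inodes claiming the same blocks on the root file system. As an illustration only, and not something done in the original troubleshooting, the sketch below parses "multiply-claimed block(s)" lines and groups inodes by the blocks they share. The function name shared_blocks is hypothetical, and the sample lines are the fsck messages from the console output with the device prefix trimmed.

```python
import re
from collections import defaultdict
from typing import Dict, Iterable, List

# Matches e.g. "multiply-claimed block(s) in inode 32921: 86016"
# and tolerates missing spaces or a stray colon after "inode".
CLAIM_RE = re.compile(
    r"multiply-claimed block\(s\) in inode:?\s*(\d+):\s*([\d\s]+)$"
)


def shared_blocks(fsck_lines: Iterable[str]) -> Dict[int, List[int]]:
    """Map each block number to the inodes claiming it (shared blocks only)."""
    owners: Dict[int, List[int]] = defaultdict(list)
    for line in fsck_lines:
        match = CLAIM_RE.search(line.strip())
        if not match:
            continue  # not a multiply-claimed message
        inode = int(match.group(1))
        for block in match.group(2).split():
            owners[int(block)].append(inode)
    # Keep only blocks that more than one inode claims.
    return {blk: inodes for blk, inodes in owners.items() if len(inodes) > 1}


# Sample lines taken from the console output, device prefix trimmed.
SAMPLE_FSCK = [
    "multiply-claimed block(s) in inode 32921: 86016",
    "multiply-claimed block(s) in inode 33152: 86017",
    "multiply-claimed block(s) in inode 33189: 86018",
    "multiply-claimed block(s) in inode 37220: 86016 86017 86018",
]

for block, inodes in sorted(shared_blocks(SAMPLE_FSCK).items()):
    print(f"block {block} is claimed by inodes {inodes}")
```

For this sample, inode 37220 overlaps with each of the other three inodes on one block apiece, which matches the "duplicate or bad block in use" message reported by fsck.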
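2. Resolution
We reinstalled the operating system and then reconfigured Data Guard. With that, business service returned to normal.
Improvement measures
Based on the cause of this failure, we offer the following recommendations: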

Author: Wang Wei (上海新炬王翦团队)
Source: the "IT那活儿" WeChat official account





