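Problem description
Environment:
After disks were added to a disk group through asmca, the disk group on node 2 was force-dismounted. Mounting the disk group manually then failed with the errors below.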
ERROR: diskgroup DATA was not mounted
ORA-15032: not all alterations performed
ORA-15040: diskgroup is incomplete : ASM disk "102" is missing from group number "2"
ERROR: alter diskgroup data mount
Tue Sep 30 15:18:54 2014
ORA-15025: could not open disk "/dev/rhdiskpower71"
ORA-27041: unable to open file
IBM AIX RISC System/6000 Error: 16: Device busy
Additional information: 2
Additional information: 4
Tue Sep 30 15:18:54 2014
ASM Health Checker found 1 new failures
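Expert answer
1, Fault symptom analysis
The error messages above give us three important pieces of information:
1, the disk_number is 102
2, the disk path is /dev/rhdiskpower71
3, the status returned when opening the file is 16: Device busy
The disk path reported here turned out to be wrong, and it sent the whole analysis on a detour.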
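The three facts above can be pulled out of an alert log mechanically. The following is a minimal sketch, assuming the error stack is fed on standard input; the function name `summarize_mount_error` is hypothetical and not part of any Oracle tooling. It only reports what the log claims, so the path it prints still has to be verified independently.

```shell
# Hypothetical helper: print the disk number, disk path and OS error
# found in an ASM mount error stack read from stdin.
summarize_mount_error() {
  awk '
    # ORA-15040/15042 style line: ASM disk "<n>" is missing ...
    match($0, /ASM disk "[0-9]+" is missing/) {
      s = substr($0, RSTART, RLENGTH)
      gsub(/[^0-9]/, "", s)
      print "disk_number=" s
    }
    # ORA-15025 line: the first quoted string is the device path
    /ORA-15025/ && match($0, /"[^"]+"/) {
      print "path=" substr($0, RSTART + 1, RLENGTH - 2)
    }
    # OS error line, e.g. "... Error: 16: Device busy"
    match($0, /Error: [0-9]+: .*/) {
      print "os_error=" substr($0, RSTART + 7)
    }
  '
}
```

A possible use is `tail -200 alert_+ASM2.log | summarize_mount_error`, where the log file name is only an example.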
2, Collecting OS information
2.1 Collecting errpt information
QH-JYFX2@oracle[/home/oracle]errpt
65DE6DE3 0930113114 P S hdisk75 REQUESTED OPERATION CANNOT BE PERFORMED
65DE6DE3 0930113114 P S hdisk147 REQUESTED OPERATION CANNOT BE PERFORMED
65DE6DE3 0930112914 P S hdisk197 REQUESTED OPERATION CANNOT BE PERFORMED

There are a large number of error entries. Note that errpt reports the underlying hdisk names, not the aggregated (multipath pseudo-device) names.
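When errpt returns many entries, counting them per resource makes the affected hdisks stand out. This is a minimal sketch, assuming the default errpt column layout shown above (resource name in column 5); `count_errpt_by_resource` is a hypothetical name.

```shell
# Hypothetical helper: count errpt entries per resource name (column 5).
# Reads errpt output on stdin and prints "<count> <resource>",
# most frequent first.
count_errpt_by_resource() {
  awk '
    $1 != "IDENTIFIER" && NF >= 6 { n[$5]++ }   # skip the header line
    END { for (r in n) print n[r], r }
  ' | sort -k1,1nr -k2,2
}
```

It could be used as `errpt | count_errpt_by_resource` on each node.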
2.2 Collecting multipath information
# powermt display dev=all
Pseudo name=hdiskpower71
Symmetrix ID=000495900231
Logical device ID=04F9
state=alive; policy=SymmOpt; priority=0; queued-IOs=0
==============================================================================
---------------- Host ---------------   - Stor -   -- I/O Path -  -- Stats ---
###  HW Path                I/O Paths    Interf.   Mode    State  Q-IOs Errors
==============================================================================
   1 fscsi2                   hdisk147  FA  1fA   active  alive      0      0
   0 fscsi0                   hdisk75   FA  2fA   active  alive      0      0

The disks reporting errors are exactly hdisk147 and hdisk75, the two paths behind hdiskpower71. Node 1 has no errors for the corresponding disks:

# errpt|more
IDENTIFIER TIMESTAMP  T C RESOURCE_NAME  DESCRIPTION
B0EE9AF5   0930103714 T S hdisk205       REQUESTED OPERATION CANNOT BE PERFORMED
B0EE9AF5   0930103714 T S hdisk205       REQUESTED OPERATION CANNOT BE PERFORMED
B0EE9AF5   0930103614 T S hdisk201       REQUESTED OPERATION CANNOT BE PERFORMED
B0EE9AF5   0930103614 T S hdisk206       REQUESTED OPERATION CANNOT BE PERFORMED
B0EE9AF5   0930103614 T S hdisk206       REQUESTED OPERATION CANNOT BE PERFORMED
B0EE9AF5   0930103614 T S hdisk217       REQUESTED OPERATION CANNOT BE PERFORMED
B0EE9AF5   0930103614 T S hdisk217       REQUESTED OPERATION CANNOT BE PERFORMED
B0EE9AF5   0930103614 T S hdisk205       REQUESTED OPERATION CANNOT BE PERFORMED
B0EE9AF5   0930103614 T S hdisk205       REQUESTED OPERATION CANNOT BE PERFORMED
B0EE9AF5   0930103614 T S hdisk204       REQUESTED OPERATION CANNOT BE PERFORMED
So the failure is probably caused by an I/O problem at the OS level.
3, Analyzing the disks with kfod
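Because errpt names hdisks while ASM uses the hdiskpower pseudo-devices, a lookup table between the two saves manual searching. This is a minimal sketch that parses `powermt display dev=all` output in the layout shown above; `map_hdisk_to_pseudo` is a hypothetical name, and other PowerPath versions may format the output differently.

```shell
# Hypothetical helper: print "<hdisk> <pseudo device>" pairs from
# `powermt display dev=all` output read on stdin.
map_hdisk_to_pseudo() {
  awk '
    # Remember the pseudo device that the following path rows belong to
    /^Pseudo name=/ { sub(/^Pseudo name=/, ""); pseudo = $1; next }
    # Path rows contain a native device name such as hdisk147
    { for (i = 1; i <= NF; i++) if ($i ~ /^hdisk[0-9]+$/) print $i, pseudo }
  '
}
```

For example, `powermt display dev=all | map_hdisk_to_pseudo | grep -w hdisk147` would show which pseudo-device an errpt entry belongs to.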
Node 1:

HTZ1@grid[/oracle/grid/diag/asm/+asm/+ASM2/trace]kfod
--------------------------------------------------------------------------------
 Disk          Size Path                                     User     Group
================================================================================
   1:     102401 Mb /dev/rhdiskpower1                        grid     asmadmin
   2:     102401 Mb /dev/rhdiskpower107                      grid     asmadmin
   3:     102401 Mb /dev/rhdiskpower108                      grid     asmadmin
   4:     102401 Mb /dev/rhdiskpower109                      grid     asmadmin
   5:     102401 Mb /dev/rhdiskpower110                      grid     asmadmin
   6:     102401 Mb /dev/rhdiskpower111                      grid     asmadmin
   7:     102401 Mb /dev/rhdiskpower112                      grid     asmadmin
   8:     102401 Mb /dev/rhdiskpower113                      grid     asmadmin
   9:     102401 Mb /dev/rhdiskpower114                      grid     asmadmin
  10:     102401 Mb /dev/rhdiskpower115                      grid     asmadmin
  11:     102401 Mb /dev/rhdiskpower116                      grid     asmadmin
  12:     102401 Mb /dev/rhdiskpower7                        grid     asmadmin
  13:      10240 Mb /dev/rhdiskpower71                       grid     asmadmin
--------------------------------------------------------------------------------
ORACLE_SID ORACLE_HOME
================================================================================
     +ASM2 /oracle/product/grid
     +ASM1 /oracle/product/grid

Node 2 (the failed node):

HTZ2@grid[/home/grid]kfod
--------------------------------------------------------------------------------
 Disk          Size Path                                     User     Group
================================================================================
   1:     102401 Mb /dev/rhdiskpower0                        grid     asmadmin
   2:     102401 Mb /dev/rhdiskpower107                      grid     asmadmin
   3:     102401 Mb /dev/rhdiskpower108                      grid     asmadmin
   4:     102401 Mb /dev/rhdiskpower109                      grid     asmadmin
   5:     102401 Mb /dev/rhdiskpower110                      grid     asmadmin
   6:     102401 Mb /dev/rhdiskpower111                      grid     asmadmin
   7:     102401 Mb /dev/rhdiskpower112                      grid     asmadmin
   8:     102401 Mb /dev/rhdiskpower113                      grid     asmadmin
   9:     102401 Mb /dev/rhdiskpower114                      grid     asmadmin
  10:     102401 Mb /dev/rhdiskpower115                      grid     asmadmin
  11:     102401 Mb /dev/rhdiskpower116                      grid     asmadmin
  12:     102401 Mb /dev/rhdiskpower2                        grid     asmadmin
  13:     102401 Mb /dev/rhdiskpower3                        grid     asmadmin
  14:     102401 Mb /dev/rhdiskpower4                        grid     asmadmin
  15:     102401 Mb /dev/rhdiskpower5                        grid     asmadmin
  16:     102401 Mb /dev/rhdiskpower6                        grid     asmadmin
  17:      10240 Mb /dev/rhdiskpower71                       grid     asmadmin
--------------------------------------------------------------------------------
ORACLE_SID ORACLE_HOME
================================================================================
     +ASM2 /oracle/product/grid
     +ASM1 /oracle/product/grid
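The output above shows that the disk header of rhdiskpower71 can be read normally and that the disk does not belong to any disk group, so the disk path reported during the mount is most likely wrong.
4, Checking the disk header with kfed
On the failed node the header of rhdiskpower71 can be read normally, and it records the disk group OCRVOTING.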
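Comparing the two kfod listings by eye is error-prone, so the comparison can be scripted. This is a minimal sketch, assuming each node's kfod output has been saved to a file; `paths_missing_on_second` and the file names are hypothetical.

```shell
# Hypothetical helper: list /dev/... paths that appear in the first
# saved kfod output ($1) but not in the second ($2).
paths_missing_on_second() {
  awk '
    {
      for (i = 1; i <= NF; i++)
        if ($i ~ /^\/dev\//) {
          if (FILENAME == ARGV[1]) first[$i] = 1
          else second[$i] = 1
        }
    }
    END { for (p in first) if (!(p in second)) print p }
  ' "$1" "$2" | sort
}
```

After `kfod > /tmp/kfod_node1.txt` on node 1 and the equivalent on node 2, running `paths_missing_on_second /tmp/kfod_node1.txt /tmp/kfod_node2.txt` lists the devices only node 1 discovers; swapping the arguments gives the opposite direction.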
kfed op=read dev=/dev/rhdiskpower71
kfbh.endian: 0 ; 0x000: 0x00
kfbh.hard: 130 ; 0x001: 0x82
kfbh.type: 1 ; 0x002: KFBTYP_DISKHEAD
kfbh.datfmt: 1 ; 0x003: 0x01
kfbh.block.blk: 0 ; 0x004: blk=0
kfbh.block.obj: 2147483649 ; 0x008: disk=1
kfbh.check: 482783695 ; 0x00c: 0x1cc6b1cf
kfbh.fcn.base: 0 ; 0x010: 0x00000000
kfbh.fcn.wrap: 0 ; 0x014: 0x00000000
kfbh.spare1: 0 ; 0x018: 0x00000000
kfbh.spare2: 0 ; 0x01c: 0x00000000
kfdhdb.driver.provstr: ORCLDISK ; 0x000: length=8
kfdhdb.driver.reserved[0]: 0 ; 0x008: 0x00000000
kfdhdb.driver.reserved[1]: 0 ; 0x00c: 0x00000000
kfdhdb.driver.reserved[2]: 0 ; 0x010: 0x00000000
kfdhdb.driver.reserved[3]: 0 ; 0x014: 0x00000000
kfdhdb.driver.reserved[4]: 0 ; 0x018: 0x00000000
kfdhdb.driver.reserved[5]: 0 ; 0x01c: 0x00000000
kfdhdb.compat: 186646528 ; 0x020: 0x0b200000
kfdhdb.dsknum: 1 ; 0x024: 0x0001
kfdhdb.grptyp: 1 ; 0x026: KFDGTP_EXTERNAL
kfdhdb.hdrsts: 4 ; 0x027: KFDHDR_FORMER
kfdhdb.dskname: OCRVOTING_0001 ; 0x028: length=14
kfdhdb.grpname: OCRVOTING ; 0x048: length=9
kfdhdb.fgname: OCRVOTING_0001 ; 0x068: length=14
kfdhdb.capname: ; 0x088: length=0
kfdhdb.crestmp.hi: 32986709 ; 0x0a8: HOUR=0x15 DAYS=0x12 MNTH=0x5 YEAR=0x7dd
kfdhdb.crestmp.lo: 3509124096 ; 0x0ac: USEC=0x0 MSEC=0x23f SECS=0x12 MINS=0x34
kfdhdb.mntstmp.hi: 32986709 ; 0x0b0: HOUR=0x15 DAYS=0x12 MNTH=0x5 YEAR=0x7dd
kfdhdb.mntstmp.lo: 3534094336 ; 0x0b4: USEC=0x0 MSEC=0x180 SECS=0x2a MINS=0x34

The timestamps are from 2013 (YEAR=0x7dd), the header status is KFDHDR_FORMER, and ASM no longer holds any information about this disk group. So we can conclude that the disk name given in the mount error is probably wrong.
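Only a handful of kfed fields matter for this kind of check: the disk number, header status, disk name and group name. This is a minimal sketch that filters them out of kfed output in the layout shown above; `kfed_summary` is a hypothetical name.

```shell
# Hypothetical helper: print the identifying disk-header fields from
# kfed output read on stdin.
kfed_summary() {
  awk '
    $1 ~ /^kfdhdb\.(dsknum|hdrsts|dskname|grpname):$/ {
      key = $1
      sub(/^kfdhdb\./, "", key)
      sub(/:$/, "", key)
      # For hdrsts the symbolic name (5th field) is more readable
      # than the numeric value (2nd field)
      val = (key == "hdrsts") ? $5 : $2
      print key "=" val
    }
  '
}
```

It could be used as `kfed op=read dev=/dev/rhdiskpower71 | kfed_summary`, using the same kfed invocation as in the transcript above.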
5, Querying the V$ASM_DISK view
Node 1 (the healthy node):

SQL> select group_number,disk_number,name,path from v$asm_disk order by group_number,disk_number;

GROUP_NUMBER DISK_NUMBER NAME                           PATH
------------ ----------- ------------------------------ --------------------
           0          61                                /dev/rhdiskpower71
...................................................................
           2         101 DATA_0101                      /dev/rhdiskpower97
           2         102 DATA_0102                      /dev/rhdiskpower96

Two things are worth noting here. On node 1, rhdiskpower71 has DISK_NUMBER 61, while the alert log on node 2 reported disk 102 together with that path. On node 1, DISK_NUMBER 102 actually corresponds to /dev/rhdiskpower96.

Node 2 (the failed node):

SQL> select group_number,disk_number,name,path from v$asm_disk order by group_number,disk_number;

GROUP_NUMBER DISK_NUMBER NAME                           PATH
------------ ----------- ------------------------------ --------------------
           0           0                                /dev/rhdiskpower71
....................................................................
           0         104                                /dev/rhdiskpower94
           0         105                                /dev/rhdiskpower95
           0         106                                /dev/rhdiskpower97

Here rhdiskpower71 has DISK_NUMBER 0, and rhdiskpower96 does not appear anywhere in v$asm_disk, which means node 2 cannot read the disk header of rhdiskpower96. We therefore suspected that /dev/rhdiskpower96 was the disk causing the failure.
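6, Suspecting the multipath software
At first we thought the multipath software had built the devices incorrectly, but this was eventually ruled out. The commands below were used; the original output was not saved, so another disk is used here to demonstrate them.
They need to be run on both nodes so that the results can be compared for consistency.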
# powermt display dev=all
Pseudo name=hdiskpower95
Symmetrix ID=000495900231
Logical device ID=0672
state=alive; policy=SymmOpt; priority=0; queued-IOs=0
==============================================================================
---------------- Host ---------------   - Stor -   -- I/O Path -  -- Stats ---
###  HW Path                I/O Paths    Interf.   Mode    State  Q-IOs Errors
==============================================================================
   0 fscsi0                   hdisk171  FA  2fA   active  alive      0      0
   1 fscsi2                   hdisk196  FA  1fA   active  alive      0      0

# lscfg -vl hdisk171
  hdisk171 U78AA.001.WZSHWF4-P1-C2-T1-W5000097910039D44-L60000000000000 EMC Symmetrix FCP VRAID

        Manufacturer................EMC
        Machine Type and Model......SYMMETRIX
        ROS Level and ID............5876
        Serial Number...............31672520
        Part Number.................000000000000400551000495
        EC Level....................900231
        LIC Node VPD................0672
        Device Specific.(Z0)........04
        Device Specific.(Z1)........40
        Device Specific.(Z2)........000000300D9C09000260019C
        Device Specific.(Z3)........12000000
        Device Specific.(Z4)........54130000
        Device Specific.(Z5)........E780
        Device Specific.(Z6)........45

# lscfg -vl hdisk196
  hdisk196 U78AA.001.WZSHWF4-P1-C3-T1-W5000097910039D40-L60000000000000 EMC Symmetrix FCP VRAID

        Manufacturer................EMC
        Machine Type and Model......SYMMETRIX
        ROS Level and ID............5876
        Serial Number...............31672510
        Part Number.................000000000000400550000495
        EC Level....................900231
        LIC Node VPD................0672
        Device Specific.(Z0)........04
        Device Specific.(Z1)........40
        Device Specific.(Z2)........000000300D9C09000260019C
        Device Specific.(Z3)........12000000
        Device Specific.(Z4)........54130000
        Device Specific.(Z5)........E780
        Device Specific.(Z6)........45
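To compare the two nodes, it is enough to reduce the powermt output to one line per pseudo-device with its array logical device ID and then diff the results. This is a minimal sketch for the Symmetrix output format shown above; `pseudo_to_logical_id` and the file names in the usage note are hypothetical.

```shell
# Hypothetical helper: print "<pseudo device> <logical device ID>" for
# every device in `powermt display dev=all` output read on stdin.
pseudo_to_logical_id() {
  awk -F= '
    /^Pseudo name=/       { pseudo = $2 }
    /^Logical device ID=/ { print pseudo, $2 }
  ' | sort
}
```

Running `powermt display dev=all | pseudo_to_logical_id > /tmp/pp_$(hostname).txt` on each node and then diffing the two files would show any pseudo-device that maps to a different LUN on the other node.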
7, Testing disk access with dd
The earlier kfod output did not list disk 96 either, so node 2 probably cannot access it normally.

HTZ2@grid[/home/grid]ls -l /dev/rhdiskpower96
crw-rw---- 1 grid asmadmin 43, 96 Nov 21 2013 /dev/rhdiskpower96
QH-JYFX2@grid[/home/grid]dd if=/dev/rhdiskpower96 of=/tmp/test bs=4K count=1
dd: /dev/rhdiskpower96: The requested resource is busy.

HTZ1@grid[/oracle/grid/admin/olm]ls -l /dev/rhdiskpower96
crw-rw---- 1 grid asmadmin 43, 96 Sep 30 13:34 /dev/rhdiskpower96

The permissions are correct, but dd reports that the resource is busy, and at this point the cause was unknown. We did not check at the time whether the device was also busy on the healthy node; before the next go-live, this should be tested first.
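The dd check above can be repeated over a list of devices before new disks are handed to ASM. This is a minimal sketch; `check_readable` is a hypothetical name, the block size is written as 4096 for portability, and the data is discarded to /dev/null instead of a scratch file.

```shell
# Hypothetical helper: try to read one 4 KB block from each device given
# as an argument and report whether the open/read succeeded.
check_readable() {
  for dev in "$@"; do
    if dd if="$dev" of=/dev/null bs=4096 count=1 >/dev/null 2>&1; then
      echo "OK   $dev"
    else
      echo "FAIL $dev"
    fi
  done
}
```

For example, `check_readable /dev/rhdiskpower95 /dev/rhdiskpower96`, run as the grid user on every node, would have flagged rhdiskpower96 on node 2.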
Node 1:

# for i in 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107
> do
> echo $i
> lsattr -El hdiskpower`echo $i`|grep reserve
> done
90
reserve_lock no Reserve device on open True
91
reserve_lock no Reserve device on open True
92
reserve_lock no Reserve device on open True
93
reserve_lock no Reserve device on open True
94
reserve_lock no Reserve device on open True
95
reserve_lock no Reserve device on open True
96
reserve_lock yes Reserve device on open True
97
reserve_lock no Reserve device on open True

The failed node:

# for i in 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107
> do
> echo $i
> lsattr -El hdiskpower`echo $i`|grep reserve
> done
90
reserve_lock no Reserve device on open True
91
reserve_lock no Reserve device on open True
92
reserve_lock no Reserve device on open True
93
reserve_lock no Reserve device on open True
94
reserve_lock no Reserve device on open True
95
reserve_lock no Reserve device on open True
96
reserve_lock no Reserve device on open True
97
reserve_lock no Reserve device on open True
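On node 1 the reserve_lock attribute of disk 96 is yes, while every other disk has no. With reserve_lock=yes, node 1 reserves the device when it opens it, which would explain why node 2 gets "Device busy" on the same LUN. The official documentation also requires this attribute to be no. In newer versions (5.5) of the multipath software, the corresponding attribute should be reserve_policy.
9, Resolving the fault
Because the disk is in use, its reserve_lock attribute cannot be changed with chdev, and the business cannot be stopped either, so we resolve the problem by dropping the disk from the disk group.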
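Spotting a single "yes" in a long listing is easy to miss, so the check can be automated. This is a minimal sketch that reads output in the same shape as the loop above (a device number on one line, the lsattr attribute line on the next); `find_reserved` is a hypothetical name, and treating `no_reserve` as the safe value for reserve_policy is an assumption based on the usual AIX MPIO attribute values.

```shell
# Hypothetical helper: read "<device>" lines followed by lsattr
# "reserve_*" lines on stdin and print devices that hold a reservation.
find_reserved() {
  awk '
    NF == 1 { dev = $1; next }                 # line printed by echo $i
    $1 ~ /^reserve_(lock|policy)$/ &&
    $2 != "no" && $2 != "no_reserve" { print dev, $1 "=" $2 }
  '
}
```

Piping the loop from the transcript into `find_reserved` on each node prints only the offending devices, here `96 reserve_lock=yes` on node 1.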
# chdev -l hdiskpower96 -a reserve_lock=no
Method error (/etc/methods/chgpowerdisk):
        0514-062 Cannot perform the requested function because the
                 specified device is busy.

SQL> alter diskgroup data drop disk 'DATA_0102'
NOTE: GroupBlock outside rolling migration privileged region
Tue Sep 30 18:15:58 2014
NOTE: stopping process ARB0
NOTE: rebalance interrupted for group 2/0x5aa7d83 (DATA)
NOTE: requesting all-instance membership refresh for group=2
NOTE: membership refresh pending for group 2/0x5aa7d83 (DATA)
Tue Sep 30 18:16:07 2014
GMON querying group 2 at 155 for pid 18, osid 6095188
SUCCESS: refreshed membership for 2/0x5aa7d83 (DATA)
SUCCESS: alter diskgroup data drop disk 'DATA_0102'
NOTE: starting rebalance of group 2/0x5aa7d83 (DATA) at power 1
Starting background process ARB0
Tue Sep 30 18:16:07 2014
ARB0 started with pid=32, OS id=39846174
NOTE: assigning ARB0 to group 2/0x5aa7d83 (DATA) with 1 parallel I/O
Tue Sep 30 18:16:10 2014
NOTE: Attempting voting file refresh on diskgroup DATA
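To speed up the rebalance, you can specify the power manually; the default is 1.
After that, the disk group on node 2 was mounted manually without errors, and the fault was resolved.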
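Raising the rebalance power is done with an `alter diskgroup ... rebalance power n` statement. The following is a minimal sketch that only builds the statement text after validating the number; `rebalance_sql` is a hypothetical name, and the acceptable upper limit depends on the ASM version and compatibility settings, so it is not checked here.

```shell
# Hypothetical helper: print an ALTER DISKGROUP ... REBALANCE statement
# for disk group $1 with power $2 (must be a non-negative integer).
rebalance_sql() {
  group=$1
  power=$2
  case $power in
    ''|*[!0-9]*)
      echo "power must be a non-negative integer" >&2
      return 1
      ;;
  esac
  printf 'alter diskgroup %s rebalance power %s;\n' "$group" "$power"
}
```

One way to use it is `rebalance_sql DATA 8 | sqlplus -S "/ as sysasm"` from the grid user's environment; progress can then be followed in v$asm_operation.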