Simulating and repairing disk header corruption on the vote and data disks in a RAC environment

Original article by Leo, 2022-12-28

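Topic: simulating and repairing disk header corruption on the vote disk and the data disk in a RAC environment.

OS: CentOS 7.9, 64-bit

Database: Oracle 11.2.0.4, 64-bit

Environment: RAC (two nodes)

1. Disk group information

1.1 System information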

[root@hisdb1 ~]# cat /etc/*release

CentOS Linux release 7.9.2009 (Core)

NAME="CentOS Linux"

VERSION="7 (Core)"

ID="centos"

ID_LIKE="rhel fedora"

VERSION_ID="7"

PRETTY_NAME="CentOS Linux 7 (Core)"

ANSI_COLOR="0;31"

CPE_NAME="cpe:/o:centos:centos:7"

HOME_URL="https://www.centos.org/"

BUG_REPORT_URL="https://bugs.centos.org/"

 

CENTOS_MANTISBT_PROJECT="CentOS-7"

CENTOS_MANTISBT_PROJECT_VERSION="7"

REDHAT_SUPPORT_PRODUCT="centos"

REDHAT_SUPPORT_PRODUCT_VERSION="7"

 

CentOS Linux release 7.9.2009 (Core)

CentOS Linux release 7.9.2009 (Core)

1.2 Disk information

SQL> select group_number,name,path,state,total_mb,free_mb from v$asm_disk where name is not null order by path;

 

GROUP_NUMBER NAME            PATH                 STATE      TOTAL_MB    FREE_MB

------------ --------------- -------------------- -------- ---------- ----------

           2 DATA02          ORCL:DATA02          NORMAL        10239       6662

           1 DATA03          ORCL:DATA03          NORMAL        20479      13765

           3 DATA04          ORCL:DATA04          NORMAL        10239       9843

SQL> select group_number,name,type,total_mb,free_mb from v$asm_diskgroup;

 

GROUP_NUMBER NAME            TYPE     TOTAL_MB    FREE_MB

------------ --------------- ------ ---------- ----------

           1 DATA            EXTERN      20479      13765

           2 FRA             EXTERN      10239       6662

           3 OCRBK           EXTERN      10239       9843

[root@hisdb1 disks]# pwd

/dev/oracleasm/disks

[root@hisdb1 disks]# ll /dev/oracleasm/disks/*

brw-rw---- 1 grid asmadmin 8, 17 Dec 27 20:27 /dev/oracleasm/disks/DATA01

brw-rw---- 1 grid asmadmin 8, 33 Dec 27 20:27 /dev/oracleasm/disks/DATA02

brw-rw---- 1 grid asmadmin 8, 49 Dec 27 20:27 /dev/oracleasm/disks/DATA03

brw-rw---- 1 grid asmadmin 8, 65 Dec 27 20:27 /dev/oracleasm/disks/DATA04

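Note: DATA04 above is the vote disk (disk group OCRBK) and DATA03 is the data disk (disk group DATA).

2. Vote disk

Simulate corruption of the vote disk and repair it.

2.1 Copying the data

-- Copy one 8 KB block from /dev/oracleasm/disks/DATA04 to /home/grid/data04.dd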

[grid@hisdb1 disks]$ dd if=/dev/oracleasm/disks/DATA04 of=/home/grid/data04.dd bs=8192 count=1

1+0 records in

1+0 records out

8192 bytes (8.2 kB) copied, 0.000340858 s, 24.0 MB/s

[grid@hisdb1 ~]$ ll data04.dd

-rw-r--r-- 1 grid oinstall 8192 Dec 27 21:32 data04.dd

-- Use kfed to read the disk header of /dev/oracleasm/disks/DATA04.

[grid@hisdb1 ~]$ kfed read /dev/oracleasm/disks/DATA04 text=data04.txt

[grid@hisdb1 ~]$ head data04.txt

kfbh.endian:                          1 ; 0x000: 0x01

kfbh.hard:                          130 ; 0x001: 0x82

kfbh.type:                            1 ; 0x002: KFBTYP_DISKHEAD

kfbh.datfmt:                          1 ; 0x003: 0x01

kfbh.block.blk:                       0 ; 0x004: blk=0

kfbh.block.obj:              2147483648 ; 0x008: disk=0

kfbh.check:                  3855329304 ; 0x00c: 0xe5cba818

kfbh.fcn.base:                        0 ; 0x010: 0x00000000

kfbh.fcn.wrap:                        0 ; 0x014: 0x00000000

kfbh.spare1:                          0 ; 0x018: 0x00000000
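
The field to watch in this dump is kfbh.type: a healthy header reports KFBTYP_DISKHEAD. As a minimal sketch, that check can be scripted by parsing the kfed text output. The function names kfed_header_type and is_valid_asm_header are hypothetical helpers introduced here for illustration; they are not part of kfed.

```shell
# Print the symbolic block type from a kfed text dump read on stdin,
# e.g. KFBTYP_DISKHEAD for a healthy header.
kfed_header_type() {
    awk '$1 == "kfbh.type:" { print $NF; exit }'
}

# Return 0 if the dump on stdin describes a valid ASM disk header.
is_valid_asm_header() {
    [ "$(kfed_header_type)" = "KFBTYP_DISKHEAD" ]
}

# Example (assumes kfed is on the PATH of the grid user):
# kfed read /dev/oracleasm/disks/DATA04 | is_valid_asm_header && echo "header OK"
```

Matching on the symbolic name rather than the numeric value keeps the check readable and follows what the dump itself prints.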

2.2 Corrupting the disk

-- Corrupt the disk in the vote disk group by overwriting its first 8 KB with zeros

[grid@hisdb1 ~]$ dd if=/dev/zero of=/dev/oracleasm/disks/DATA04 bs=8192 count=1

1+0 records in

1+0 records out

8192 bytes (8.2 kB) copied, 0.000128522 s, 63.7 MB/s

[grid@hisdb1 ~]$ kfed read /dev/oracleasm/disks/DATA04 | head

kfbh.endian:                          0 ; 0x000: 0x00

kfbh.hard:                            0 ; 0x001: 0x00

kfbh.type:                            0 ; 0x002: KFBTYP_INVALID

kfbh.datfmt:                          0 ; 0x003: 0x00

kfbh.block.blk:                       0 ; 0x004: blk=0

kfbh.block.obj:                       0 ; 0x008: file=0

kfbh.check:                           0 ; 0x00c: 0x00000000

kfbh.fcn.base:                        0 ; 0x010: 0x00000000

kfbh.fcn.wrap:                        0 ; 0x014: 0x00000000

kfbh.spare1:                          0 ; 0x018: 0x00000000
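
Besides reading the header with kfed, the damage can be confirmed by comparing the device against the copy saved in 2.1. The following is a sketch only: header_matches_backup is a hypothetical helper, and it assumes GNU cmp (the -n byte limit is a GNU diffutils option, which CentOS ships).

```shell
# Return 0 if the first 8192 bytes of the device equal the saved copy.
# $1 = backup file written by dd, $2 = device (or file) to check.
header_matches_backup() {
    cmp -s -n 8192 "$1" "$2"
}

# Example:
# header_matches_backup /home/grid/data04.dd /dev/oracleasm/disks/DATA04 \
#     || echo "header differs from the saved copy"
```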

2.3 Reproducing the failure

-- Restart the cluster

[root@hisdb1 ~]# /u01/app/11.2.0/grid/bin/crsctl stop cluster -all

CRS-2673: Attempting to stop 'ora.crsd' on 'hisdb1'

CRS-2790: Starting shutdown of Cluster Ready Services-managed resources on 'hisdb1'

CRS-2673: Attempting to stop 'ora.LISTENER.lsnr' on 'hisdb1'

CRS-2673: Attempting to stop 'ora.OCRBK.dg' on 'hisdb1'

CRS-2673: Attempting to stop 'ora.FRA.dg' on 'hisdb1'

CRS-2673: Attempting to stop 'ora.heal.db' on 'hisdb1'

CRS-2677: Stop of 'ora.LISTENER.lsnr' on 'hisdb1' succeeded

CRS-2673: Attempting to stop 'ora.hisdb1.vip' on 'hisdb1'

CRS-2677: Stop of 'ora.FRA.dg' on 'hisdb1' succeeded

CRS-2677: Stop of 'ora.heal.db' on 'hisdb1' succeeded

CRS-2673: Attempting to stop 'ora.DATA.dg' on 'hisdb1'

CRS-2677: Stop of 'ora.hisdb1.vip' on 'hisdb1' succeeded

CRS-2677: Stop of 'ora.DATA.dg' on 'hisdb1' succeeded

CRS-2673: Attempting to stop 'ora.crsd' on 'hisdb2'

CRS-2790: Starting shutdown of Cluster Ready Services-managed resources on 'hisdb2'

CRS-2673: Attempting to stop 'ora.LISTENER_SCAN1.lsnr' on 'hisdb2'

CRS-2673: Attempting to stop 'ora.cvu' on 'hisdb2'

CRS-2673: Attempting to stop 'ora.LISTENER.lsnr' on 'hisdb2'

CRS-2673: Attempting to stop 'ora.OCRBK.dg' on 'hisdb2'

CRS-2673: Attempting to stop 'ora.FRA.dg' on 'hisdb2'

CRS-2673: Attempting to stop 'ora.heal.db' on 'hisdb2'

CRS-2673: Attempting to stop 'ora.oc4j' on 'hisdb2'

CRS-2677: Stop of 'ora.cvu' on 'hisdb2' succeeded

CRS-2677: Stop of 'ora.LISTENER.lsnr' on 'hisdb2' succeeded

CRS-2673: Attempting to stop 'ora.hisdb2.vip' on 'hisdb2'

CRS-2677: Stop of 'ora.LISTENER_SCAN1.lsnr' on 'hisdb2' succeeded

CRS-2673: Attempting to stop 'ora.scan1.vip' on 'hisdb2'

CRS-2677: Stop of 'ora.FRA.dg' on 'hisdb2' succeeded

CRS-2677: Stop of 'ora.heal.db' on 'hisdb2' succeeded

CRS-2673: Attempting to stop 'ora.DATA.dg' on 'hisdb2'

CRS-2677: Stop of 'ora.hisdb2.vip' on 'hisdb2' succeeded

CRS-2677: Stop of 'ora.DATA.dg' on 'hisdb2' succeeded

CRS-2677: Stop of 'ora.scan1.vip' on 'hisdb2' succeeded

CRS-2677: Stop of 'ora.OCRBK.dg' on 'hisdb1' succeeded

CRS-2673: Attempting to stop 'ora.asm' on 'hisdb1'

CRS-2677: Stop of 'ora.asm' on 'hisdb1' succeeded

CRS-2673: Attempting to stop 'ora.ons' on 'hisdb1'

CRS-2677: Stop of 'ora.oc4j' on 'hisdb2' succeeded

CRS-2677: Stop of 'ora.ons' on 'hisdb1' succeeded

CRS-2673: Attempting to stop 'ora.net1.network' on 'hisdb1'

CRS-2677: Stop of 'ora.net1.network' on 'hisdb1' succeeded

CRS-2792: Shutdown of Cluster Ready Services-managed resources on 'hisdb1' has completed

CRS-2677: Stop of 'ora.crsd' on 'hisdb1' succeeded

CRS-2673: Attempting to stop 'ora.ctssd' on 'hisdb1'

CRS-2673: Attempting to stop 'ora.evmd' on 'hisdb1'

CRS-2673: Attempting to stop 'ora.asm' on 'hisdb1'

CRS-2677: Stop of 'ora.evmd' on 'hisdb1' succeeded

CRS-2677: Stop of 'ora.OCRBK.dg' on 'hisdb2' succeeded

CRS-2673: Attempting to stop 'ora.asm' on 'hisdb2'

CRS-2677: Stop of 'ora.asm' on 'hisdb2' succeeded

CRS-2673: Attempting to stop 'ora.ons' on 'hisdb2'

CRS-2677: Stop of 'ora.ons' on 'hisdb2' succeeded

CRS-2673: Attempting to stop 'ora.net1.network' on 'hisdb2'

CRS-2677: Stop of 'ora.net1.network' on 'hisdb2' succeeded

CRS-2792: Shutdown of Cluster Ready Services-managed resources on 'hisdb2' has completed

CRS-2677: Stop of 'ora.crsd' on 'hisdb2' succeeded

CRS-2673: Attempting to stop 'ora.ctssd' on 'hisdb2'

CRS-2673: Attempting to stop 'ora.evmd' on 'hisdb2'

CRS-2673: Attempting to stop 'ora.asm' on 'hisdb2'

CRS-2677: Stop of 'ora.evmd' on 'hisdb2' succeeded

CRS-2677: Stop of 'ora.ctssd' on 'hisdb1' succeeded

CRS-2677: Stop of 'ora.asm' on 'hisdb1' succeeded

CRS-2673: Attempting to stop 'ora.cluster_interconnect.haip' on 'hisdb1'

CRS-2677: Stop of 'ora.ctssd' on 'hisdb2' succeeded

CRS-2677: Stop of 'ora.cluster_interconnect.haip' on 'hisdb1' succeeded

CRS-2673: Attempting to stop 'ora.cssd' on 'hisdb1'

CRS-2677: Stop of 'ora.cssd' on 'hisdb1' succeeded

CRS-2677: Stop of 'ora.asm' on 'hisdb2' succeeded

CRS-2673: Attempting to stop 'ora.cluster_interconnect.haip' on 'hisdb2'

CRS-2677: Stop of 'ora.cluster_interconnect.haip' on 'hisdb2' succeeded

CRS-2673: Attempting to stop 'ora.cssd' on 'hisdb2'

CRS-2677: Stop of 'ora.cssd' on 'hisdb2' succeeded

[root@hisdb1 ~]# /u01/app/11.2.0/grid/bin/crsctl start cluster -all

CRS-2672: Attempting to start 'ora.cssdmonitor' on 'hisdb2'

CRS-2672: Attempting to start 'ora.cssdmonitor' on 'hisdb1'

CRS-2676: Start of 'ora.cssdmonitor' on 'hisdb2' succeeded

CRS-2672: Attempting to start 'ora.cssd' on 'hisdb2'

CRS-2676: Start of 'ora.cssdmonitor' on 'hisdb1' succeeded

CRS-2672: Attempting to start 'ora.cssd' on 'hisdb1'

CRS-2672: Attempting to start 'ora.diskmon' on 'hisdb2'

CRS-2676: Start of 'ora.diskmon' on 'hisdb2' succeeded

CRS-2672: Attempting to start 'ora.diskmon' on 'hisdb1'

CRS-2676: Start of 'ora.diskmon' on 'hisdb1' succeeded

……

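Note: the start command hangs at this point. The corrupted disk is the voting disk, so the cluster cannot start.

2.4 Related alerts

-- ocssd.log keeps reporting the following errors: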

2022-12-27 22:17:15.937: [    CSSD][3821278976]clssnmvDiskVerify: Successful discovery of 0 disks

2022-12-27 22:17:15.937: [    CSSD][3821278976]clssnmCompleteInitVFDiscovery: Completing initial voting file discovery

2022-12-27 22:17:15.937: [    CSSD][3821278976]clssnmvFindInitialConfigs: No voting files found

2022-12-27 22:17:15.937: [    CSSD][3821278976](:CSSNM00070:)clssnmCompleteInitVFDiscovery: Voting file not found. Retrying discovery in 15 seconds

2022-12-27 22:17:15.996: [    CSSD][3823675136]clssscSelect: cookie accept request 0x7fa0d80845c0

2022-12-27 22:17:15.996: [    CSSD][3823675136]clssscevtypSHRCON: getting client with cmproc 0x7fa0d80845c0

2022-12-27 22:17:15.996: [    CSSD][3823675136]clssgmRegisterClient: proc(4/0x7fa0d80845c0), client(358/0x7fa0d8071230)

2022-12-27 22:17:15.996: [    CSSD][3823675136]clssgmExecuteClientRequest(): type(6) size(684) only connect and exit messages are allowed before lease acquisition proc(0x7fa0d80845c0) client(0x7fa0d8071230)

2022-12-27 22:17:15.996: [    CSSD][3823675136]clssgmDiscEndpcl: gipcDestroy 0x5976

2022-12-27 22:17:16.329: [    CSSD][3823675136]clssscSelect: cookie accept request 0x7fa0d8099e80

2022-12-27 22:17:16.329: [    CSSD][3823675136]clssscevtypSHRCON: getting client with cmproc 0x7fa0d8099e80

2022-12-27 22:17:16.329: [    CSSD][3823675136]clssgmRegisterClient: proc(5/0x7fa0d8099e80), client(357/0x7fa0d8071230)

2022-12-27 22:17:16.329: [    CSSD][3823675136]clssgmExecuteClientRequest(): type(6) size(684) only connect and exit messages are allowed before lease acquisition proc(0x7fa0d8099e80) client(0x7fa0d8071230)

2022-12-27 22:17:16.329: [    CSSD][3823675136]clssgmDiscEndpcl: gipcDestroy 0x598c

2022-12-27 22:17:16.998: [    CSSD][3823675136]clssscSelect: cookie accept request 0x7fa0d80845c0

2022-12-27 22:17:16.998: [    CSSD][3823675136]clssscevtypSHRCON: getting client with cmproc 0x7fa0d80845c0

2022-12-27 22:17:16.998: [    CSSD][3823675136]clssgmRegisterClient: proc(4/0x7fa0d80845c0), client(359/0x7fa0d8071230)

2022-12-27 22:17:16.998: [    CSSD][3823675136]clssgmExecuteClientRequest(): type(6) size(684) only connect and exit messages are allowed before lease acquisition proc(0x7fa0d80845c0) client(0x7fa0d8071230)

-- alerthisdb1.log reports the following error

[grid@hisdb1 hisdb1]$ tail -5000f alerthisdb1.log

The following error repeats every 15 seconds:

[cssd(7816)]CRS-1714:Unable to discover any voting files, retrying discovery in 15 seconds; Details at (:CSSNM00070:) in /u01/app/11.2.0/grid/log/hisdb1/cssd/ocssd.log
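
Because CRS-1714 repeats on every discovery retry, counting its occurrences in the alert log gives a quick view of how long CSSD has been stuck. A minimal sketch follows; count_vote_discovery_failures is a hypothetical helper name, and the log path in the example is built from the directory named in the message above, so treat it as an assumption for other installations.

```shell
# Print how many CRS-1714 (voting file discovery failure) messages a
# Clusterware alert log contains. $1 = path to the alert log.
count_vote_discovery_failures() {
    awk '/CRS-1714/ { n++ } END { print n + 0 }' "$1"
}

# Example (path assumed from the log directory shown in the message):
# count_vote_discovery_failures /u01/app/11.2.0/grid/log/hisdb1/alerthisdb1.log
```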

2.5 Restoring the vote disk

[grid@hisdb1 ~]$ kfed repair /dev/oracleasm/disks/DATA04

Note: once the repair succeeds, the cluster returns to normal.
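
My understanding is that kfed repair works here because ASM in this release keeps a backup copy of the disk header elsewhere on the disk, and the repair rewrites the header from that copy. The dd copy taken in 2.1 is a second way back that this walkthrough does not use. A minimal sketch of writing it back is shown here, assuming the copy was taken from the same disk before the damage; restore_header is a hypothetical helper name.

```shell
# Write a saved header copy back over the first 8 KB of a disk.
# $1 = backup file created with dd, $2 = target device.
# conv=notrunc leaves everything after the first block untouched when
# the target is a regular file; on a block device it is harmless.
restore_header() {
    dd if="$1" of="$2" bs=8192 count=1 conv=notrunc 2>/dev/null
}

# Example (run as the grid user, with the disk group not mounted):
# restore_header /home/grid/data04.dd /dev/oracleasm/disks/DATA04
```

Whichever path is used, reading the header again with kfed afterwards should show KFBTYP_DISKHEAD before the cluster is restarted.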

3. Data disk

Simulate corruption of the data disk and repair it.

3.1 Copying the data

[grid@hisdb1 ~]$ dd if=/dev/oracleasm/disks/DATA03 of=/home/grid/data03.dd bs=8192 count=1 

1+0 records in

1+0 records out

8192 bytes (8.2 kB) copied, 0.000373797 s, 21.9 MB/s

[grid@hisdb1 ~]$ kfed read /dev/oracleasm/disks/DATA03 | head

kfbh.endian:                          1 ; 0x000: 0x01

kfbh.hard:                          130 ; 0x001: 0x82

kfbh.type:                            1 ; 0x002: KFBTYP_DISKHEAD

kfbh.datfmt:                          1 ; 0x003: 0x01

kfbh.block.blk:                       0 ; 0x004: blk=0

kfbh.block.obj:              2147483648 ; 0x008: disk=0

kfbh.check:                  3875939376 ; 0x00c: 0xe7062430

kfbh.fcn.base:                        0 ; 0x010: 0x00000000

kfbh.fcn.wrap:                        0 ; 0x014: 0x00000000

kfbh.spare1:                          0 ; 0x018: 0x00000000

3.2 Corrupting the disk

[grid@hisdb1 ~]$ dd if=/dev/zero of=/dev/oracleasm/disks/DATA03 bs=8192 count=1

1+0 records in

1+0 records out

8192 bytes (8.2 kB) copied, 0.000199175 s, 41.1 MB/s

[grid@hisdb1 ~]$ kfed read /dev/oracleasm/disks/DATA03 | head

kfbh.endian:                          0 ; 0x000: 0x00

kfbh.hard:                            0 ; 0x001: 0x00

kfbh.type:                            0 ; 0x002: KFBTYP_INVALID

kfbh.datfmt:                          0 ; 0x003: 0x00

kfbh.block.blk:                       0 ; 0x004: blk=0

kfbh.block.obj:                       0 ; 0x008: file=0

kfbh.check:                           0 ; 0x00c: 0x00000000

kfbh.fcn.base:                        0 ; 0x010: 0x00000000

kfbh.fcn.wrap:                        0 ; 0x014: 0x00000000

kfbh.spare1:                          0 ; 0x018: 0x00000000

3.3 Reproducing the failure

[root@hisdb1 ~]# /u01/app/11.2.0/grid/bin/crsctl stop cluster -all

CRS-2673: Attempting to stop 'ora.crsd' on 'hisdb1'

CRS-2673: Attempting to stop 'ora.crsd' on 'hisdb2'

CRS-2790: Starting shutdown of Cluster Ready Services-managed resources on 'hisdb1'

CRS-2673: Attempting to stop 'ora.LISTENER.lsnr' on 'hisdb1'

CRS-2673: Attempting to stop 'ora.LISTENER_SCAN1.lsnr' on 'hisdb1'

CRS-2673: Attempting to stop 'ora.cvu' on 'hisdb1'

CRS-2673: Attempting to stop 'ora.oc4j' on 'hisdb1'

CRS-2673: Attempting to stop 'ora.OCRBK.dg' on 'hisdb1'

CRS-2673: Attempting to stop 'ora.FRA.dg' on 'hisdb1'

CRS-2790: Starting shutdown of Cluster Ready Services-managed resources on 'hisdb2'

CRS-2673: Attempting to stop 'ora.heal.db' on 'hisdb1'

CRS-2673: Attempting to stop 'ora.OCRBK.dg' on 'hisdb2'

CRS-2673: Attempting to stop 'ora.FRA.dg' on 'hisdb2'

CRS-2673: Attempting to stop 'ora.heal.db' on 'hisdb2'

CRS-2673: Attempting to stop 'ora.LISTENER.lsnr' on 'hisdb2'

CRS-2677: Stop of 'ora.LISTENER.lsnr' on 'hisdb2' succeeded

CRS-2673: Attempting to stop 'ora.hisdb2.vip' on 'hisdb2'

CRS-2677: Stop of 'ora.LISTENER_SCAN1.lsnr' on 'hisdb1' succeeded

CRS-2673: Attempting to stop 'ora.scan1.vip' on 'hisdb1'

CRS-2677: Stop of 'ora.cvu' on 'hisdb1' succeeded

CRS-2677: Stop of 'ora.LISTENER.lsnr' on 'hisdb1' succeeded

CRS-2673: Attempting to stop 'ora.hisdb1.vip' on 'hisdb1'

CRS-2677: Stop of 'ora.heal.db' on 'hisdb2' succeeded

CRS-2673: Attempting to stop 'ora.DATA.dg' on 'hisdb2'

CRS-2677: Stop of 'ora.heal.db' on 'hisdb1' succeeded

CRS-2673: Attempting to stop 'ora.DATA.dg' on 'hisdb1'

CRS-2677: Stop of 'ora.hisdb2.vip' on 'hisdb2' succeeded

CRS-2677: Stop of 'ora.scan1.vip' on 'hisdb1' succeeded

CRS-2677: Stop of 'ora.FRA.dg' on 'hisdb2' succeeded

CRS-2677: Stop of 'ora.FRA.dg' on 'hisdb1' succeeded

CRS-2677: Stop of 'ora.DATA.dg' on 'hisdb2' succeeded

CRS-2677: Stop of 'ora.DATA.dg' on 'hisdb1' succeeded

CRS-2677: Stop of 'ora.hisdb1.vip' on 'hisdb1' succeeded

CRS-2677: Stop of 'ora.oc4j' on 'hisdb1' succeeded

CRS-2677: Stop of 'ora.OCRBK.dg' on 'hisdb2' succeeded

CRS-2673: Attempting to stop 'ora.asm' on 'hisdb2'

CRS-2677: Stop of 'ora.OCRBK.dg' on 'hisdb1' succeeded

CRS-2673: Attempting to stop 'ora.asm' on 'hisdb1'

CRS-2677: Stop of 'ora.asm' on 'hisdb2' succeeded

CRS-2673: Attempting to stop 'ora.ons' on 'hisdb2'

CRS-2677: Stop of 'ora.asm' on 'hisdb1' succeeded

CRS-2677: Stop of 'ora.ons' on 'hisdb2' succeeded

CRS-2673: Attempting to stop 'ora.net1.network' on 'hisdb2'

CRS-2677: Stop of 'ora.net1.network' on 'hisdb2' succeeded

CRS-2792: Shutdown of Cluster Ready Services-managed resources on 'hisdb2' has completed

CRS-2673: Attempting to stop 'ora.ons' on 'hisdb1'

CRS-2677: Stop of 'ora.ons' on 'hisdb1' succeeded

CRS-2673: Attempting to stop 'ora.net1.network' on 'hisdb1'

CRS-2677: Stop of 'ora.net1.network' on 'hisdb1' succeeded

CRS-2792: Shutdown of Cluster Ready Services-managed resources on 'hisdb1' has completed

CRS-2677: Stop of 'ora.crsd' on 'hisdb2' succeeded

CRS-2673: Attempting to stop 'ora.ctssd' on 'hisdb2'

CRS-2673: Attempting to stop 'ora.evmd' on 'hisdb2'

CRS-2673: Attempting to stop 'ora.asm' on 'hisdb2'

CRS-2677: Stop of 'ora.crsd' on 'hisdb1' succeeded

CRS-2673: Attempting to stop 'ora.ctssd' on 'hisdb1'

CRS-2673: Attempting to stop 'ora.evmd' on 'hisdb1'

CRS-2673: Attempting to stop 'ora.asm' on 'hisdb1'

CRS-2677: Stop of 'ora.evmd' on 'hisdb2' succeeded

CRS-2677: Stop of 'ora.evmd' on 'hisdb1' succeeded

CRS-2677: Stop of 'ora.ctssd' on 'hisdb1' succeeded

CRS-2677: Stop of 'ora.ctssd' on 'hisdb2' succeeded

CRS-2677: Stop of 'ora.asm' on 'hisdb2' succeeded

CRS-2673: Attempting to stop 'ora.cluster_interconnect.haip' on 'hisdb2'

CRS-2677: Stop of 'ora.asm' on 'hisdb1' succeeded

CRS-2673: Attempting to stop 'ora.cluster_interconnect.haip' on 'hisdb1'

CRS-2677: Stop of 'ora.cluster_interconnect.haip' on 'hisdb1' succeeded

CRS-2673: Attempting to stop 'ora.cssd' on 'hisdb1'

CRS-2677: Stop of 'ora.cluster_interconnect.haip' on 'hisdb2' succeeded

CRS-2673: Attempting to stop 'ora.cssd' on 'hisdb2'

CRS-2677: Stop of 'ora.cssd' on 'hisdb1' succeeded

CRS-2677: Stop of 'ora.cssd' on 'hisdb2' succeeded

[root@hisdb1 ~]# /u01/app/11.2.0/grid/bin/crsctl start cluster -all 

CRS-2672: Attempting to start 'ora.cssdmonitor' on 'hisdb1'

CRS-2672: Attempting to start 'ora.cssdmonitor' on 'hisdb2'

CRS-2676: Start of 'ora.cssdmonitor' on 'hisdb1' succeeded

CRS-2672: Attempting to start 'ora.cssd' on 'hisdb1'

CRS-2676: Start of 'ora.cssdmonitor' on 'hisdb2' succeeded

CRS-2672: Attempting to start 'ora.diskmon' on 'hisdb1'

CRS-2672: Attempting to start 'ora.cssd' on 'hisdb2'

CRS-2672: Attempting to start 'ora.diskmon' on 'hisdb2'

CRS-2676: Start of 'ora.diskmon' on 'hisdb1' succeeded

CRS-2676: Start of 'ora.diskmon' on 'hisdb2' succeeded

CRS-2676: Start of 'ora.cssd' on 'hisdb1' succeeded

CRS-2672: Attempting to start 'ora.ctssd' on 'hisdb1'

CRS-2676: Start of 'ora.cssd' on 'hisdb2' succeeded

CRS-2672: Attempting to start 'ora.cluster_interconnect.haip' on 'hisdb1'

CRS-2672: Attempting to start 'ora.ctssd' on 'hisdb2'

CRS-2676: Start of 'ora.ctssd' on 'hisdb2' succeeded

CRS-2672: Attempting to start 'ora.evmd' on 'hisdb2'

CRS-2676: Start of 'ora.ctssd' on 'hisdb1' succeeded

CRS-2672: Attempting to start 'ora.cluster_interconnect.haip' on 'hisdb2'

CRS-2672: Attempting to start 'ora.evmd' on 'hisdb1'

CRS-2676: Start of 'ora.evmd' on 'hisdb2' succeeded

CRS-2676: Start of 'ora.evmd' on 'hisdb1' succeeded

CRS-2676: Start of 'ora.cluster_interconnect.haip' on 'hisdb1' succeeded

CRS-2672: Attempting to start 'ora.asm' on 'hisdb1'

CRS-2676: Start of 'ora.cluster_interconnect.haip' on 'hisdb2' succeeded

CRS-2672: Attempting to start 'ora.asm' on 'hisdb2'

CRS-2676: Start of 'ora.asm' on 'hisdb1' succeeded

CRS-2672: Attempting to start 'ora.crsd' on 'hisdb1'

CRS-2676: Start of 'ora.asm' on 'hisdb2' succeeded

CRS-2672: Attempting to start 'ora.crsd' on 'hisdb2'

CRS-2676: Start of 'ora.crsd' on 'hisdb1' succeeded

CRS-2676: Start of 'ora.crsd' on 'hisdb2' succeeded

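Note: the cluster starts successfully, but the database instances cannot be opened because all of their data files are in the DATA disk group.

3.4 Related errors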

[grid@hisdb1 ~]$ sqlplus / as sysasm

 

SQL*Plus: Release 11.2.0.4.0 Production on Tue Dec 27 22:46:21 2022

 

Copyright (c) 1982, 2013, Oracle.  All rights reserved.

 

 

Connected to:

Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production

With the Real Application Clusters and Automatic Storage Management options

 

SQL> col name for a20

SQL> col path for a40

SQL> set line 160

SQL> select name,total_mb,usable_file_mb,state from v$asm_diskgroup;

 

NAME                   TOTAL_MB USABLE_FILE_MB STATE

-------------------- ---------- -------------- -----------

FRA                       10239           6624 MOUNTED

OCRBK                     10239           9843 MOUNTED

SQL> alter diskgroup data mount;

alter diskgroup data mount

*

ERROR at line 1:

ORA-15032: not all alterations performed

ORA-15017: diskgroup "DATA" cannot be mounted

ORA-15063: ASM discovered an insufficient number of disks for diskgroup "DATA"

Note: the DATA disk group cannot be mounted.

[grid@hisdb1 hisdb1]$ tail -5000f alerthisdb1.log

2022-12-27 22:46:18.033:

[crsd(10020)]CRS-2807:Resource 'ora.DATA.dg' failed to start automatically.

2022-12-27 22:46:18.033:

[crsd(10020)]CRS-2807:Resource 'ora.DATA.dg' failed to start automatically.

2022-12-27 22:46:18.033:

[crsd(10020)]CRS-2807:Resource 'ora.heal.db' failed to start automatically.

2022-12-27 22:46:18.033:

[crsd(10020)]CRS-2807:Resource 'ora.heal.db' failed to start automatically.

Note: the cluster alert log entries are shown above.

SQL> select group_number,name,path,state,total_mb,free_mb from v$asm_disk;

 

GROUP_NUMBER NAME                 PATH            STATE      TOTAL_MB    FREE_MB

------------ -------------------- --------------- -------- ---------- ----------

           0                      ORCL:DATA01     NORMAL            0          0

           0                      ORCL:DATA03     NORMAL            0          0

           2 DATA02               ORCL:DATA02     NORMAL        10239       6624

           3 DATA04               ORCL:DATA04     NORMAL        10239       9843

[grid@hisdb2 hisdb2]$ crsctl stat res -t

--------------------------------------------------------------------------------

NAME           TARGET  STATE        SERVER                   STATE_DETAILS      

--------------------------------------------------------------------------------

Local Resources

--------------------------------------------------------------------------------

ora.DATA.dg

               ONLINE  OFFLINE      hisdb1                                      

               ONLINE  OFFLINE      hisdb2                                      

ora.FRA.dg

               ONLINE  ONLINE       hisdb1                                      

               ONLINE  ONLINE       hisdb2                                      

ora.LISTENER.lsnr

               ONLINE  ONLINE       hisdb1                                      

               ONLINE  ONLINE       hisdb2                                      

ora.OCRBK.dg

               ONLINE  ONLINE       hisdb1                                      

               ONLINE  ONLINE       hisdb2                                      

ora.asm

               ONLINE  ONLINE       hisdb1                   Started             

               ONLINE  ONLINE       hisdb2                   Started            

ora.gsd

               OFFLINE OFFLINE      hisdb1                                      

               OFFLINE OFFLINE      hisdb2                                       

ora.net1.network

               ONLINE  ONLINE       hisdb1                                      

               ONLINE  ONLINE       hisdb2                                      

ora.ons

               ONLINE  ONLINE       hisdb1                                       

               ONLINE  ONLINE       hisdb2                                      

--------------------------------------------------------------------------------

Cluster Resources

--------------------------------------------------------------------------------

ora.LISTENER_SCAN1.lsnr

      1        ONLINE  ONLINE       hisdb2                                      

ora.cvu

      1        ONLINE  ONLINE       hisdb1                                      

ora.heal.db

      1        ONLINE  OFFLINE                               Instance Shutdown  

      2        ONLINE  OFFLINE                               Instance Shutdown  

ora.hisdb1.vip

      1        ONLINE  ONLINE       hisdb1                                      

ora.hisdb2.vip

      1        ONLINE  ONLINE       hisdb2                                      

ora.oc4j

      1        ONLINE  ONLINE       hisdb1                                      

ora.orcl.db

      1        OFFLINE OFFLINE                               Instance Shutdown  

      2        OFFLINE OFFLINE                               Instance Shutdown  

ora.scan1.vip

      1        ONLINE  ONLINE       hisdb2       

Note: the cluster status shows the problem: ora.DATA.dg is offline on both nodes and the heal database cannot start.
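
In a long crsctl stat res -t listing, the rows that matter are those whose TARGET is ONLINE while STATE is OFFLINE. The filter below is a sketch under the assumption that the output keeps the column layout shown above; offline_resources is a hypothetical helper name.

```shell
# Read `crsctl stat res -t` output on stdin and print each resource whose
# TARGET is ONLINE but whose STATE is OFFLINE on at least one server.
offline_resources() {
    awk '
        /^ora\./ { res = $1; next }      # remember the current resource name
        {
            # Local rows start with TARGET STATE; cluster rows have an
            # instance number first, so scan for the adjacent pair.
            for (i = 1; i < NF; i++)
                if ($i == "ONLINE" && $(i + 1) == "OFFLINE" && !(res in seen)) {
                    seen[res] = 1
                    print res
                }
        }
    '
}

# Example:
# crsctl stat res -t | offline_resources
```

Resources whose target is already OFFLINE, such as ora.gsd and ora.orcl.db in the listing above, are deliberately skipped because they are not expected to be running.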

3.5 Restoring the data disk

[grid@hisdb1 ~]$ kfed repair /dev/oracleasm/disks/DATA03

[grid@hisdb1 ~]$ sqlplus / as sysasm

 

SQL*Plus: Release 11.2.0.4.0 Production on Tue Dec 27 22:54:47 2022

 

Copyright (c) 1982, 2013, Oracle.  All rights reserved.

 

 

Connected to:

Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production

With the Real Application Clusters and Automatic Storage Management options

 

SQL> alter diskgroup data mount;

 

Diskgroup altered.

SQL> select group_number,name,path,state,total_mb,free_mb from v$asm_disk;

 

GROUP_NUMBER NAME            PATH                      STATE      TOTAL_MB    FREE_MB

------------ --------------- ------------------------- -------- ---------- ----------

           0                 ORCL:DATA01               NORMAL            0          0

           2 DATA02          ORCL:DATA02               NORMAL        10239       6618

           1 DATA03          ORCL:DATA03               NORMAL        20479      13765

           3 DATA04          ORCL:DATA04               NORMAL        10239       9843

Note: after the data disk is repaired, the DATA disk group mounts again and the cluster returns to normal.

 

