RAC on Windows: How to Replace Voting Disks "In Place" on the same ASM Diskgroup

Original post by eygle, 2020-08-27

Applies to:
Oracle Database - Enterprise Edition - Version 12.1.0.2 and later
Microsoft Windows x64 (64-bit)

Symptoms

After a Windows patch was applied, the servers were rebooted without Oracle Clusterware first being shut down cleanly. CRS now fails to start.

Specifically, ora.ocssd fails to start on both nodes, with the following errors reported:

$TRACE\ocssd.trc
CRS-8503 [] [] [] [] [] [] [] [] [] [] [] []
Incident details in: $INCIDENT\incdir_1\ocssd_i1.trc
2019-02-28 08:07:21.431 [OCSSD(8072)]CRS-1656: The CSS daemon is terminating due to a fatal error; Details at (:CSSSC00012:) in $TRACE\ocssd.trc
2019-02-28 08:07:21.432 [OCSSD(8072)]CRS-1603: CSSD on node <nodename1> shutdown by user.
2019-02-28 08:07:26.441 [OCSSD(8072)]CRS-8503: Oracle Clusterware OCSSD process with operating system process ID 6024 experienced fatal signal or exception code -1073741819


$INCIDENT\incdir_1\ocssd_i1.trc

2019-02-27 22:45:07.649225 :SKGFD:7612: running stat on disk:\\.\ORCLDISKDATA0
2019-02-27 22:45:07.665603 :SKGFD:7612: Warning :  skgfr_vpd84h (scsi error for vpd84h 0x5) sense data:

2019-02-27 22:45:07.665609 :SKGFD:7612:
 SCSI sense  len(0x20)

...

2019-02-27 22:45:07.842828 :CLSF:7612: checksum failed for disk:\\.\ORCLDISKCRSVOTE0:
2019-02-27 22:45:07.842830 :CLSF:7612: Error: obj 2147483648 blk 0 name 'hard_kfbh' flags 0x65 first 1

The voting disk files are visible on the server nodes, and their headers even appear readable (as demonstrated using 'kfed read'). However, because OCSSD itself will not come up and the incident trace reports a checksum failure on the voting disk, the voting disks are evidently damaged or corrupted in some way.
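
The checksum failure is easy to miss among the SKGFD warnings in a long incident trace. As a minimal sketch, assuming the CLSF line format matches the excerpt above, the following Python helper lists the disks that a trace reports with a failed checksum. The function name and the regular expression are illustrative and not part of any Oracle tooling.

```python
import re

# Pattern derived from the CLSF line in the trace excerpt; other Oracle
# versions may format this line differently (assumption).
CHECKSUM_RE = re.compile(r":CLSF:\d+: checksum failed for disk:(\S+?):?\s*$")


def disks_with_checksum_failures(trace_lines):
    """Return the distinct disk paths named in 'checksum failed' lines, in order."""
    seen = []
    for line in trace_lines:
        match = CHECKSUM_RE.search(line)
        if match and match.group(1) not in seen:
            seen.append(match.group(1))
    return seen


if __name__ == "__main__":
    # Two lines copied from the incident trace shown earlier.
    sample = [
        r"2019-02-27 22:45:07.649225 :SKGFD:7612: running stat on disk:\\.\ORCLDISKDATA0",
        r"2019-02-27 22:45:07.842828 :CLSF:7612: checksum failed for disk:\\.\ORCLDISKCRSVOTE0:",
    ]
    for disk in disks_with_checksum_failures(sample):
        print(disk)
```

In practice the lines would come from reading the incident trace file rather than from an inline list.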

Changes

Cause

Voting disk corruption, most likely caused by applying OS patches without first cleanly shutting down Oracle Clusterware.

Because the voting disks were corrupted, they had to be recreated.

Oracle Clusterware must be explicitly stopped (crsctl stop crs) prior to any OS patch application.

Solution

1.  Ensure Oracle Clusterware is completely stopped on all nodes (crsctl stop crs -f), and that the OracleOHService service is not running

2.  OFFLINE the shared disks that make up the relevant diskgroup (via diskmgmt.msc)

3.  Start CRS in exclusive mode: crsctl start crs -excl -nocrs

4.  At that point ora.cssd can start

5.  ONLINE the shared disks that make up the relevant diskgroup (via diskmgmt.msc)

6.  Connect to the ASM instance (via SQL*Plus) and mount the relevant diskgroup (CRSVOTE in this case):
select name, state from v$asm_diskgroup;
alter diskgroup <relevant diskgroup> mount;

7.  Start the ora.crsd resource: crsctl start res ora.crsd -init

8.  crsctl replace votedisk +CRSVOTE

9.  crsctl stop crs -f

10.  crsctl start crs

11. crsctl start crs (on all additional nodes)
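
The order of these steps matters, since the disks are taken offline before the exclusive-mode start and brought online only after ora.cssd is up. As a minimal sketch, the sequence can be captured as an ordered checklist in Python that only prints each step and never executes anything. The names `Step`, `RUNBOOK`, `is_sequential` and `render` are hypothetical. The commands are copied from the steps above.

```python
from typing import NamedTuple


class Step(NamedTuple):
    number: int
    tool: str    # how the step is carried out
    action: str  # command or manual action, copied from the steps above


RUNBOOK = [
    Step(1, "crsctl", "crsctl stop crs -f  (all nodes; OracleOHService must not be running)"),
    Step(2, "diskmgmt.msc", "OFFLINE the shared disks of the relevant diskgroup"),
    Step(3, "crsctl", "crsctl start crs -excl -nocrs"),
    Step(4, "check", "ora.cssd can now start"),
    Step(5, "diskmgmt.msc", "ONLINE the shared disks of the relevant diskgroup"),
    Step(6, "sqlplus", "alter diskgroup <relevant diskgroup> mount;"),
    Step(7, "crsctl", "crsctl start res ora.crsd -init"),
    Step(8, "crsctl", "crsctl replace votedisk +CRSVOTE"),
    Step(9, "crsctl", "crsctl stop crs -f"),
    Step(10, "crsctl", "crsctl start crs"),
    Step(11, "crsctl", "crsctl start crs  (all additional nodes)"),
]


def is_sequential(runbook):
    """True if the step numbers run 1, 2, 3, ... with no gaps or reordering."""
    return [s.number for s in runbook] == list(range(1, len(runbook) + 1))


def render(runbook):
    """Format each step as one checklist line; nothing is executed."""
    return [f"{s.number:>2}. [{s.tool}] {s.action}" for s in runbook]


if __name__ == "__main__":
    assert is_sequential(RUNBOOK)
    print("\n".join(render(RUNBOOK)))
```

Keeping the checklist print-only is deliberate: steps 2 and 5 are manual actions in Disk Management, so the procedure cannot be safely automated end to end from a script like this.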