
ASM Resource OFFLINE Failure: Disabling HAIP

Original · 哈萨雅琪 · 2022-06-30

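At 10:00 on June 30, the application team reported that connections to one node were failing with a "no listener" error.


I logged in to the database server and found that the database had already been down for more than ten days. After asking around, I learned that no application had been connecting to it any more; it was only being reconnected now because a data migration needed it.

At this point the database resource could not be started and the ora.asm resource was OFFLINE. Even after the ASM instance was started manually with startup, the resource still would not come online.


The smon process was running and the disks were mounted, yet the resource stayed OFFLINE, and a little later the db went down again.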
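The situation described above (processes present, resource still OFFLINE) can be checked from the command line. The following is only a minimal sketch: `offline_resources` is a hypothetical helper, the `crsctl` path is copied from the commands later in this post, and it assumes the non-tabular `NAME=`/`STATE=` output of `crsctl stat res -init`.

```shell
# Path taken from the commands in this post; adjust for your Grid home.
CRSCTL=${CRSCTL:-/oracle/app/11.2.0/grid/bin/crsctl}

# Hypothetical helper: read "crsctl stat res -init" output
# (NAME=/TYPE=/TARGET=/STATE= records) on stdin and print the name of
# every resource whose STATE is OFFLINE.
offline_resources() {
  awk -F= '
    $1 == "NAME"                     { name = $2 }
    $1 == "STATE" && $2 ~ /^OFFLINE/ { print name }
  '
}

if [ -x "$CRSCTL" ]; then
  "$CRSCTL" stat res -init | offline_resources
  # The ASM background processes can exist even while the resource is OFFLINE.
  ps -ef | grep '[a]sm_smon'
fi
```

Comparing the two outputs makes the mismatch visible: an `asm_smon` process in the process list while `ora.asm` is still listed as OFFLINE.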


Since this was only needed for a data migration and was fairly urgent, I went with a rather crude fix.

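1. Manually start the asm resource:


The haip resource had failed, and that was what prevented the ASM resource from starting. Below, I chose to disable haip outright.

2. Disable haip

# Stop crs on all nodes. I could no longer log in to the other node, and it was already down anyway, so I only operated on this node: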

[root@gzsj2 ~]# /oracle/app/11.2.0/grid/bin/crsctl stop crs -f

CRS-2791: Starting shutdown of Oracle High Availability Services-managed resources on 'gzsj2'
CRS-2673: Attempting to stop 'ora.mdnsd' on 'gzsj2'
CRS-2673: Attempting to stop 'ora.ctssd' on 'gzsj2'
CRS-2673: Attempting to stop 'ora.evmd' on 'gzsj2'
CRS-2673: Attempting to stop 'ora.drivers.acfs' on 'gzsj2'
CRS-2677: Stop of 'ora.evmd' on 'gzsj2' succeeded
CRS-2677: Stop of 'ora.ctssd' on 'gzsj2' succeeded
CRS-2673: Attempting to stop 'ora.cssd' on 'gzsj2'
CRS-2677: Stop of 'ora.drivers.acfs' on 'gzsj2' succeeded
CRS-2677: Stop of 'ora.mdnsd' on 'gzsj2' succeeded
CRS-2677: Stop of 'ora.cssd' on 'gzsj2' succeeded
CRS-2673: Attempting to stop 'ora.gipcd' on 'gzsj2'
CRS-2677: Stop of 'ora.gipcd' on 'gzsj2' succeeded
CRS-2673: Attempting to stop 'ora.gpnpd' on 'gzsj2'
CRS-2677: Stop of 'ora.gpnpd' on 'gzsj2' succeeded
CRS-2793: Shutdown of Oracle High Availability Services-managed resources on 'gzsj2' has completed
CRS-4133: Oracle High Availability Services has been stopped.

# Start the cluster stack in exclusive mode, without starting crs

[root@gzsj2 ~]# /oracle/app/11.2.0/grid/bin/crsctl start crs -excl -nocrs
CRS-4123: Oracle High Availability Services has been started.
CRS-2672: Attempting to start 'ora.mdnsd' on 'gzsj2'
CRS-2676: Start of 'ora.mdnsd' on 'gzsj2' succeeded
CRS-2672: Attempting to start 'ora.gpnpd' on 'gzsj2'
CRS-2676: Start of 'ora.gpnpd' on 'gzsj2' succeeded
CRS-2672: Attempting to start 'ora.cssdmonitor' on 'gzsj2'
CRS-2672: Attempting to start 'ora.gipcd' on 'gzsj2'
CRS-2676: Start of 'ora.cssdmonitor' on 'gzsj2' succeeded
CRS-2676: Start of 'ora.gipcd' on 'gzsj2' succeeded
CRS-2672: Attempting to start 'ora.cssd' on 'gzsj2'
CRS-2672: Attempting to start 'ora.diskmon' on 'gzsj2'
CRS-2676: Start of 'ora.diskmon' on 'gzsj2' succeeded
CRS-2676: Start of 'ora.cssd' on 'gzsj2' succeeded
CRS-2672: Attempting to start 'ora.drivers.acfs' on 'gzsj2'
CRS-2679: Attempting to clean 'ora.cluster_interconnect.haip' on 'gzsj2'
CRS-2672: Attempting to start 'ora.ctssd' on 'gzsj2'
CRS-2681: Clean of 'ora.cluster_interconnect.haip' on 'gzsj2' succeeded
CRS-2672: Attempting to start 'ora.cluster_interconnect.haip' on 'gzsj2'
CRS-2676: Start of 'ora.drivers.acfs' on 'gzsj2' succeeded
CRS-2676: Start of 'ora.ctssd' on 'gzsj2' succeeded
CRS-5017: The resource action "ora.cluster_interconnect.haip start" encountered the following error:
Start action for HAIP aborted. For details refer to "(:CLSN00107:)" in "/oracle/app/11.2.0/grid/log/gzsj2/agent/ohasd/orarootagent_root//orarootagent_root.log".
CRS-2674: Start of 'ora.cluster_interconnect.haip' on 'gzsj2' failed
CRS-2679: Attempting to clean 'ora.cluster_interconnect.haip' on 'gzsj2'
CRS-2681: Clean of 'ora.cluster_interconnect.haip' on 'gzsj2' succeeded
CRS-2673: Attempting to stop 'ora.ctssd' on 'gzsj2'
CRS-2677: Stop of 'ora.ctssd' on 'gzsj2' succeeded
CRS-2673: Attempting to stop 'ora.drivers.acfs' on 'gzsj2'
CRS-2677: Stop of 'ora.drivers.acfs' on 'gzsj2' succeeded
CRS-4000: Command Start failed, or completed with errors.
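The CRS-5017 message above points to the `(:CLSN00107:)` marker in the orarootagent log. One way to pull those entries out is sketched below; `last_matches` is a hypothetical helper, and the log path is the one printed in the error message (with the doubled slash collapsed).

```shell
# Log path as printed in the CRS-5017 message above.
LOG=${LOG:-/oracle/app/11.2.0/grid/log/gzsj2/agent/ohasd/orarootagent_root/orarootagent_root.log}

# Hypothetical helper: print the last N lines of a file that contain a
# fixed marker string, prefixed with their line numbers.
# Usage: last_matches MARKER FILE [N]
last_matches() {
  grep -n -F -- "$1" "$2" | tail -n "${3:-20}"
}

if [ -r "$LOG" ]; then
  last_matches '(:CLSN00107:)' "$LOG"
fi
```

The line numbers make it easier to open the log at the right place and read the surrounding entries for the actual cause of the HAIP start failure.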

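# If the asm resource is healthy, it has to be stopped first. In this incident asm could not start at all, so this step was skipped: crsctl stop res ora.asm -init

# Disable haip and modify the asm dependencies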

[root@gzsj2 ~]# /oracle/app/11.2.0/grid/bin/crsctl modify res ora.cluster_interconnect.haip -attr "ENABLED=0" -init
[root@gzsj2 ~]# /oracle/app/11.2.0/grid/bin/crsctl modify res ora.asm -attr "START_DEPENDENCIES='hard(ora.cssd,ora.ctssd)pullup(ora.cssd,ora.ctssd)weak(ora.drivers.acfs)',STOP_DEPENDENCIES='hard(intermediate:ora.cssd)'" -init
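The second `crsctl modify` above sets start dependencies for `ora.asm` that no longer mention `ora.cluster_interconnect.haip`. As a rough illustration of that relationship, the sketch below derives the HAIP-free string from a "before" value. The `before` string is an assumption (a value commonly seen on 11.2 installations, not one taken from this system), and `strip_haip` is a hypothetical helper, so compare against your own `crsctl stat res ora.asm -init -p` output before relying on it.

```shell
CRSCTL=${CRSCTL:-/oracle/app/11.2.0/grid/bin/crsctl}

# Hypothetical helper: remove the HAIP resource from a dependency list.
strip_haip() {
  sed -e 's/ora\.cluster_interconnect\.haip,//g' \
      -e 's/,ora\.cluster_interconnect\.haip//g'
}

# ASSUMPTION: a typical pre-change value; not taken from this system.
before='hard(ora.cssd,ora.cluster_interconnect.haip,ora.ctssd)pullup(ora.cssd,ora.cluster_interconnect.haip,ora.ctssd)weak(ora.drivers.acfs)'
printf '%s\n' "$before" | strip_haip

# On a real node, show the current values for comparison.
if [ -x "$CRSCTL" ]; then
  "$CRSCTL" stat res ora.asm -init -p | grep -E '^(START|STOP)_DEPENDENCIES='
fi
```

For the assumed input, the printed string is the same `START_DEPENDENCIES` value that is passed to `crsctl modify res ora.asm` above. Saving the original values before modifying them also makes it possible to restore the dependency later.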

#restart crs

[root@gzsj2 ~]# /oracle/app/11.2.0/grid/bin/crsctl stop crs -f
CRS-2791: Starting shutdown of Oracle High Availability Services-managed resources on 'gzsj2'
CRS-2673: Attempting to stop 'ora.mdnsd' on 'gzsj2'
CRS-2673: Attempting to stop 'ora.cssd' on 'gzsj2'
CRS-2677: Stop of 'ora.cssd' on 'gzsj2' succeeded
CRS-2673: Attempting to stop 'ora.gipcd' on 'gzsj2'
CRS-2677: Stop of 'ora.mdnsd' on 'gzsj2' succeeded
CRS-2677: Stop of 'ora.gipcd' on 'gzsj2' succeeded
CRS-2673: Attempting to stop 'ora.gpnpd' on 'gzsj2'
CRS-2677: Stop of 'ora.gpnpd' on 'gzsj2' succeeded
CRS-2793: Shutdown of Oracle High Availability Services-managed resources on 'gzsj2' has completed
CRS-4133: Oracle High Availability Services has been stopped.
[root@gzsj2 ~]# /oracle/app/11.2.0/grid/bin/crsctl start crs
CRS-4123: Oracle High Availability Services has been started.
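`crsctl start crs` returns once CRS-4123 is printed, while the rest of the stack keeps starting in the background, so it can be handy to poll until `ora.asm` is up. This is a minimal sketch under assumptions: `wait_until` and `asm_online` are hypothetical helpers, and the 30 attempts at 10-second intervals are arbitrary values.

```shell
CRSCTL=${CRSCTL:-/oracle/app/11.2.0/grid/bin/crsctl}

# Run a check command up to ATTEMPTS times, sleeping DELAY seconds after
# each failed try. Returns 0 as soon as the check succeeds, 1 otherwise.
# Usage: wait_until ATTEMPTS DELAY command [args...]
wait_until() {
  attempts=$1; delay=$2; shift 2
  i=0
  while [ "$i" -lt "$attempts" ]; do
    if "$@"; then return 0; fi
    i=$((i + 1))
    sleep "$delay"
  done
  return 1
}

# Succeeds once ora.asm reports a STATE beginning with ONLINE.
asm_online() {
  "$CRSCTL" stat res ora.asm -init 2>/dev/null | grep -q '^STATE=ONLINE'
}

if [ -x "$CRSCTL" ]; then
  if wait_until 30 10 asm_online; then
    echo "ora.asm is ONLINE"
  else
    echo "ora.asm did not come ONLINE in time" >&2
  fi
fi
```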


3. Check the resource status

The asm resource is now online, and the haip resource is offline.



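4. Handling the listener

After the steps above, the instances could not register dynamically with the listener.


Simply get rid of listener.ora (here it is renamed rather than deleted):

mv /oracle/app/11.2.0/grid/network/admin/listener.ora /oracle/app/11.2.0/grid/network/admin/listener.oraba

Alternatively, configuring static registration for the listener also works.

Restart the listener and manually register the asm and db instances.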
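The restart-and-register step could look roughly like the sketch below. It is not the exact sequence used in this incident: the use of `lsnrctl`, the SIDs `+ASM2` and `mydb2`, and the helpers `run`, `restart_listener` and `register_instance` are all assumptions for illustration. By default it only prints the commands; set `DRY_RUN=0` to execute them.

```shell
# Print commands instead of running them unless DRY_RUN=0 is set.
DRY_RUN=${DRY_RUN:-1}

run() {
  if [ "$DRY_RUN" = "1" ]; then echo "+ $*"; else "$@"; fi
}

# Restart the default listener as the listener owner.
restart_listener() {
  run lsnrctl stop
  run lsnrctl start
}

# Force an instance to register with the listener immediately.
# Usage: register_instance ORACLE_SID PRIVILEGE
# ORACLE_HOME must already point at the home that owns the instance.
register_instance() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ ORACLE_SID=$1 sqlplus -S \"/ as $2\" # alter system register;"
  else
    echo 'alter system register;' | ORACLE_SID=$1 sqlplus -S "/ as $2"
  fi
}

restart_listener
register_instance "${ASM_SID:-+ASM2}" sysasm   # ASM SID is an assumption
register_instance "${DB_SID:-mydb2}" sysdba    # hypothetical database SID
```

Running `lsnrctl status` afterwards should show whether the asm and db services have appeared.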


over.

PS: On a healthy rac cluster, if haip needs to be disabled, it must be disabled on every node; otherwise starting the db may fail with errors.
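To confirm that every node ended up with the same setting, the `ENABLED` attribute of the haip resource can be compared across nodes. The sketch below assumes root ssh access between nodes and a `NODES` list that you supply yourself; `enabled_value` is a hypothetical helper that reads the profile printed by `crsctl stat res ... -p`.

```shell
CRSCTL=${CRSCTL:-/oracle/app/11.2.0/grid/bin/crsctl}
# ASSUMPTION: set NODES yourself, e.g. NODES="nodeA nodeB" (placeholder names).
NODES=${NODES:-}

# Hypothetical helper: read a resource profile on stdin and print the
# value of its ENABLED attribute.
enabled_value() {
  awk -F= '$1 == "ENABLED" { print $2 }'
}

for node in $NODES; do
  value=$(ssh "root@$node" "$CRSCTL stat res ora.cluster_interconnect.haip -init -p" | enabled_value)
  echo "$node: HAIP ENABLED=${value:-unknown}"
done
```

Every node should report `ENABLED=0` before the databases are started.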

Last modified: 2022-07-01 10:10:20