Applies to:
Oracle Database - Enterprise Edition - Version 12.1.0.2 and later
Information in this document applies to any platform.
Symptoms
CRSD fails to start during CRS startup.
crsctl check crs
CRS-4638: Oracle High Availability Services is online
CRS-4535: Cannot communicate with Cluster Ready Services====>
CRS-4529: Cluster Synchronization Services is online
CRS-4533: Event Manager is online
crsctl stat res -t -init
--------------------------------------------------------------------------------
Name Target State Server State details
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.asm
1 ONLINE ONLINE node1 Started,STABLE
ora.cluster_interconnect.haip
1 ONLINE ONLINE node1 STABLE
ora.crf
1 ONLINE ONLINE node1 STABLE
ora.crsd
1 ONLINE OFFLINE STABLE===========================>crsd offline
ora.cssd
1 ONLINE ONLINE node1 STABLE
ora.cssdmonitor
From crsd.trc:
=========
2018-09-07 20:42:57.225607 : OCRMAS:2390718208: th_calc_av: Configured Active Patch Level [0]
2018-09-07 20:42:57.225617 : OCRMAS:2390718208: th_calc_av:5'': Return persisted APL [0]
OCRMAS:2390718208: th_calc_av:5': Return persisted AV [202375680] [12.1.0.2.0]
2018-09-07 20:42:57.228610 : OCRMAS:2390718208: th_master_prereg: Persistent upgrade state retrieved from OCR is [0].
2018-09-07 20:42:57.232154 : OCRMAS:2390718208: th_master_prereg: Persistent upgrade toversion buffer retrieved from OCR is [12.1.0.2.0]. Setting toversion to [202375680].
2018-09-07 20:42:58.124658 : CSSCLNT:2390718208: clssgsGroupJoin: member in use group(1/ocrlocal)
2018-09-07 20:42:58.124811 : default:2390718208: procr_reg_localgrp: Error [14] from clssgsreglocalgrp(). Return [23].
2018-09-07 20:42:58.124829 : default:2390718208: SLOS : [clsuSlosFormatDiag called with non-error slos.]
2018-09-07 20:42:58.124835 : OCRMAS:2390718208: th_master_register: Failed to register in OCRLOCAL group. Retval:[23]
2018-09-07 20:42:58.124848 : OCRAPI:2390718208: procr_ctx_set_invalid: ctx is in state [6].
2018-09-07 20:42:58.124851 : OCRAPI:2390718208: procr_ctx_set_invalid: ctx set to invalid ======================================>invalid state
2018-09-07 20:42:58.125034 : OCRAPI:2390718208: procr_ctx_set_invalid: Aborting...=============>
2018-09-07 20:43:09.296567 :GIPCHTHR:331339520: gipchaWorkerWork: workerThread heart beat, time interval since last heartBeat 30990loopCount 32
2018-09-07 20:43:27.315333 :GIPCHTHR:329238272: gipchaDaemonWork: DaemonThread heart beat, time interval since last heartBeat 31000loopCount 41
2018-09-07 20:43:39.329731 :GIPCHTHR:331339520: gipchaWorkerWork: workerThread heart beat, time interval since last heartBeat 30030loopCount 30
2018-09-07 20:43:55.277525 :UiServer:3204413184: {1:46253:2} Sending to PE. ctx= 0x7f0480266040, ClientPID=38508, tint: {1:46251:2}
2018-09-07 20:43:55.322378 :UiServer:3204413184: {1:46253:2} Done for ctx=0x7f0480266040
2018-09-07 20:43:57.373864 :GIPCHTHR:329238272: gipchaDaemonWork: DaemonThread heart beat, time interval since last heartBeat 30050loopCount 38
2018-09-07 20:44:10.335373 :GIPCHTHR:331339520: gipchaWorkerWork: workerThread heart beat, time interval since last heartBeat 31000loopCount 33
2018-09-07 20:44:28.355553 :GIPCHTHR:329238272: gipchaDaemonWork: DaemonThread heart beat, time interval since last heartBeat 30980loopCount 37
2018-09-07 20:44:40.367497 :GIPCHTHR:331339520: gipchaWorkerWork: workerThread heart beat, time interval since last heartBeat 30030loopCount 30
2018-09-07 20:44:55.252774 :UiServer:3204413184: {1:46253:2} Sending to PE. ctx= 0x7f0480265f70, ClientPID=38508, tint: {1:46251:2}
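The two trace lines that identify this issue are the "member in use group(1/ocrlocal)" join failure and the "Failed to register in OCRLOCAL group" message; the heartbeat lines around them are normal noise. A quick way to confirm the signature is to grep the trace for both strings. The sketch below demonstrates this on sample lines copied from the trace above; on a live node you would point grep at the actual crsd.trc under the Grid Infrastructure trace directory (path varies by installation).

```shell
# Sketch: confirm the OCRLOCAL group-join failure signature in a crsd trace.
# Demonstrated here on sample lines; replace "$trc" with your real crsd.trc path.
trc=$(mktemp)
cat > "$trc" <<'EOF'
2018-09-07 20:42:58.124658 : CSSCLNT:2390718208: clssgsGroupJoin: member in use group(1/ocrlocal)
2018-09-07 20:42:58.124835 : OCRMAS:2390718208: th_master_register: Failed to register in OCRLOCAL group. Retval:[23]
EOF

# Count lines matching either part of the signature.
hits=$(grep -cE 'member in use group|Failed to register in OCRLOCAL' "$trc")
echo "signature matches: $hits"
rm -f "$trc"
```

If both strings appear shortly before crsd aborts, check for a stale crsd.bin process as described in the Cause section.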
Changes
Cause
A previous crsd.bin process still persists on the server, as shown below. This stale process holds the OCRLOCAL group membership, preventing the new crsd from registering when CRS startup brings it up.
Node1:
Node1 <oracle:+ASM1>:/u01/app/grid/12.1.0.2/infra=> ps -ef |grep crsd
root 14326 1 6 23:06 ? 00:00:00 /u01/app/grid/12.1.0.2/infra/bin/crsd.bin reboot==========>
oracle 14931 59862 0 23:06 pts/1 00:00:00 grep crsd
root 32592 1 0 Aug05 ? 06:36:26 /u01/app/grid/12.1.0.2/infra/bin/crsd.bin reboot=============>stale crsd process
Solution
Kill the stale crsd process (as root, since crsd.bin runs as root) and then restart the crsd daemon.
Node1 <oracle:+ASM1>:/u01/app/grid/12.1.0.2/infra=> kill -9 32592
crsctl stop res ora.crsd -init -f
crsctl start res ora.crsd -init -f
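The stale process can be told apart from the freshly spawned one by its elapsed running time: in the ps output above, PID 32592 has been running since Aug05 while PID 14326 started seconds ago. The sketch below illustrates that selection logic using sample `ps -eo pid,etimes,comm` style output (pid, elapsed seconds, command); the etimes values are illustrative, not from the note. It only prints the candidate PID — verify it manually before running kill -9 and the crsctl restart as root.

```shell
# Sketch: pick the stale crsd.bin as the one with the largest elapsed time.
# Sample output in "pid etimes comm" form; on a live node use:
#   ps -eo pid,etimes,comm | grep crsd.bin
# (etimes values below are illustrative assumptions)
sample="14326 42 crsd.bin
32592 2900000 crsd.bin"

# Sort by elapsed seconds descending; the oldest process is the stale one.
stale_pid=$(printf '%s\n' "$sample" | sort -k2 -n -r | awk 'NR==1 {print $1}')
echo "stale crsd.bin pid: $stale_pid"

# After confirming the PID, as root:
#   kill -9 "$stale_pid"
#   crsctl stop res ora.crsd -init -f
#   crsctl start res ora.crsd -init -f
```

Always double-check the PID against the ps output before killing, since killing the newly started crsd.bin instead would simply repeat the failure.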