Applies to
Oracle Database 19.3 and later
SUSE Linux Enterprise Server 12 SP4
EMC storage
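Problem overview
On node 2 of an Oracle 19c RAC cluster, the operating system was upgraded from SUSE Linux 12 SP4 to SP5 and then rebooted. After the reboot the node could no longer see its storage, the cssd process could not start in real-time mode, and CRS failed to start.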
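Root cause
1. An operating system security hardening product changed the CPU accounting settings of the operating system, which prevented the CSSD process of the 19c cluster from starting in real-time mode. See the related note "Oracle 12c RAC: CSSD process cannot start in real-time mode".
2. After the upgrade to SP5, node 2 was still running the EMCpower package built for SP4 (EMCpower.LINUX-7.0.0.00.00-064.suse12sp4.x86_64). The package was not compatible with the upgraded operating system, so the EMC storage devices were not recognized and the voting disks could not be discovered.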
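As a rough way to check the real-time symptom, the scheduling class of the CSS daemon can be inspected with `ps`. The sketch below is not taken from the original analysis. It assumes the daemon's command name starts with `ocssd` and that a real-time process shows the class `RR` or `FF` in the `cls` column. The function name `non_realtime_cssd` is hypothetical.

```shell
# Read the output of `ps -eo pid,cls,rtprio,comm` on standard input and print
# every ocssd process whose scheduling class is not real-time (RR or FF).
# Assumption: the CSS daemon's command name starts with "ocssd".
non_realtime_cssd() {
    awk '$4 ~ /^ocssd/ && $2 != "RR" && $2 != "FF" { print $1, $2, $4 }'
}

# Usage on a cluster node (prints nothing when ocssd runs in real-time mode):
#   ps -eo pid,cls,rtprio,comm | non_realtime_cssd
```

Any line printed by the function would point to a CSS daemon running in a time-sharing class instead of a real-time one.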
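Solution
1. Stop the service of the operating system security hardening software.
2. Install the EMC storage management package that matches SUSE 12 SP5: DellEMCPower.LINUX-7.1.0.00.00-075.SLES12SP5.x86_64.rpm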
Analysis
1. Alert log while CRS was starting on node 2
2025-03-27 20:17:50.802 [OCSSD(14861)]CRS-1714: Unable to discover
any voting files, retrying discovery in 15 seconds; Details at
(:CSSNM00070:)in /u01/app/grid/diag/crs/host01/crs/trace/ocssd.trc
The alert log shows that CRS on node 2 could not find any voting disk at startup. It retried after 15 seconds and still could not find one.
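To see how often the discovery retry repeats, the CRS-1714 messages can be pulled out of the alert log. This is a minimal sketch. The function name `crs1714_times` is hypothetical, and the alert log location in the usage comment is an assumption based on the trace directory shown in the message above.

```shell
# Print the date and time of every CRS-1714 message (voting file discovery
# failure) found in an alert log read from standard input.
crs1714_times() {
    # In the alert log format shown above, the first two fields are the
    # date and the time of the message.
    awk '/CRS-1714/ { print $1, $2 }'
}

# Usage (the alert.log path is an assumption, adjust it to the environment):
#   crs1714_times < /u01/app/grid/diag/crs/host01/crs/trace/alert.log
```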
2. ocssd log on node 2
2025-03-27 20:17:50.801 : CSSD:3324163840: clssnmReadDiscoveryProfile: voting file discovery string(/dev/emcpower)
2025-03-27 20:17:50.801 : CSSD:3324163840: clssnmvDDiscThread: using discovery string /dev/oracleasm/disks for initial discovery
2025-03-27 20:17:50.801 : SKGFD:3324163840: Discovery with str:/dev/emcpower:
2025-03-27 20:17:50.801 : SKGFD:3324163840: UFS discovery with :/dev/emcpower:
2025-03-27 20:17:50.801 : SKGFD:3324163840: Execute glob on the string /dev/emcpower
2025-03-27 20:17:50.801 : SKGFD:3324163840: OSS discovery with :/dev/emcpower
2025-03-27 20:17:50.802 : SKGFD:3324163840: Discovery skipping bad :ASM::
2025-03-27 20:17:50.802 : CSSD:3324163840: clssnmvDiskVerify: Successful discovery of 0 disks
2025-03-27 20:17:50.802 : CSSD:3324163840: clssnmCompleteInitVFDiscovery: Completing initial voting file discovery
2025-03-27 20:17:50.802 : CSSD:3324163840: clssnmvFindInitialConfigs: No voting files found
2025-03-27 20:17:50.802 : CSSD:3324163840: (:CSSNM00070:)clssnmCompleteInitVFDiscovery: Voting file not found. Retrying discovery in 15 seconds
2025-03-27 20:17:51.687 : CSSD:3558553344: clsssc_CLSFAInit_CB: System not ready for CLSFA initialization
2025-03-27 20:17:52.687 : CSSD:3558553344: clsssc_CLSFAInit_CB: System not ready for CLSFA initialization
2025-03-27 20:17:53.687 : CSSD:3558553344: clsssc_CLSFAInit_CB: System not ready for CLSFA initialization
2025-03-27 20:17:54.687 : CSSD:3558553344: clsssc_CLSFAInit_CB: System not ready for CLSFA initialization
2025-03-27 20:17:55.688 : CSSD:3558553344: clsssc_CLSFAInit_CB: System not ready for CLSFA initialization
2025-03-27 20:17:56.688 : CSSD:3558553344: clsssc_CLSFAInit_CB: System not ready for CLSFA initialization
2025-03-27 20:17:57.688 : CSSD:3558553344: clsssc_CLSFAInit_CB: System not ready for CLSFA initialization
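The cssd log shows that no storage was found under /dev/emcpower, that clsssc_CLSFAInit_CB repeatedly reported the system as not ready for initialization, and that the cssd process therefore could not start.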
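The same conclusion can be cross-checked from the shell: read the discovery string out of the trace, then count how many paths that pattern matches on the host. The sketch below assumes the trace line format shown above. The function names `discovery_strings` and `count_matches` are hypothetical.

```shell
# Print the distinct voting file discovery strings recorded in an ocssd trace
# read from standard input.
discovery_strings() {
    sed -n 's/.*voting file discovery string(\(.*\))$/\1/p' | sort -u
}

# Print how many existing paths a discovery pattern matches on this host.
# The body runs in a subshell so its variables do not leak to the caller.
count_matches() (
    n=0
    # $1 is left unquoted on purpose so that the shell expands the glob.
    for path in $1; do
        if [ -e "$path" ]; then
            n=$((n + 1))
        fi
    done
    echo "$n"
)

# Usage:
#   discovery_strings < /u01/app/grid/diag/crs/host01/crs/trace/ocssd.trc
#   count_matches '/dev/emcpower*'
```

A count of 0 for the pattern that the trace reports would be consistent with "Successful discovery of 0 disks" in the log.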
3. Cluster status check
$crsctl stat res -t -init
--------------------------------------------------------------------------------
NAME TARGET STATE SERVER STATE_DETAILS
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.asm
2 ONLINE OFFLINE
ora.cluster_interconnect.haip
2 ONLINE OFFLINE
ora.crf
2 ONLINE ONLINE fadb02
ora.crsd
2 ONLINE OFFLINE
ora.cssd
2 ONLINE OFFLINE STARTING
ora.cssdmonitor
2 ONLINE ONLINE fadb02
ora.ctssd
2 ONLINE OFFLINE
ora.diskmon
2 OFFLINE OFFLINE
ora.evmd
2 ONLINE OFFLINE
ora.gipcd
2 ONLINE ONLINE fadb02
ora.gpnpd
2 ONLINE ONLINE fadb02
ora.mdnsd
2 ONLINE ONLINE fadb02
The cssd resource is stuck in the STARTING state: because the voting disks cannot be found, the cssd process cannot finish starting.
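When the listing is long, the resources that have not reached their target state can be filtered out automatically. This is a minimal sketch that assumes the two-line layout shown above (a resource name line followed by an instance line with TARGET and STATE). The function name `resources_not_at_target` is hypothetical.

```shell
# Read `crsctl stat res -t -init` output on standard input and print every
# resource whose STATE differs from its TARGET.
resources_not_at_target() {
    awk '
        # A line starting with "ora." names the resource for the lines below.
        /^ora\./ { name = $1; next }
        # Instance line: <id> <TARGET> <STATE> [server] [details]
        $1 ~ /^[0-9]+$/ && NF >= 3 && $2 != $3 {
            print name, "target=" $2, "state=" $3
        }
    '
}

# Usage:
#   crsctl stat res -t -init | resources_not_at_target
```

For the listing above this would report ora.asm, ora.cluster_interconnect.haip, ora.crsd, ora.cssd, ora.ctssd and ora.evmd, while ora.diskmon is skipped because its target is OFFLINE as well.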
4. Storage check
Node 2:
$ls -l /dev/emcpower*
The command returns nothing: node 2 cannot see the storage.
Node 1:
$ls -l /dev/emcpower*
/dev/emcpowera
/dev/emcpowerb
/dev/emcpowerc
/dev/emcpowere
The storage devices on node 1 are present.
Node 2, EMCpower status and installed package:
$powermt display dev=all
initialization error
$rpm -qa | grep EMCpower
EMCpower.LINUX-7.0.0.00.00-064.suse12sp4.x86_64
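After the operating system on node 2 was upgraded to SUSE 12 SP5, the installed EMCpower package was still the SP4 build. The preliminary conclusion was that EMCpower was incompatible with the upgraded operating system, which is why the EMC storage could not be recognized.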
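This kind of mismatch can be detected before or right after an OS upgrade by comparing the service pack tag in the package name with the service pack of the running system. The sketch below relies on two assumptions: the package name carries a tag such as `suse12sp4` or `SLES12SP5`, as in the names seen in this case, and `/etc/os-release` reports SP5 as `VERSION_ID="12.5"`. The function names `pkg_sp` and `os_sp` are hypothetical.

```shell
# Extract the service pack tag (for example "sp4") from an EMCpower package
# name such as EMCpower.LINUX-7.0.0.00.00-064.suse12sp4.x86_64.
pkg_sp() {
    printf '%s\n' "$1" | tr '[:upper:]' '[:lower:]' |
        sed -n 's/.*12\(sp[0-9][0-9]*\).*/\1/p'
}

# Derive the service pack tag from os-release content read on standard input.
# Assumption: SLES 12 SP5 reports VERSION_ID="12.5".
os_sp() {
    sed -n 's/^VERSION_ID="\{0,1\}[0-9][0-9]*\.\([0-9][0-9]*\)"\{0,1\}$/sp\1/p'
}

# Usage:
#   pkg=$(rpm -qa | grep EMCpower)
#   [ "$(pkg_sp "$pkg")" = "$(os_sp < /etc/os-release)" ] || echo "EMCpower/OS mismatch"
```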
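5. Install the matching EMC package
Download and install DellEMCPower.LINUX-7.1.0.00.00-075.SLES12SP5.x86_64.rpm. After the installation node 2 recognized the storage again, and CRS started normally when it was restarted.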
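Summary
Any change to the production database infrastructure must be tested thoroughly before it is implemented. When planning operations such as OS or database upgrades, take into account the compatibility of the other software installed on the server.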