Symptoms:
The front-end team reported that the database could not be reached:
[ERROR]-[Thread: Druid-ConnectionPool-Create-26728049]-[com.alibaba.druid.pool.DruidDataSource$CreateConnectionThread.run()]: create connection error, url: jdbc:oracle:thin:@x.x.x.93:1521:empdb011, errorCode 17002, state 08006
java.sql.SQLRecoverableException: IO 错误: The Network Adapter could not establish the connection
at oracle.jdbc.driver.T4CConnection.logon(T4CConnection.java:774)
at oracle.jdbc.driver.PhysicalConnection.connect(PhysicalConnection.java:688)
at oracle.jdbc.driver.T4CDriverExtension.getConnection(T4CDriverExtension.java:39)
at oracle.jdbc.driver.OracleDriver.connect(OracleDriver.java:691)
at com.alibaba.druid.filter.FilterChainImpl.connection_connect(FilterChainImpl.java:148)
at com.alibaba.druid.filter.stat.StatFilter.connection_connect(StatFilter.java:220)
at com.alibaba.druid.filter.FilterChainImpl.connection_connect(FilterChainImpl.java:142)
at com.alibaba.druid.filter.FilterAdapter.connection_connect(FilterAdapter.java:785)
at com.alibaba.druid.filter.FilterChainImpl.connection_connect(FilterChainImpl.java:142)
at com.alibaba.druid.pool.DruidAbstractDataSource.createPhysicalConnection(DruidAbstractDataSource.java:1463)
at com.alibaba.druid.pool.DruidAbstractDataSource.createPhysicalConnection(DruidAbstractDataSource.java:1525)
at com.alibaba.druid.pool.DruidDataSource$CreateConnectionThread.run(DruidDataSource.java:2100)
Caused by: oracle.net.ns.NetException: The Network Adapter could not establish the connection
at oracle.net.nt.ConnStrategy.execute(ConnStrategy.java:523)
at oracle.net.resolver.AddrResolution.resolveAndExecute(AddrResolution.java:521)
at oracle.net.ns.NSProtocol.establishConnection(NSProtocol.java:660)
at oracle.net.ns.NSProtocol.connect(NSProtocol.java:286)
at oracle.jdbc.driver.T4CConnection.connect(T4CConnection.java:1438)
at oracle.jdbc.driver.T4CConnection.logon(T4CConnection.java:518)
... 11 more
Caused by: java.io.IOException: Connection timed out: connect, socket connect lapse 20998 ms. /x.x.x.93 1521 0 1 true
at ora
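Analysis and resolution:
Logged in to the all-in-one database machine as an emergency.
1. Checked the host logs and whether the hosts had rebooted. All machines had rebooted, possibly because of a power outage.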

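2. Checked the cluster status.
Logged in to both database nodes. ps -ef|grep d.bin|wc -l returned only 11 processes, so the CRS services were not running normally.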
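
The count from the command above includes the grep process itself, and the unescaped dot in d.bin matches any character. A slightly stricter variant is sketched below. The function name count_dbin is made up for illustration, and the crsctl commands in the comments are the standard Grid Infrastructure checks, not necessarily what was run during this incident.

```shell
# Count Grid Infrastructure daemons (ohasd.bin, ocssd.bin, crsd.bin, ...)
# from `ps -ef`-style output supplied on stdin.
count_dbin() {
    # '[d]\.bin' matches the literal text "d.bin" but not this grep's own
    # command line; `|| true` keeps the exit status 0 when the count is 0.
    grep -c '[d]\.bin' || true
}

# Typical use on a database node:
#   ps -ef | count_dbin
# Follow-up checks (run as root or the grid owner):
#   crsctl check crs
#   crsctl stat res -t -init
```

A healthy node normally shows noticeably more daemons than a node where only the lower stack has started, so a low count is a hint to look at the CRS logs next.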

3. Checked the CRS logs, which showed that the heartbeat (private interconnect) network was not working:
/u01/app/grid/diag/crs/rac1/crs/trace/alert.log

/u01/app/grid/diag/crs/rac1/crs/trace/ocssd.trc
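
One way to pull the heartbeat-related lines out of these two files is sketched below. The patterns are assumptions based on the usual Clusterware messages (the CRS-1610/1611/1612 network communication warnings in alert.log and "no network HB" in ocssd.trc); the original post does not show the exact log lines, and the function name heartbeat_lines is hypothetical.

```shell
# Print only lines that point at a missing network heartbeat.
# Reads the files given as arguments, or stdin when none are given.
heartbeat_lines() {
    grep -E 'CRS-161[0-2]|no network HB' "$@"
}

# Typical use:
#   heartbeat_lines /u01/app/grid/diag/crs/rac1/crs/trace/alert.log | tail -20
#   heartbeat_lines /u01/app/grid/diag/crs/rac1/crs/trace/ocssd.trc | tail -20
```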

4. Checked the heartbeat network:
Each node could ping its own private IP, but pinging the heartbeat IPs of the other machines failed:
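
The same reachability check can be scripted for all peers at once. This is a minimal sketch: the function name check_peers, the PING override variable, and the example addresses are assumptions for illustration, and the real private IPs have to be taken from the cluster configuration (for example from oifcfg getif or /etc/hosts).

```shell
# Ping each peer heartbeat IP twice with a 2-second timeout (Linux ping options).
# PING can be overridden to substitute another command; it defaults to ping.
check_peers() {
    failed=0
    for ip in "$@"; do
        if ${PING:-ping} -c 2 -W 2 "$ip" >/dev/null 2>&1; then
            echo "$ip reachable"
        else
            echo "$ip UNREACHABLE"
            failed=1
        fi
    done
    # Non-zero exit status if any peer did not answer
    return "$failed"
}

# Typical use (hypothetical addresses):
#   check_peers 192.168.10.2 192.168.10.3
```

When the local private IP answers but every peer is unreachable, as in this case, the fault is more likely in the switching layer than on a single host.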

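Logged in to both IB switches to check their status (ssh <switch IP>); the following prompt appeared:

After the switch booted successfully, logged in to it and checked again:


At this point the heartbeat network was back to normal.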
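5. Checked the disk status on the 3 storage server nodes; the griddisks could not be listed.

Started the cell services on all 3 nodes and made sure the griddisks were in the active state.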
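
As a sketch of how the griddisk check could be automated: the helper below filters cellcli output for griddisks that are not active. The function name inactive_griddisks is hypothetical, and the cellcli commands in the comments are the standard storage-server commands, assumed here rather than taken from the original post.

```shell
# Read "name status" pairs (as printed by
# `cellcli -e list griddisk attributes name,status`) from stdin
# and print the names of griddisks whose status is not "active".
inactive_griddisks() {
    awk 'NF >= 2 && $2 != "active" { print $1 }'
}

# Typical use on each storage server (as root or celladmin):
#   cellcli -e alter cell startup services all
#   cellcli -e list griddisk attributes name,status | inactive_griddisks
```

Empty output from the filter means every listed griddisk reports active.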


While the heartbeat network was down, startup failed with the following error:

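6. Restarted the cluster services on both database nodes.

After 3 minutes the database cluster was back to normal, and the application checks passed.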
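
The restart and the wait for recovery can be sketched as follows. The post does not show the exact commands used, so the crsctl calls in the comments are an assumption (they are the usual way to bounce the stack as root), and the function name crs_all_online is made up for illustration.

```shell
# Succeeds when `crsctl check crs` output on stdin reports all four
# services (OHAS, CRS, CSS, EVM) as online.
crs_all_online() {
    [ "$(grep -c 'is online')" -eq 4 ]
}

# Typical use, as root on each database node:
#   crsctl stop crs -f
#   crsctl start crs
#   until crsctl check crs | crs_all_online; do sleep 10; done
#   crsctl stat res -t
```

Polling crsctl check crs avoids guessing how long the stack needs; in this incident it took about 3 minutes.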
7. Logged in to ILOM on site to find the cause; it showed that a power supply switchover had occurred.

8. References:

https://www.modb.pro/db/61156
https://www.talkwithtrend.com/Article/177765
https://blog.csdn.net/weixin_41607523/article/details/134125667
Last modified: 2023-11-01 09:44:16