点击上方“IT那活儿”公众号,关注后了解更多内容,不管IT什么活儿,干就完了!!!


问题故障原因定位
从错误日志分析,错误原因的数据库连接池的连接达到最大设定值,导致系统无法获取新的数据库连接导致。
数据库连接池连接最大值设置为200,如没有特殊情况发生,系统不会出现连接数不足的现象。按以往的经验造成连接数不足现象的原因有以下几个:
系统发生数据库死锁导致大量连接被锁定; 数据库系统超负荷运转导致反应变慢; 网络连接问题导致无法连接数据库; 系统有大数据量操作导致; 系统程序有未释放连接的程序bug,导致连接被使用光; 数据库其他问题导致无法获取连接。
通过排查该应用系统各个层面的影响因素,最终将故障原因定位在连接池上。
问题分析过程
1. 首先检查数据库是否有死锁
[7/27/14 15:51:54:938 GMT+08:00] 000000db AcfCoreServle E com.ursa.acf.plugins.todomessage.impl.DealReadToDoMsgMgr list error.plugin.user.error_sqlerror
com.ursa.acf.plugins.user.UserPluginException: error.plugin.user.error_sqlerror
at com.ursa.acf.plugins.user.ursaimpl.UrsaUser.j(UrsaUser.java:646)
at com.ursa.acf.plugins.user.ursaimpl.UrsaUser.<init>(UrsaUser.java:53)
at com.ursa.acf.plugins.user.ursaimpl.UrsaUserManager.getUser(UrsaUserManager.java:178)
at com.ursa.acf.plugins.user.UserHelper.getUser(UserHelper.java:52)
at com.ursa.acf.plugins.todomessage.impl.DealReadToDoMsgMgr.list(DealReadToDoMsgMgr.java:132)
at com.itss.xz.test.login.AcfPanel_DBSY.setPanel(AcfPanel_DBSY.java:265)
at com.ursa.acf.plugins.acfpage.acfpanel.AcfPanelShow.acfShow(AcfPanelShow.java:65)
at com.ursa.acf.plugins.acfshow.AcfShow.acfExecute(AcfShow.java:38)
at com.ursa.acf.core.servlet.AcfCoreAction.execute(AcfCoreAction.java:62)
at org.apache.struts.action.RequestProcessor.processActionPerform(RequestProcessor.java:419)
at org.apache.struts.action.RequestProcessor.process(RequestProcessor.java:224)
at org.apache.struts.action.ActionServlet.process(ActionServlet.java:1194)
at org.apache.struts.action.ActionServlet.doGet(ActionServlet.java:414)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:575)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:668)
at com.ibm.ws.webcontainer.servlet.ServletWrapper.service(ServletWrapper.java:1147)
at com.ibm.ws.webcontainer.servlet.ServletWrapper.handleRequest(ServletWrapper.java:722)
at com.ibm.ws.webcontainer.servlet.ServletWrapper.handleRequest(ServletWrapper.java:449)
at com.ibm.ws.webcontainer.servlet.ServletWrapperImpl.handleRequest(ServletWrapperImpl.java:178)
at com.ibm.ws.webcontainer.filter.WebAppFilterChain.invokeTarget(WebAppFilterChain.java:125)
at com.ibm.ws.webcontainer.filter.WebAppFilterChain.doFilter(WebAppFilterChain.java:92)
at com.ursa.acf.core.EncodingFilter.doFilter(EncodingFilter.java:46)
at com.ibm.ws.webcontainer.filter.FilterInstanceWrapper.doFilter(FilterInstanceWrapper.java:192)
at com.ibm.ws.webcontainer.filter.WebAppFilterChain.doFilter(WebAppFilterChain.java:89)
at com.ibm.ws.webcontainer.filter.WebAppFilterManager.doFilter(WebAppFilterManager.java:919)
at com.ibm.ws.webcontainer.filter.WebAppFilterManager.invokeFilters(WebAppFilterManager.java:1016)
at com.ibm.ws.webcontainer.servlet.CacheServletWrapper.handleRequest(CacheServletWrapper.java:87)
at com.ibm.ws.webcontainer.WebContainer.handleRequest(WebContainer.java:883)
at com.ibm.ws.webcontainer.WSWebContainer.handleRequest(WSWebContainer.java:1659)
at com.ibm.ws.webcontainer.channel.WCChannelLink.ready(WCChannelLink.java:195)
at com.ibm.ws.http.channel.inbound.impl.HttpInboundLink.handleDiscrimination(HttpInboundLink.java:452)
at com.ibm.ws.http.channel.inbound.impl.HttpInboundLink.handleNewRequest(HttpInboundLink.java:511)
at com.ibm.ws.http.channel.inbound.impl.HttpInboundLink.processRequest(HttpInboundLink.java:305)
at com.ibm.ws.http.channel.inbound.impl.HttpInboundLink.ready(HttpInboundLink.java:276)
at com.ibm.ws.tcp.channel.impl.NewConnectionInitialReadCallback.sendToDiscriminators(NewConnectionInitialReadCallback.java:214)
at com.ibm.ws.tcp.channel.impl.NewConnectionInitialReadCallback.complete(NewConnectionInitialReadCallback.java:113)
at com.ibm.ws.tcp.channel.impl.AioReadCompletionListener.futureCompleted(AioReadCompletionListener.java:165)
at com.ibm.io.async.AbstractAsyncFuture.invokeCallback(AbstractAsyncFuture.java:217)
at com.ibm.io.async.AsyncChannelFuture$1.run(AsyncChannelFuture.java:205)
at com.ibm.ws.util.ThreadPool$Worker.run(ThreadPool.java:1648)
Caused by: org.apache.commons.dbcp.SQLNestedException: Cannot get a connection, pool error Timeout waiting for idle object
at org.apache.commons.dbcp.PoolingDataSource.getConnection(PoolingDataSource.java:114)
at org.apache.commons.dbcp.BasicDataSource.getConnection(BasicDataSource.java:1044)
at com.ursa.acf.component.datasource.impl.StrutsDSManager.getConnection(StrutsDSManager.java:107)
at com.ursa.acf.component.datasource.DatabaseHelper.getConnection(DatabaseHelper.java:42)
at com.ursa.acf.plugins.user.ursaimpl.UrsaUser.j(UrsaUser.java:629)
... 39 more
Caused by: java.util.NoSuchElementException: Timeout waiting for idle object
at org.apache.commons.pool.impl.GenericObjectPool.borrowObject(GenericObjectPool.java:1144)
at org.apache.commons.dbcp.AbandonedObjectPool.borrowObject(AbandonedObjectPool.java:79)
at org.apache.commons.dbcp.PoolingDataSource.getConnection(PoolingDataSource.java:106)
... 43 more
2节点:
[7/28/14 10:41:32:109 GMT+08:00] 00004676 filter E com.ibm.ws.webcontainer.filter.FilterInstanceWrapper service SRVE8109W: Uncaught exception thrown by filter EncodingFilter: java.io.FileNotFoundException: SRVE0190E: File not found: xzpage/bjyc/fs/resources/styles/ursadefault.css
at com.ibm.ws.webcontainer.extension.DefaultExtensionProcessor.handleRequest(DefaultExtensionProcessor.java:443)
at com.ibm.ws.webcontainer.filter.WebAppFilterChain.invokeTarget(WebAppFilterChain.java:125)
at com.ibm.ws.webcontainer.filter.WebAppFilterChain.doFilter(WebAppFilterChain.java:92)
at com.ursa.acf.core.EncodingFilter.doFilter(EncodingFilter.java:46)
at com.ibm.ws.webcontainer.filter.FilterInstanceWrapper.doFilter(FilterInstanceWrapper.java:192)
at com.ibm.ws.webcontainer.filter.WebAppFilterChain.doFilter(WebAppFilterChain.java:89)
at com.ibm.ws.webcontainer.filter.WebAppFilterManager.doFilter(WebAppFilterManager.java:919)
at com.ibm.ws.webcontainer.filter.WebAppFilterManager.invokeFilters(WebAppFilterManager.java:1016)
at com.ibm.ws.webcontainer.webapp.WebApp.handleRequest(WebApp.java:3639)
at com.ibm.ws.webcontainer.webapp.WebGroup.handleRequest(WebGroup.java:304)
at com.ibm.ws.webcontainer.WebContainer.handleRequest(WebContainer.java:950)
at com.ibm.ws.webcontainer.WSWebContainer.handleRequest(WSWebContainer.java:1659)
at com.ibm.ws.webcontainer.channel.WCChannelLink.ready(WCChannelLink.java:195)
at com.ibm.ws.http.channel.inbound.impl.HttpInboundLink.handleDiscrimination(HttpInboundLink.java:452)
at com.ibm.ws.http.channel.inbound.impl.HttpInboundLink.handleNewRequest(HttpInboundLink.java:511)
at com.ibm.ws.http.channel.inbound.impl.HttpInboundLink.processRequest(HttpInboundLink.java:305)
at com.ibm.ws.http.channel.inbound.impl.HttpICLReadCallback.complete(HttpICLReadCallback.java:83)
at com.ibm.ws.tcp.channel.impl.AioReadCompletionListener.futureCompleted(AioReadCompletionListener.java:165)
at com.ibm.io.async.AbstractAsyncFuture.invokeCallback(AbstractAsyncFuture.java:217)
at com.ibm.io.async.AsyncChannelFuture.fireCompletionActions(AsyncChannelFuture.java:161)
at com.ibm.io.async.AsyncFuture.completed(AsyncFuture.java:138)
at com.ibm.io.async.ResultHandler.complete(ResultHandler.java:204)
at com.ibm.io.async.ResultHandler.runEventProcessingLoop(ResultHandler.java:775)
at com.ibm.io.async.ResultHandler$2.run(ResultHandler.java:905)
at com.ibm.ws.util.ThreadPool$Worker.run(ThreadPool.java:1648)
1节点:
[7/28/14 8:47:21:557 GMT+08:00] 00000036 filter E com.ibm.ws.webcontainer.filter.FilterInstanceWrapper service SRVE8109W: Uncaught exception thrown by filter EncodingFilter: java.io.FileNotFoundException: SRVE0190E: File not found: xzdefault.css
at com.ibm.ws.webcontainer.extension.DefaultExtensionProcessor.handleRequest(DefaultExtensionProcessor.java:443)
at com.ibm.ws.webcontainer.filter.WebAppFilterChain.invokeTarget(WebAppFilterChain.java:125)
at com.ibm.ws.webcontainer.filter.WebAppFilterChain.doFilter(WebAppFilterChain.java:92)
at com.ursa.acf.core.EncodingFilter.doFilter(EncodingFilter.java:46)
at com.ibm.ws.webcontainer.filter.FilterInstanceWrapper.doFilter(FilterInstanceWrapper.java:192)
at com.ibm.ws.webcontainer.filter.WebAppFilterChain.doFilter(WebAppFilterChain.java:89)
at com.ibm.ws.webcontainer.filter.WebAppFilterManager.doFilter(WebAppFilterManager.java:919)
at com.ibm.ws.webcontainer.filter.WebAppFilterManager.invokeFilters(WebAppFilterManager.java:1016)
at com.ibm.ws.webcontainer.webapp.WebApp.handleRequest(WebApp.java:3639)
at com.ibm.ws.webcontainer.webapp.WebGroup.handleRequest(WebGroup.java:304)
at com.ibm.ws.webcontainer.WebContainer.handleRequest(WebContainer.java:950)
at com.ibm.ws.webcontainer.WSWebContainer.handleRequest(WSWebContainer.java:1659)
at com.ibm.ws.webcontainer.channel.WCChannelLink.ready(WCChannelLink.java:195)
at com.ibm.ws.http.channel.inbound.impl.HttpInboundLink.handleDiscrimination(HttpInboundLink.java:452)
at com.ibm.ws.http.channel.inbound.impl.HttpInboundLink.handleNewRequest(HttpInboundLink.java:511)
at com.ibm.ws.http.channel.inbound.impl.HttpInboundLink.processRequest(HttpInboundLink.java:305)
at com.ibm.ws.http.channel.inbound.impl.HttpICLReadCallback.complete(HttpICLReadCallback.java:83)
at com.ibm.ws.tcp.channel.impl.AioReadCompletionListener.futureCompleted(AioReadCompletionListener.java:165)
at com.ibm.io.async.AbstractAsyncFuture.invokeCallback(AbstractAsyncFuture.java:217)
at com.ibm.io.async.AsyncChannelFuture.fireCompletionActions(AsyncChannelFuture.java:161)
at com.ibm.io.async.AsyncFuture.completed(AsyncFuture.java:138)
at com.ibm.io.async.ResultHandler.complete(ResultHandler.java:204)
at com.ibm.io.async.ResultHandler.runEventProcessingLoop(ResultHandler.java:775)
at com.ibm.io.async.ResultHandler$2.run(ResultHandler.java:905)
at com.ibm.ws.util.ThreadPool$Worker.run(ThreadPool.java:1648)
应用调用数据库使用的常连接,意味着即使用户没有在做业务的情况下,也一直不释放而占用着数据库连接的资源,也会影响别的用户访问数据库。
系统优化整改要求
根据连接池问题,系统优化整改要求如下:
1. 增加Apache上的最大连接数
2. 大数据量操作的优化建议
对大数据量操作增加操作日志,以及对可能发生大数据量操作的程序进行以下优化。
收文查询打印功能限制最大打印数量(小于1000条),超过数量会在打印的文档最后输出“打印数量超过1000条请优化查询后再进行打印”; 收文查询记录查询日志,记录查询人、查询时间、查询sql等信息; 归档操作记录日志,记录归档人、归档时间、归档条数等日志; 对收文查询、归档操作实现屏蔽重复操作的功能(屏蔽双击或多次快速点击操作)。
3. 建议查看如下方法中的代码问题
A) at com.ursa.acf.core.EncodingFilter.doFilter(EncodingFilter.java:46)
中doFilter(EncodingFilter.java:46)方法中的46行代码
B) com.ursa.acf.component.datasource.impl.StrutsDSManager.getConnection
(StrutsDSManager.java:107)
中getConnection()方法中的107行代码
C) com.ursa.acf.component.datasource.DatabaseHelper.getConnection
(DatabaseHelper.java:42)
中getConnection(DatabaseHelper.java:42) 方法中的42行代码
4. 依据用户的应用使用习惯进行适当调整
5. Websphere应用服务器的优化
6. 升级WAS主机的操作系统及中间件
7. 升级前端Apache分发机制的负载均衡功能
主要有两个原因:
硬件负载均衡设备可根据连接数、后台服务器的负载压力判断实现真正的负载均衡;而Apache基本上是采用轮询的方式。 另外用apache、tomcat这种部署方式导致主分发节点成为了单点,一旦这个节点出现故障,整个系统就会瘫痪;而硬件负载均衡设备可以实现每个节点均可充当主节点,大大提升系统可靠性和高可用性。
强烈建议:如系统后续有升级改造,在资金宽裕的情况下,采用硬件负载均衡设备。
应急预案
数据抓取内容如下:
连接数抓取,用于分析数据库连接使用情况。 执行sql如下: select username,machine,count(username) from v$session where username is not null and username <> 'SYS' group by machine,username order by machine 分别在数据库服务器1和2上执行。 应用服务器日志抓取,分析程序错误。当前日志保存5个历史文件,如有问题发生很有可能会大量产生日志,导致前面的日志被冲掉,所以需要加大日志保存的数量和时间。 数据库锁情况抓取。 应用服务器内存快照抓取,用于分析问题发生时系统正在运行的程序。 应用服务器内存、cpu使用情况抓取。

本文作者:李松桦(上海新炬王翦团队)
本文来源:“IT那活儿”公众号

文章转载自IT那活儿,如果涉嫌侵权,请发送邮件至:contact@modb.pro进行举报,并提供相关证据,一经查实,墨天轮将立刻删除相关内容。




