
作者:任仲禹,爱可生数据库工程师,擅长故障分析和性能优化。
爱可生开源社区出品,原创内容未经授权不得随意使用,转载请联系小编并注明来源。
本文约 1900 字,预计阅读需要 8 分钟。

故障背景
某客户生产环境 OceanBase(Oracle 模式)数据库 -> Oracle 增量同步链路(DDL + DML)延迟过大,Incr-Sync 组件 位点无法推进,位点卡在 2025-04-29 12:08:25
左右,增加了业务对于同步数据读取延迟风险。
OMS 迁移工具页面中的故障现象如下(延迟时间持续增大)。

分析过程
检查组件 connector.log 日志
登录 Incr-Sync 组件[1] 所在服务器容器内,发现日志中在反复执行一条创建索引 DDL 并且报错 java.sql.SQLRecoverableException: IO Error: Socket read timed out
。
[2025-04-29 14:48:01.395] [WARN] [sinkTask-9] [Need reset connection oms record [meta[prd_dbarenzy.GDBARENZYU-BFSDRENZYRENZYRENZYRENZY,1745899708,DDL,1745899708]post[Struct{ddl(VAR_CHAR_STRING)=CREATE INDEX "IDX_JY
MX_UNIT_STATUS" ON "BFSDRENZYRENZYRENZYRENZY"("ACCOUNRENZY","TRANSSRENZY"),env(VAR_CHAR_STRING)=1745899708724080,ddlType(VAR_STRING)=CREATE_INDEX,srcDatabase(VAR_STRING)=prd_dbarenzy.GDBARENZYU,srcTable(VAR_STRING)
=BFSDRENZYRENZYRENZYRENZY,destDatabase(VAR_STRING)=GDBARENZYU,destTable(VAR_STRING)=BFSDRENZYRENZYRENZYRENZY,originalDDL(VAR_STRING)=CREATE INDEX IDX_JYMX_UNIT_STATUS on BFSDRENZYRENZYRENZYRENZY(ACCOUNRENZY,TRANSSTA
TUS);,affectObjects(LIST)={"db":"GDBARENZYU","table":"BFSDRENZYRENZYRENZYRENZY"}}]], exception message [java.sql.SQLRecoverableException: IO Error: Socket read timed out], errorCode [17002], sqlState [08006]]
[2025-04-29 14:48:01.396] [WARN] [sinkTask-9] [retry for exception]
java.sql.SQLRecoverableException: IO Error: Socket read timed out
at oracle.jdbc.driver.T4CPreparedStatement.executeForRows(T4CPreparedStatement.java:993)
at oracle.jdbc.driver.OracleStatement.doExecuteWithTimeout(OracleStatement.java:1205)
at oracle.jdbc.driver.OraclePreparedStatement.executeInternal(OraclePreparedStatement.java:3666)
at oracle.jdbc.driver.T4CPreparedStatement.executeInternal(T4CPreparedStatement.java:1426)
at oracle.jdbc.driver.OraclePreparedStatement.executeLargeUpdate(OraclePreparedStatement.java:3756)
at oracle.jdbc.driver.OraclePreparedStatement.executeUpdate(OraclePreparedStatement.java:3736)
at oracle.jdbc.driver.OraclePreparedStatementWrapper.executeUpdate(OraclePreparedStatementWrapper.java:1063)
at com.alibaba.druid.filter.FilterChainImpl.preparedStatement_executeUpdate(FilterChainImpl.java:3253)
at com.alibaba.druid.filter.FilterAdapter.preparedStatement_executeUpdate(FilterAdapter.java:1092)
at com.alibaba.druid.filter.FilterEventAdapter.preparedStatement_executeUpdate(FilterEventAdapter.java:491)
at com.alibaba.druid.filter.FilterChainImpl.preparedStatement_executeUpdate(FilterChainImpl.java:3251)
at com.alibaba.druid.proxy.jdbc.PreparedStatementProxyImpl.executeUpdate(PreparedStatementProxyImpl.java:194)
at com.alibaba.druid.pool.DruidPooledPreparedStatement.executeUpdate(DruidPooledPreparedStatement.java:255)
at com.oceanbase.oms.connector.jdbc.sink.Writer.writeDDL(Writer.java:641)
at com.oceanbase.oms.connector.jdbc.sink.Writer.executeInRetry(Writer.java:480)
at com.oceanbase.oms.connector.jdbc.sink.Writer.flushDDL(Writer.java:435)
at com.oceanbase.oms.connector.jdbc.sink.Writer.flushRecords(Writer.java:182)
at com.oceanbase.oms.connector.jdbc.sink.DefaultJDBCSink.offer(DefaultJDBCSink.java:60)
at com.oceanbase.connector.framework.threadmanager.sinktask.SyncSinkConnectorTask.run(SyncSinkConnectorTask.java:66)
at java.lang.Thread.run(Thread.java:750)
Caused by: oracle.net.nt.TimeoutInterruptHandler$IOReadTimeoutException: Socket read timed out
at oracle.net.nt.TimeoutSocketChannel.handleInterrupt(TimeoutSocketChannel.java:254)
at oracle.net.nt.TimeoutSocketChannel.read(TimeoutSocketChannel.java:180)
at oracle.net.ns.NSProtocolNIO.doSocketRead(NSProtocolNIO.java:555)
at oracle.net.ns.NIOPacket.readHeader(NIOPacket.java:258)
at oracle.net.ns.NIOPacket.readPacketFromSocketChannel(NIOPacket.java:190)
at oracle.net.ns.NIOPacket.readFromSocketChannel(NIOPacket.java:132)
at oracle.net.ns.NIOPacket.readFromSocketChannel(NIOPacket.java:105)
at oracle.net.ns.NIONSDataChannel.readDataFromSocketChannel(NIONSDataChannel.java:91)
at oracle.net.ano.CryptoNIONSDataChannel.readDataFromSocketChannel(Unknown Source)
at oracle.jdbc.driver.T4CMAREngineNIO.prepareForUnmarshall(T4CMAREngineNIO.java:784)
at oracle.jdbc.driver.T4CMAREngineNIO.unmarshalUB1(T4CMAREngineNIO.java:449)
at oracle.jdbc.driver.T4CTTIfun.receive(T4CTTIfun.java:410)
at oracle.jdbc.driver.T4CTTIfun.doRPC(T4CTTIfun.java:269)
at oracle.jdbc.driver.T4C8Oall.doOALL(T4C8Oall.java:655)
at oracle.jdbc.driver.T4CPreparedStatement.doOall8(T4CPreparedStatement.java:270)
at oracle.jdbc.driver.T4CPreparedStatement.doOall8(T4CPreparedStatement.java:91)
at oracle.jdbc.driver.T4CPreparedStatement.executeForRows(T4CPreparedStatement.java:970)
... 19 common frames omitted
[2025-04-29 14:48:01.396] [WARN] [sinkTask-9] [rollback txn meet exception [{}], ignore it]
java.sql.SQLRecoverableException: Closed Connection
at oracle.jdbc.driver.PhysicalConnection.getAutoCommit(PhysicalConnection.java:2089)
at oracle.jdbc.driver.PhysicalConnection.rollback(PhysicalConnection.java:2218)
at com.alibaba.druid.filter.FilterChainImpl.connection_rollback(FilterChainImpl.java:714)
at com.alibaba.druid.filter.FilterAdapter.connection_rollback(FilterAdapter.java:974)
at com.alibaba.druid.filter.FilterChainImpl.connection_rollback(FilterChainImpl.java:709)
at com.alibaba.druid.proxy.jdbc.ConnectionProxyImpl.rollback(ConnectionProxyImpl.java:399)
at com.alibaba.druid.pool.DruidPooledConnection.rollback(DruidPooledConnection.java:814)
at com.oceanbase.oms.connector.jdbc.sink.Writer.rollbackConnection(Writer.java:698)
at com.oceanbase.oms.connector.jdbc.sink.Writer.handleRetry(Writer.java:708)
at com.oceanbase.oms.connector.jdbc.sink.Writer.executeInRetry(Writer.java:509)
at com.oceanbase.oms.connector.jdbc.sink.Writer.flushDDL(Writer.java:435)
at com.oceanbase.oms.connector.jdbc.sink.Writer.flushRecords(Writer.java:182)
at com.oceanbase.oms.connector.jdbc.sink.DefaultJDBCSink.offer(DefaultJDBCSink.java:60)
at com.oceanbase.connector.framework.threadmanager.sinktask.SyncSinkConnectorTask.run(SyncSinkConnectorTask.java:66)
at java.lang.Thread.run(Thread.java:750)
[2025-04-29 14:48:01.396] [ERROR] [sinkTask-9] [recyle error]
java.sql.SQLRecoverableException: Closed Connection
at oracle.jdbc.driver.PhysicalConnection.setAutoCommit(PhysicalConnection.java:2067)
at com.alibaba.druid.filter.FilterChainImpl.connection_setAutoCommit(FilterChainImpl.java:738)
at com.alibaba.druid.filter.FilterAdapter.connection_setAutoCommit(FilterAdapter.java:986)
at com.alibaba.druid.filter.FilterChainImpl.connection_setAutoCommit(FilterChainImpl.java:733)
at com.alibaba.druid.proxy.jdbc.ConnectionProxyImpl.setAutoCommit(ConnectionProxyImpl.java:429)
at com.alibaba.druid.pool.DruidConnectionHolder.reset(DruidConnectionHolder.java:312)
at com.alibaba.druid.pool.DruidDataSource.recycle(DruidDataSource.java:1949)
at com.alibaba.druid.pool.DruidPooledConnection.recycle(DruidPooledConnection.java:351)
at com.alibaba.druid.filter.FilterChainImpl.dataSource_recycle(FilterChainImpl.java:5049)
at com.alibaba.druid.filter.FilterAdapter.dataSource_releaseConnection(FilterAdapter.java:2750)
at com.alibaba.druid.filter.FilterChainImpl.dataSource_recycle(FilterChainImpl.java:5045)
at com.alibaba.druid.pool.DruidPooledConnection.close(DruidPooledConnection.java:286)
at com.oceanbase.oms.connector.jdbc.sink.AbstractDruidDataSourceBuilder.closeConnection(AbstractDruidDataSourceBuilder.java:47)
at com.oceanbase.oms.connector.jdbc.sink.Writer.closeConnection(Writer.java:752)
at com.oceanbase.oms.connector.jdbc.sink.Writer.executeInRetry(Writer.java:530)
at com.oceanbase.oms.connector.jdbc.sink.Writer.flushDDL(Writer.java:435)
at com.oceanbase.oms.connector.jdbc.sink.Writer.flushRecords(Writer.java:182)
at com.oceanbase.oms.connector.jdbc.sink.DefaultJDBCSink.offer(DefaultJDBCSink.java:60)
at com.oceanbase.connector.framework.threadmanager.sinktask.SyncSinkConnectorTask.run(SyncSinkConnectorTask.java:66)
at java.lang.Thread.run(Thread.java:750)
检查 Oracle 中数据量
表数据量 3400 万条左右,索引创建语句确实需要一定时间。

检查 connector_sink_msg.log
OMS 的 Incr-sync 组件 的消费端(sink)日志迟迟无新的 DML 记录被消费(一直卡在 12:08:30 秒左右),基本能确定其他 DML 被这条 DDL 阻塞了。
本例与 OceanBase 官网上的一篇文章[2] 描述相同,但结论(出现该问题,您需要耐心等待 DDL 执行结束)在本例中无效,因为该条 DDL 一直是反复执行报错。

参数 oracleSocketTimeout 的问题
因为 connector.log
日志中报错是 java.sql.SQLRecoverableException: IO Error: Socket read timed out
,该报错按过往经验其实就是:OMS 的 Incr-sync 组件 与目标端 Oracle 数据库建立的 JDBC 连接因 SQL 执行超过 SocketTimeout 阈值而断链。
大致推测出原因,我们就可以分析下 OMS 官网上 incr-sync 组件 与 SocketTimeout 参数相关内容[3],看是否有线索。 确实存在参数 oracleSocketTimeout ,默认 50s 超时,本例中调大尝试能否解决问题。

修改方式
更新 Incr-sync 组件,在 sink 模块中,增加 oracleSocketTimeout 参数,并设置 1800s 超时。

检查参数生效
修改完成后,connector.log
日志会打印 Incr-sync 和 Oracle 间建立的 JDBC 参数。
jdbc:oracle:thin:@//10.19.51.23:1521/zdbrenzy.actionskydba.com ?rewriteBatchedStatements=true&GetConnectionTimeout=20000&socketTimeout=1800000
105[2025-04-29 15:00:08.377] [INFO] [main] [[Init DruidDataSource]configInitSQLS: []]
106[2025-04-29 15:00:08.378] [INFO] [main] [connect with url [jdbc:oracle:thin:@//10.19.51.23:1521/zdbrenzy.actionskydba.com ?rewriteBatchedStatements=true&GetConnectionTimeout=20000&socketTimeout=1800000&connectTimeout=30000&allowMultiQueries=true&useLocalSessionState=true]]
检查 DDL 执行完成
等待一段时间,incr-sync 的 msg/ddl_msg.log
日志显示该 DDL 成功被消费。
1[2025-04-29 15:00:17.560] [forward_slot0-(ETLProcessor)-queue_slot1] receive ddl:CREATE INDEX IDX_JYMX_UNIT_STATUS on BFSDRENZYRENZYRENZYRENZY(ACCOUNRENZY,TRANSSRENZY);, checkpoint[1745899707]
2[2025-04-29 15:00:19.041] [forward_slot0-(ETLProcessor)-queue_slot1] DDLTransfer: transfer ddl [CREATE INDEX IDX_JYMX_UNIT_STATUS on BFSDRENZYRENZYRENZYRENZY(ACCOUNRENZY,TRANSSRENZY);] to ddl [[CREATE INDEX "IDX_JYMX_UNIT_STATUS" ON "BFSDRENZYRENZYRENZYRENZY"("ACCOUNRENZY","TRANSSRENZY")]]
3[2025-04-29 15:01:48.953] [sinkTask-30] set sql:alter session set current_schema="GDBARENZYU", recordDb:GDBARENZYU
Oracle 中索引创建成功。

界面组件监控状态可正常推进,问题解决。

原因和结论
本例故障中 DDL 超时阻塞 OMS 的 Incr-Sync 增量同步组件位点推进的直接原因是 DDL 反复执行失败导致;
但根本原因是 oracleSocketTimeout 参数默认值太小(50s)、表的数据量过大,从而使 Incr-sync 组件与目标端 Oracle 数据库建立的 JDBC 连接因 DDL-SQL 执行超过 SocketTimeout 阈值而断链。
该案例在生产实践中无法避免,一是业务表的数据量无法控制,二是不好预估 DDL 如列表表更、增添索引的执行耗时;建议的优化是:
Incy-Sync 增加 oracleSocketTimeout 参数,并将添加该参数列入运维规范(修改方式参考上文章节)。 修改参数默认值 50s -> 600s (具体值依实际环境而定)。
Incr-Sync 组件: https://www.oceanbase.com/docs/community-oms-cn-1000000003497874
[2]官网参考文章1: https://www.oceanbase.com/knowledge-base/oms-ee-1000000000253329
[3]官网参考文章2: https://www.oceanbase.com/docs/enterprise-oms-doc-cn-1000000000613559
本文关键字:#OceanBase #OMS


✨ Github:https://github.com/actiontech/sqle
📚 文档:https://actiontech.github.io/sqle-docs/
💻 官网:https://opensource.actionsky.com/sqle/
👥 微信群:请添加小助手加入 ActionOpenSource
🔗 商业支持:https://www.actionsky.com/sqle





