A small canal tweak, and the load dropped right away

虞大胆的叽叽喳喳 2022-02-15


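I recently added three instances to canal, and the load average shot past 100. The load had always been high, but now I could no longer put off dealing with it.

What is load on Linux? See the quotation below: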

System load averages is the average number of processes that are either in a runnable or uninterruptable state. A process in a runnable state is either using the CPU or waiting to use the CPU. A process in uninterruptable state is waiting for some I/O access, eg waiting for disk.

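A high system load means there are too many runnable processes on the system, and vmstat confirms this too (the r column of its output is the number of runnable processes).

CPU utilization is also very high, as vmstat or mpstat -P ALL shows. The same output shows that iowait is not high, and usr is not especially high either. So what is driving the load up?

For reference, here is how mpstat explains %iowait: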
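To put a load figure in context, it helps to divide it by the number of cores. The following is a minimal sketch, assuming a Linux host where `/proc/loadavg` is readable; the helper names `parse_loadavg` and `load_per_core` are made up for illustration and are not part of any tool mentioned here.

```python
import os


def parse_loadavg(text):
    """Return the 1-, 5- and 15-minute load averages from /proc/loadavg text."""
    fields = text.split()
    return tuple(float(x) for x in fields[:3])


def load_per_core(load, cores):
    """Load normalised by core count; values well above 1 mean tasks are queueing."""
    return load / cores


if __name__ == "__main__":
    cores = os.cpu_count() or 1
    try:
        with open("/proc/loadavg") as f:
            one, five, fifteen = parse_loadavg(f.read())
        print(f"1-min load {one} on {cores} cores = {load_per_core(one, cores):.2f} per core")
    except OSError:
        # /proc/loadavg only exists on Linux.
        print("cannot read /proc/loadavg on this system")
```

By this measure, a load above 100 on the 8-core machine described later in this post is more than 12 per core.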

Show the percentage of time that the CPU or CPUs were idle during which the system had an outstanding disk I/O request
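The usr and iowait percentages can be approximated by hand from two samples of the aggregate `cpu` line in `/proc/stat`. This is only a sketch of the idea, not how mpstat itself is implemented; the function names are hypothetical, and the sample lines in the usage note are invented numbers.

```python
def parse_cpu_line(line):
    """Parse the aggregate 'cpu' line of /proc/stat into named tick counters."""
    names = ("user", "nice", "system", "idle", "iowait", "irq", "softirq", "steal")
    parts = line.split()
    # parts[0] is the label "cpu"; the next eight fields are cumulative ticks.
    return dict(zip(names, (int(x) for x in parts[1:9])))


def percentages(before, after):
    """Share of elapsed ticks spent in each state between two samples."""
    deltas = {k: after[k] - before[k] for k in after}
    total = sum(deltas.values())
    if total == 0:
        return {k: 0.0 for k in deltas}
    return {k: 100.0 * v / total for k, v in deltas.items()}
```

For example, `percentages(parse_cpu_line("cpu 100 0 50 800 50 0 0 0 0 0"), parse_cpu_line("cpu 150 0 75 825 50 0 0 0 0 0"))` reports 50% user, 25% system, 25% idle and 0% iowait for that interval.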

In this case I already knew for certain that canal was the biggest bottleneck, so I looked at it with pidstat:

pidstat -ut 5 1 -p 23652

Pay attention to the %wait column, which is documented as:

Percentage of CPU spent by the task while waiting to run.

So threads are spending time waiting for a CPU to become free. Context switching, however, did not change dramatically:

pidstat -wt 3 -p 9094

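In particular there were not many involuntary context switches, which left me a bit frustrated.

Each added instance also adds one more replica binlog dump thread, and this canal server has roughly 14 instances, which MySQL's show full processlist confirms.

I then took a closer look at the canal.instance.parser.parallelThreadSize parameter: it was set to 16, while the machine has 8 cores. The documentation notes: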
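The same voluntary and involuntary counters can be read directly from the kernel. Below is a minimal sketch, assuming Linux, where `/proc/<pid>/status` (or `/proc/<pid>/task/<tid>/status` for a single thread) exposes `voluntary_ctxt_switches` and `nonvoluntary_ctxt_switches`; the function names are hypothetical, and this is not a claim about how pidstat is implemented.

```python
KEYS = ("voluntary_ctxt_switches", "nonvoluntary_ctxt_switches")


def parse_ctxt_switches(status_text):
    """Extract the cumulative context-switch counters from a status file's text."""
    counters = {}
    for line in status_text.splitlines():
        key, _, value = line.partition(":")
        if key in KEYS:
            counters[key] = int(value.strip())
    return counters


def switch_rates(before, after, interval_seconds):
    """Per-second rates between two samples taken interval_seconds apart."""
    return {k: (after[k] - before[k]) / interval_seconds for k in after}


def read_counters(pid):
    """Read the counters for one process (main thread only)."""
    with open(f"/proc/{pid}/status") as f:
        return parse_ctxt_switches(f.read())
```

Sampling `read_counters(pid)` twice, a few seconds apart, and passing both results to `switch_rates` gives per-second figures for that one task.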

concurrent thread number, default 60% available processors, suggest not to exceed Runtime.getRuntime().availableProcessors()

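So this is a CPU-heavy service. After I changed the value to 4, CPU basically stayed below 10. As for the cause, all I can say is that canal was running too many binlog-parsing threads; this parameter applies per instance, so the cost multiplies with the number of instances and burns a lot of CPU. Whether the change has any side effects is something to cover later.

One last remark: in some scenarios, high load and high CPU utilization do not mean the system has a problem.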
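A rough back-of-the-envelope calculation shows why a per-instance setting matters so much. The sketch below uses the figures from this post (about 14 instances, 8 cores, parallelThreadSize 16 versus 4) and assumes, as an upper bound, that every instance fills its whole parser pool and all of those threads are busy at once; the function names are hypothetical.

```python
def total_parser_threads(instances, parallel_thread_size):
    """Upper bound on parser threads when the setting applies per instance."""
    return instances * parallel_thread_size


def threads_per_core(instances, parallel_thread_size, cores):
    """How many potentially runnable parser threads compete for each core."""
    return total_parser_threads(instances, parallel_thread_size) / cores


if __name__ == "__main__":
    for size in (16, 4):
        total = total_parser_threads(14, size)
        ratio = threads_per_core(14, size, 8)
        print(f"parallelThreadSize={size}: up to {total} threads, {ratio} per core")
```

With 16 threads per instance that is up to 224 parser threads, or 28 per core; with 4 it is 56, or 7 per core.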
