暂无图片
暂无图片
1
暂无图片
暂无图片
暂无图片

梧桐数据库(WuTongDB):并发报错too many clients already处理

原创 2025-02-26
243

最近偶尔会接到客户反馈业务跑批过程中报错FALTA: sorry, too many clients already,本文结合实际情况对问题进行说明并提供一些处理建议。

【场景一】连接数超出上限

梧桐数据库的最大连接数受集群级guc参数max_connections限制,当业务并发数超过上限时,梧桐数据库不再允许新的请求并报错。这种情况下可以考虑适当调整max_connections参数提高数据库并发上限,或者降低业务并发。
实验环境降低max_connections为128,脚本内并发执行语句,可以看到业务并发超过限制后语句报错。

wutongdb=> show max_connections;
 max_connections 
-----------------
 128
(1 row)


[root@9acd57b29a26 ~]$ for i in {1..129}
> do
>   psql -p 7000 -d postgres -c "select pg_sleep(5)" &
> done
psql: error: FATAL:  sorry, too many clients already
[129]+ Exit 1                  psql -p 7000 -d postgres -c "select pg_sleep(5);"

【场景二】计算节点连接数配置不合理

为了在查询执行期间实现最大的并行度,梧桐数据库将查询计划进行切分。Slice是计划中可以独立进行处理的部分,上一层slice会读取下一层各个slice并进行计算。每一个slice可以看作一个单独的连接,所以计算节点的连接数一般大于实际的活跃会话数。
image.png
以一个简单查询为例,计划切分后被分发到4个vseg上进行计算,所以计算节点实际的连接数是1(slice数)X4(vseg数)。

wutongdb=> create table t1(a int, b int) distributed by (a); CREATE TABLE wutongdb=> explain select * from t1; QUERY PLAN ------------------------------------------------------------------------------------ Gather Motion 4:1 (slice1; segments: 4) (cost=0.00..1661.50 rows=109100 width=8) -> Seq Scan on t1 (cost=0.00..297.75 rows=27275 width=8) Optimizer: Postgres-based planner NewExecutor: ON (4 rows)

修改测试脚本,降低并发数为40,可以看到语句仍报错,但报错的节点变为了计算节点。在这种情况下,一般需要根据实际业务情况调大计算节点的max_segments数。

[root@9acd57b29a26 ~]$ for i in {1..40}
> do
>   psql -p 7000 -d postgres -c "select count(*), pg_sleep(5) from t1;" &
> done
ERROR:  error from proxy dispatcher 0-127.0.0.1:7002: FATAL:  sorry, too many clients already (cdbproxy.cc:272)
[40]+  Exit 1                  psql -p 7000 -d postgres -c "select count(*), pg_sleep(5) from t1;"

【场景三】复杂SQL

除了上述两种情况,有时SQL过于复杂或计划不合理也会导致资源消耗过大而连接超限,此时需要排查SQL占用资源过多是否合理并进行优化。
下面以一个join+agg的情况举例说明,可以看到当关闭ORCA优化器后语句生成的计划slice更多,需要占用更多的资源。

wutongdb=> explain select *,pg_sleep(1) from t1,(select max(b) b from t1) s2 where t1.b=s2.b; QUERY PLAN ------------------------------------------------------------------------------------------------ Hash Join (cost=0.00..862.00 rows=1 width=12) Hash Cond: (t1.b = (max(t1_1.b))) -> Gather Motion 4:1 (slice1; segments: 4) (cost=0.00..431.00 rows=1 width=8) -> Seq Scan on t1 (cost=0.00..431.00 rows=1 width=8) -> Hash (cost=431.00..431.00 rows=1 width=4) -> Aggregate (cost=0.00..431.00 rows=1 width=4) -> Gather Motion 4:1 (slice2; segments: 4) (cost=0.00..431.00 rows=1 width=4) -> Seq Scan on t1 t1_1 (cost=0.00..431.00 rows=1 width=4) Optimizer: GPORCA (9 rows) wutongdb=> set optimizer=off; SET wutongdb=> explain select *,pg_sleep(1) from t1,(select max(b) b from t1) s2 where t1.b=s2.b; QUERY PLAN ---------------------------------------------------------------------------------------------------------- Gather Motion 4:1 (slice1; segments: 4) (cost=1.08..2.17 rows=3 width=16) -> Hash Join (cost=1.08..2.13 rows=1 width=16) Hash Cond: (t1.b = (max(t1_1.b))) -> Redistribute Motion 4:4 (slice2; segments: 4) (cost=0.00..1.03 rows=1 width=8) Hash Key: t1.b -> Seq Scan on t1 (cost=0.00..1.01 rows=1 width=8) -> Hash (cost=1.07..1.07 rows=1 width=4) -> Redistribute Motion 1:4 (slice3; segments: 1) (cost=1.03..1.07 rows=1 width=4) Hash Key: (max(t1_1.b)) -> Aggregate (cost=1.03..1.04 rows=1 width=4) -> Gather Motion 4:1 (slice4; segments: 4) (cost=0.00..1.03 rows=1 width=4) -> Seq Scan on t1 t1_1 (cost=0.00..1.01 rows=1 width=4) Optimizer: Postgres-based planner (13 rows)

从实际测试结果也可以明显地看到同样并发数下计划更差的SQL执行会报错,而计划更优的语句可以正确执行。这种情况下需要结合实际情况进行SQL调优。

[root@9acd57b29a26 ~]$ for i in {1..15}
> do
>   psql -p 7000 -d postgres -c "set optimizer=on;select *, pg_sleep(5) from t1,(select max(b) b from t1) s2 where t1.b=s2.b;" &
> done
[1]   Done                    psql -p 7000 -d postgres -c "set optimizer=on;select *, pg_sleep(5) from t1,(select max(b) b from t1) s2 where t1.b=s2.b;"
[2]   Done                    psql -p 7000 -d postgres -c "set optimizer=on;select *, pg_sleep(5) from t1,(select max(b) b from t1) s2 where t1.b=s2.b;"
[3]   Done                    psql -p 7000 -d postgres -c "set optimizer=on;select *, pg_sleep(5) from t1,(select max(b) b from t1) s2 where t1.b=s2.b;"
[4]   Done                    psql -p 7000 -d postgres -c "set optimizer=on;select *, pg_sleep(5) from t1,(select max(b) b from t1) s2 where t1.b=s2.b;"
[5]   Done                    psql -p 7000 -d postgres -c "set optimizer=on;select *, pg_sleep(5) from t1,(select max(b) b from t1) s2 where t1.b=s2.b;"
[6]   Done                    psql -p 7000 -d postgres -c "set optimizer=on;select *, pg_sleep(5) from t1,(select max(b) b from t1) s2 where t1.b=s2.b;"
[7]   Done                    psql -p 7000 -d postgres -c "set optimizer=on;select *, pg_sleep(5) from t1,(select max(b) b from t1) s2 where t1.b=s2.b;"
[8]   Done                    psql -p 7000 -d postgres -c "set optimizer=on;select *, pg_sleep(5) from t1,(select max(b) b from t1) s2 where t1.b=s2.b;"
[9]   Done                    psql -p 7000 -d postgres -c "set optimizer=on;select *, pg_sleep(5) from t1,(select max(b) b from t1) s2 where t1.b=s2.b;"
[10]   Done                    psql -p 7000 -d postgres -c "set optimizer=on;select *, pg_sleep(5) from t1,(select max(b) b from t1) s2 where t1.b=s2.b;"
[11]   Done                    psql -p 7000 -d postgres -c "set optimizer=on;select *, pg_sleep(5) from t1,(select max(b) b from t1) s2 where t1.b=s2.b;"
[12]   Done                    psql -p 7000 -d postgres -c "set optimizer=on;select *, pg_sleep(5) from t1,(select max(b) b from t1) s2 where t1.b=s2.b;"
[13]   Done                    psql -p 7000 -d postgres -c "set optimizer=on;select *, pg_sleep(5) from t1,(select max(b) b from t1) s2 where t1.b=s2.b;"
[14]-  Done                    psql -p 7000 -d postgres -c "set optimizer=on;select *, pg_sleep(5) from t1,(select max(b) b from t1) s2 where t1.b=s2.b;"
[15]+  Done                    psql -p 7000 -d postgres -c "set optimizer=on;select *, pg_sleep(5) from t1,(select max(b) b from t1) s2 where t1.b=s2.b;"

[root@9acd57b29a26 ~]$ for i in {1..15}
> do
>   psql -p 7000 -d postgres -c "set optimizer=off;select *, pg_sleep(5) from t1,(select max(b) b from t1) s2 where t1.b=s2.b;" &
> done
...
ERROR:  error from proxy dispatcher 0-172.17.0.2:7002: FATAL:  sorry, too many clients already (cdbproxy.cc:272)
...
[1]   Done                    psql -p 7000 -d postgres -c "set optimizer=off;select *, pg_sleep(5) from t1,(select max(b) b from t1) s2 where t1.b=s2.b;"
[2]   Done                    psql -p 7000 -d postgres -c "set optimizer=off;select *, pg_sleep(5) from t1,(select max(b) b from t1) s2 where t1.b=s2.b;"
[3]   Done                    psql -p 7000 -d postgres -c "set optimizer=off;select *, pg_sleep(5) from t1,(select max(b) b from t1) s2 where t1.b=s2.b;"
[4]   Exit 1                  psql -p 7000 -d postgres -c "set optimizer=off;select *, pg_sleep(5) from t1,(select max(b) b from t1) s2 where t1.b=s2.b;"
[5]   Exit 1                  psql -p 7000 -d postgres -c "set optimizer=off;select *, pg_sleep(5) from t1,(select max(b) b from t1) s2 where t1.b=s2.b;"
[6]   Done                    psql -p 7000 -d postgres -c "set optimizer=off;select *, pg_sleep(5) from t1,(select max(b) b from t1) s2 where t1.b=s2.b;"
[7]   Done                    psql -p 7000 -d postgres -c "set optimizer=off;select *, pg_sleep(5) from t1,(select max(b) b from t1) s2 where t1.b=s2.b;"
[8]   Done                    psql -p 7000 -d postgres -c "set optimizer=off;select *, pg_sleep(5) from t1,(select max(b) b from t1) s2 where t1.b=s2.b;"
[9]   Exit 1                  psql -p 7000 -d postgres -c "set optimizer=off;select *, pg_sleep(5) from t1,(select max(b) b from t1) s2 where t1.b=s2.b;"
[10]   Done                    psql -p 7000 -d postgres -c "set optimizer=off;select *, pg_sleep(5) from t1,(select max(b) b from t1) s2 where t1.b=s2.b;"
[11]   Done                    psql -p 7000 -d postgres -c "set optimizer=off;select *, pg_sleep(5) from t1,(select max(b) b from t1) s2 where t1.b=s2.b;"
[12]   Done                    psql -p 7000 -d postgres -c "set optimizer=off;select *, pg_sleep(5) from t1,(select max(b) b from t1) s2 where t1.b=s2.b;"
[13]   Done                    psql -p 7000 -d postgres -c "set optimizer=off;select *, pg_sleep(5) from t1,(select max(b) b from t1) s2 where t1.b=s2.b;"
[14]-  Done                    psql -p 7000 -d postgres -c "set optimizer=off;select *, pg_sleep(5) from t1,(select max(b) b from t1) s2 where t1.b=s2.b;"
[15]+  Done                    psql -p 7000 -d postgres -c "set optimizer=off;select *, pg_sleep(5) from t1,(select max(b) b from t1) s2 where t1.b=s2.b;"
「喜欢这篇文章,您的关注和赞赏是给作者最好的鼓励」
关注作者
【版权声明】本文为墨天轮用户原创内容,转载时必须标注文章的来源(墨天轮),文章链接,文章作者等基本信息,否则作者和墨天轮有权追究责任。如果您发现墨天轮中有涉嫌抄袭或者侵权的内容,欢迎发送邮件至:contact@modb.pro进行举报,并提供相关证据,一经查实,墨天轮将立刻删除相关内容。

评论