暂无图片
暂无图片
暂无图片
暂无图片
暂无图片

PostgreSQL 14 preview - Speed up rechecking if relation needs to be vacuumed or analyze in autovacuum.

digoal 2020-12-09
393

作者

digoal

日期

2020-12-09

标签

PostgreSQL , 加速 , recheck , refresh statistic


背景

PG 14以前, 当收集表是否需要vacuum或analyze时, recheck过程需要刷新staticstic, 如果表很多, 同时vacuum worker工作进程很多, 导致refresh statistics 暴露性能问题.

例如CPU飙高.

PG 14 判断是否需要刷新statistics, 避免每次recheck都需要刷新statistics. 提高性能.

https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=e2ac3fed3b1c3281281eb530c220634030cd8084

```
Speed up rechecking if relation needs to be vacuumed or analyze in autovacuum.
author Fujii Masao fujii@postgresql.org
Tue, 8 Dec 2020 22:59:39 +0800 (23:59 +0900)
committer Fujii Masao fujii@postgresql.org
Tue, 8 Dec 2020 22:59:39 +0800 (23:59 +0900)
commit e2ac3fed3b1c3281281eb530c220634030cd8084
tree 6d3a1fd8138cba277553d3b283957b054639dadb tree | snapshot
parent 4e43ee88c28b725bb63e27609c1a717138fc7f39 commit | diff
Speed up rechecking if relation needs to be vacuumed or analyze in autovacuum.

After autovacuum collects the relations to vacuum or analyze, it rechecks
whether each relation still needs to be vacuumed or analyzed before actually
doing that. Previously this recheck could be a significant overhead
especially when there were a very large number of relations. This was
because each recheck forced the statistics to be refreshed, and the refresh
of the statistics for a very large number of relations could cause heavy
overhead. There was the report that this issue caused autovacuum workers
to have gotten “stuck” in a tight loop of table_recheck_autovac() that
rechecks whether a relation needs to be vacuumed or analyzed.

This commit speeds up the recheck by making autovacuum worker reuse
the previously-read statistics for the recheck if possible. Then if that
"stale" statistics says that a relation still needs to be vacuumed or analyzed,
autovacuum refreshes the statistics and does the recheck again.

The benchmark shows that the more relations exist and autovacuum workers
are running concurrently, the more this change reduces the autovacuum
execution time. For example, when there are 20,000 tables and 10 autovacuum
workers are running, the benchmark showed that the change improved
the performance of autovacuum more than three times. On the other hand,
even when there are only 1000 tables and only a single autovacuum worker
is running, the benchmark didn't show any big performance regression by
the change.

Firstly POC patch was proposed by Jim Nasby. As the result of discussion,
we used Tatsuhito Kasahara's version of the patch using the approach
suggested by Tom Lane.

Reported-by: Jim Nasby
Author: Tatsuhito Kasahara
Reviewed-by: Masahiko Sawada, Fujii Masao
Discussion: https://postgr.es/m/3FC6C2F2-8A47-44C0-B997-28830B5716D0@amazon.com
```

PostgreSQL 许愿链接

您的愿望将传达给PG kernel hacker、数据库厂商等, 帮助提高数据库产品质量和功能, 说不定下一个PG版本就有您提出的功能点. 针对非常好的提议,奖励限量版PG文化衫、纪念品、贴纸、PG热门书籍等,奖品丰富,快来许愿。开不开森.

9.9元购买3个月阿里云RDS PostgreSQL实例

PostgreSQL 解决方案集合

德哥 / digoal's github - 公益是一辈子的事.

digoal's wechat

文章转载自digoal,如果涉嫌侵权,请发送邮件至:contact@modb.pro进行举报,并提供相关证据,一经查实,墨天轮将立刻删除相关内容。

评论