PostgreSQL 垃圾回收参数优化 - maintenance_work_mem , autovacuum_work_mem

开源喵 2021-06-23

1292

9.4以前的版本，垃圾回收相关的内存参数maintenance_work_mem，9.4以及以后的版本为autovacuum_work_mem，如果没有设置autovacuum_work_mem，则使用maintenance_work_mem的设置。

这个参数设置的是内存大小有什么用呢？

这部分内存被用于记录垃圾tupleid，vacuum进程在进行表扫描时，当扫描到的垃圾记录ID占满了整个内存（autovacuum_work_mem或maintenance_work_mem），那么会停止扫描表，开始INDEX的扫描。

扫描INDEX时，清理索引中的哪些tuple，实际上是从刚才内存中记录的这些tupleid来进行匹配。

当所有索引都扫描并清理了一遍后，继续从刚才的位点开始扫描表。

过程如下：

1、palloc autovacuum_work_mem memory

2、scan table,

3、dead tuple's tupleid write to autovacuum_work_mem

4、when autovacuum_work_mem full (with dead tuples can vacuum)

5、record table scan offset.

6、scan indexs

7、vacuum index's dead tuple (these: index item's ctid in autovacuum_work_mem)

8、scan indexs end

9、continue scan table with prev's offset

...

显然，如果垃圾回收时autovacuum_work_mem太小，INDEX会被多次扫描，浪费资源，时间。

palloc autovacuum_work_mem memory 这部分内存是使用时分配，并不是直接全部使用掉maintenance_work_mem或autovacuum_work_mem设置的内存，PG代码中做了优化限制:

对于小表，可能申请少量内存，算法请参考如下代码（对于小表，申请的内存数会是保障可记录下整表的tupleid的内存数（当maintenance_work_mem或autovacuum_work_mem设置的内存大于这个值时））。

我已经在如下代码中进行了标注：

* MaxHeapTuplesPerPage is an upper bound on the number of tuples that can

* fit on one heap page. (Note that indexes could have more, because they

* use a smaller tuple header.) We arrive at the divisor because each tuple

* must be maxaligned, and it must have an associated item pointer.

* Note: with HOT, there could theoretically be more line pointers (not actual

* tuples) than this on a heap page. However we constrain the number of line

* pointers to this anyway, to avoid excessive line-pointer bloat and not

* require increases in the size of work arrays.

#define MaxHeapTuplesPerPage \

((int) ((BLCKSZ - SizeOfPageHeaderData) \

(MAXALIGN(SizeofHeapTupleHeader) + sizeof(ItemIdData))))

* Guesstimation of number of dead tuples per page. This is used to

* provide an upper limit to memory allocated when vacuuming small

* tables.

#define LAZY_ALLOC_TUPLES MaxHeapTuplesPerPage

* lazy_space_alloc - space allocation decisions for lazy vacuum

* See the comments at the head of this file for rationale.

static void

lazy_space_alloc(LVRelStats *vacrelstats, BlockNumber relblocks)

{

long maxtuples;

int vac_work_mem = IsAutoVacuumWorkerProcess() &&

autovacuum_work_mem != -1 ?

autovacuum_work_mem : maintenance_work_mem;

if (vacrelstats->hasindex)

{

maxtuples = (vac_work_mem * 1024L) sizeof(ItemPointerData);

maxtuples = Min(maxtuples, INT_MAX);

maxtuples = Min(maxtuples, MaxAllocSize sizeof(ItemPointerData));

* curious coding here to ensure the multiplication can't overflow */

这里保证了maintenance_work_mem或autovacuum_work_mem不会直接被使用光，

如果是小表，会palloc少量memory。

if ((BlockNumber) (maxtuples LAZY_ALLOC_TUPLES) > relblocks)

maxtuples = relblocks * LAZY_ALLOC_TUPLES;

* stay sane if small maintenance_work_mem */

maxtuples = Max(maxtuples, MaxHeapTuplesPerPage);

}

else

{

maxtuples = MaxHeapTuplesPerPage;

}

vacrelstats->num_dead_tuples = 0;

vacrelstats->max_dead_tuples = (int) maxtuples;

vacrelstats->dead_tuples = (ItemPointer)

palloc(maxtuples * sizeof(ItemPointerData));

}

maintenance_work_mem这个内存还有一个用途，创建索引时，maintenance_work_mem控制系统在构建索引时将使用的最大内存量。为了构建一个B树索引，必须对输入的数据进行排序，如果要排序的数据在maintenance_work_mem设定的内存中放置不下，它将会溢出到磁盘中。

例子

如何计算适合的内存大小

postgres=# show autovacuum_work_mem ;

autovacuum_work_mem

---------------------

1GB

(1 row)

postgres=# show maintenance_work_mem ;

maintenance_work_mem

----------------------

1GB

(1 row)

也就是说，最多有1GB的内存，用于记录一次vacuum时，一次性可存储的垃圾tuple的tupleid。

tupleid为6字节长度。

* ItemPointer:

* This is a pointer to an item within a disk page of a known file

* (for example, a cross-link from an index to its parent table).

* blkid tells us which block, posid tells us which entry in the linp

* (ItemIdData) array we want.

* Note: because there is an item pointer in each tuple header and index

* tuple header on disk, it's very important not to waste space with

* structure padding bytes. The struct is designed to be six bytes long

* (it contains three int16 fields) but a few compilers will pad it to

* eight bytes unless coerced. We apply appropriate persuasion where

* possible. If your compiler can't be made to play along, you'll waste

* lots of space.

typedef struct ItemPointerData

{

BlockIdData ip_blkid;

OffsetNumber ip_posid;

}

1G可存储1.7亿条dead tuple的tupleid。

postgres=# select 1024*1024*1024/6;

?column?

-----------

178956970

(1 row)

而自动垃圾回收是在什么条件下触发的呢？

src/backend/postmaster/autovacuum.c

* A table needs to be vacuumed if the number of dead tuples exceeds a

* threshold. This threshold is calculated as

* threshold = vac_base_thresh + vac_scale_factor * reltuples

vac_base_thresh: autovacuum_vacuum_threshold

vac_scale_factor: autovacuum_vacuum_scale_factor

postgres=# show autovacuum_vacuum_threshold ;

autovacuum_vacuum_threshold

-----------------------------

(1 row)

postgres=# show autovacuum_vacuum_scale_factor ;

autovacuum_vacuum_scale_factor

--------------------------------

0.2

(1 row)

以上设置，表示当垃圾记录数达到50+表大小乘以0.2时，会触发垃圾回收。

可以看成，垃圾记录约等于表大小的20%，触发垃圾回收。

那么1G能存下多大表的垃圾呢？约8.9亿条记录的表。

postgres=# select 1024*1024*1024/6/0.2;

?column?

--------------------

894784850

(1 row)

压力测试例子

postgres=# show log_autovacuum_min_duration ;

log_autovacuum_min_duration

-----------------------------

(1 row)

create table test(id int primary key, c1 int, c2 int, c3 int);

create index idx_test_1 on test (c1);

create index idx_test_2 on test (c2);

create index idx_test_3 on test (c3);

vi test.sql

\set id random(1,10000000)

insert into test values (:id,random()*100, random()*100,random()*100) on conflict (id) do update set c1=excluded.c1, c2=excluded.c2,c3=excluded.c3;

pgbench -M prepared -n -r -P 1 -f ./test.sql -c 32 -j 32 -T 1200

垃圾回收记录

2019-02-26 22:51:50.323 CST,,,35632,,5c755284.8b30,1,,2019-02-26 22:51:48 CST,36/22,0,LOG,00000,"automatic vacuum of table ""postgres.public.test"": index scans: 1

pages: 0 removed, 6312 remain, 2 skipped due to pins, 0 skipped frozen

tuples: 4631 removed, 1158251 remain, 1523 are dead but not yet removable, oldest xmin: 1262982800

buffer usage: 39523 hits, 1 misses, 1 dirtied

avg read rate: 0.004 MB/s, avg write rate: 0.004 MB/s

system usage: CPU: user: 1.66 s, system: 0.10 s, elapsed: 1.86 s",,,,,,,,"lazy_vacuum_rel, vacuumlazy.c:407",""

2019-02-26 22:51:50.566 CST,,,35632,,5c755284.8b30,2,,2019-02-26 22:51:48 CST,36/23,1263417553,LOG,00000,"automatic analyze of table ""postgres.public.test"" system usage: CPU: user: 0.16 s, system: 0.04 s, elapsed: 0.24 s",,,,,,,,"do_analyze_rel, analyze.c:722",""

index scans:1 表示垃圾回收的表有索引，并且索引只扫描了一次。

说明autovacuum_work_mem足够大，没有出现vacuum时装不下垃圾dead tuple tupleid的情况。

小结

建议：

1、log_autovacuum_min_duration=0，表示记录所有autovacuum的统计信息。

2、autovacuum_vacuum_scale_factor=0.01，表示1%的垃圾时，触发自动垃圾回收。

3、autovacuum_work_mem，视情况定，确保不出现垃圾回收时多次INDEX SCAN.

4、如果发现垃圾回收统计信息中出现了index scans: 超过1的情况，说明：

4.1、需要增加autovacuum_work_mem，增加多少呢？增加到当前autovacuum_work_mem乘以index scans即可。

4.2、或者调低autovacuum_vacuum_scale_factor到当前值除以index scans即可，让autovacuum尽可能早的进行垃圾回收

postgresql ai

文章转载自开源喵，如果涉嫌侵权，请发送邮件至：contact@modb.pro进行举报，并提供相关证据，一经查实，墨天轮将立刻删除相关内容。

PostgreSQL 垃圾回收参数优化 - maintenance_work_mem , autovacuum_work_mem

评论