前言
许久之前,写了三篇关于 vacuum 内核原理的文章
今天在上课的过程中,有学员问我,TOAST 表是如何清理的?其实 vacuum 不仅会清理常规的表,还会清理物化视图以及 TOAST。这一期,让我们深入 TOAST 的清理机制,查漏补缺。
何如
简单温顾下 TOAST,PostgreSQL 不允许一行数据跨页存储,那么对于超长的行数据就会启动 TOAST,将大的字段压缩或切片成多个物理行存到另一张系统表中。那么如果产生的死元组不涉及到 TOAST,PostgreSQL 是如何清理的?涉及到的话又怎样?
让我们简单试验一下:
postgres=# create table toast_demo ( id int primary key, content text );
CREATE TABLE
postgres=# insert into toast_demo values (1, repeat('x',10000));
INSERT 0 1
postgres=# select * from heap_page_items(get_raw_page('toast_demo', 0));^C
postgres=# SELECT oid::regclass,
reltoastrelid::regclass,
pg_relation_size(reltoastrelid) AS toast_size
FROM pg_class
WHERE relkind = 'r'
AND reltoastrelid <> 0 and relname = 'toast_demo' order by 3 desc;
oid | reltoastrelid | toast_size
------------+-------------------------+------------
toast_demo | pg_toast.pg_toast_33697 | 0
(1 row)
此时还没有触发 TOAST,让我们插入一个更长点的值:
postgres=# with dummy_string as
( select string_agg (md5(random()::text),'') as dummy
from generate_series(1,5000) )
insert into toast_demo
select 2
, dummy_string.dummy
from dummy_string;
INSERT 0 1
postgres=# SELECT oid::regclass,
reltoastrelid::regclass,
pg_relation_size(reltoastrelid) AS toast_size
FROM pg_class
WHERE relkind = 'r'
AND reltoastrelid <> 0 and relname = 'toast_demo' order by 3 desc;
oid | reltoastrelid | toast_size
------------+-------------------------+------------
toast_demo | pg_toast.pg_toast_33697 | 172032
(1 row)
现在让我们在不更新 content 字段的前提下,更新一下数据
postgres=# update toast_demo set id = 3 where id = 2;
UPDATE 1
postgres=# select t_xmin,t_xmax from heap_page_items(get_raw_page('toast_demo', 0));
t_xmin | t_xmax
--------+--------
851536 | 0
851537 | 851538
851538 | 0
(3 rows)
postgres=# select t_xmin,t_xmax from heap_page_items(get_raw_page('pg_toast.pg_toast_33697', 0));
t_xmin | t_xmax
--------+--------
851537 | 0
851537 | 0
851537 | 0
851537 | 0
(4 rows)
可以很明显的看到,如果更新的字段不涉及 TOAST 字段,那么是不会去更新 TOAST 表的,其实这和 TOAST 的设计理念相同:
如果数据存在 TOAST 表中,则先调用函数 toast_fetch_datum 从 TOAST 表中获取该数据的片段来重组数据。如果是经过压缩的还需要先解压再返回数据 如果数据没有行外存储但是经过压缩的,则解压后返回数据 如果需要的数据不需要访问 TOAST,则直接返回
所以 PostgreSQL 将二者进行了分开,你可以选择直接对 TOAST 表进行 vacuum,vacuum full之类的操作。
postgres=# update toast_demo set content = 'hello' where id = 3;
UPDATE 1
postgres=# select t_xmin,t_xmax from heap_page_items(get_raw_page('pg_toast.pg_toast_33697', 0));
t_xmin | t_xmax
--------+--------
851537 | 851539
851537 | 851539
851537 | 851539
851537 | 851539
(4 rows)
postgres=# select t_xmin,t_xmax from heap_page_items(get_raw_page('toast_demo', 0));
t_xmin | t_xmax
--------+--------
851536 | 0
851537 | 851538
851538 | 851539
851539 | 0
(4 rows)
postgres=# vacuum pg_toast.pg_toast_33697;
VACUUM
postgres=# select t_xmin,t_xmax from heap_page_items(get_raw_page('pg_toast.pg_toast_33697', 0));
ERROR: block number 0 is out of range for relation "pg_toast_33697"
postgres=# select t_xmin,t_xmax from heap_page_items(get_raw_page('toast_demo', 0));
t_xmin | t_xmax
--------+--------
851536 | 0
851537 | 851538 ---主表死元组还在
851538 | 851539 ---主表死元组还在
851539 | 0
(4 rows)
可以看到,对 TOAST 表进行了 vacuum 之后,主表中死元组数据还在。
postgres=# vacuum toast_demo;
VACUUM
postgres=# select t_xmin,t_xmax from heap_page_items(get_raw_page('toast_demo', 0));
t_xmin | t_xmax
--------+--------
851536 | 0
|
|
851539 | 0
(4 rows)
不过需要注意的是,如果 vacuum 主表,TOAST 中也存在死元组的话,也会自动对 TOAST 进行清理
postgres=# update toast_demo set id = 3,content = 'hello' where id = 2;
UPDATE 1
postgres=# select t_xmin,t_xmax from heap_page_items(get_raw_page('toast_demo', 0));
t_xmin | t_xmax
--------+--------
851543 | 0
851544 | 851545
851545 | 0
(3 rows)
postgres=# select t_xmin,t_xmax from heap_page_items(get_raw_page('pg_toast.pg_toast_33705', 0));
t_xmin | t_xmax
--------+--------
851544 | 851545
851544 | 851545
851544 | 851545
851544 | 851545
(4 rows)
postgres=# vacuum toast_demo ;
VACUUM
postgres=# select t_xmin,t_xmax from heap_page_items(get_raw_page('pg_toast.pg_toast_33705', 0));
ERROR: block number 0 is out of range for relation "pg_toast_33705"
postgres=# select t_xmin,t_xmax from heap_page_items(get_raw_page('toast_demo', 0));
t_xmin | t_xmax
--------+--------
851543 | 0
|
851545 | 0
(3 rows)
基于此,在 14 之后,引入了一个优化
This option controls if toast tables associated with a relation are vacuumed or not when running a manual VACUUM. It was already possible to trigger a manual VACUUM on a toast relation without processing its main relation, but a manual vacuum on a main relation always forced a vacuum on its toast table. This is useful in scenarios where the level of bloat or transaction age of the main and toast relations differs a lot.
该选项控制在运行手动 VACUUM 时是否对与关系相关的 TOAST 表进行清理。以前已经可以在不处理主关系的情况下触发对 TOAST 的手动 VACUUM,但对主关系的手动清理总是会强制对 TOAST 表进行清理。这在主关系和 TOAST 的膨胀程度或事务年龄相差很大的情况下非常有用。
当主表和 TOAST 的膨胀程度不同时,或者年龄相差很大,没有必要去扫描 TOAST 的时候,就可以指定 PROCESS_TOAST 为 FALSE:
postgres=# vacuum (PROCESS_TOAST FALSE) toast_demo ;
VACUUM
postgres=# select t_xmin,t_xmax from heap_page_items(get_raw_page('pg_toast.pg_toast_33729', 0));
t_xmin | t_xmax
--------+--------
851563 | 851565 ---TOAST并未处理
851563 | 851565
851563 | 851565
851563 | 851565
(4 rows)
postgres=# select t_xmin,t_xmax from heap_page_items(get_raw_page('toast_demo', 0));
t_xmin | t_xmax
--------+--------
851562 | 0
|
851565 | 0
(3 rows)
autovacuum如何处理
前面分析了手工 vacuum,那么 autovacuum 是如何工作的?按照以往现象,autovacuum 肯定是会去处理 TOAST 的 (默认情况下),不然早就炸锅了。
简单走读了一下代码流程,在 autovacuum 的工作流程中,其会遍历两次 pg_class,构建要扫描的对象列表——表、物化视图和 TOAST (pg_class.relkind),这个期间会跳过临时表,同时还会参考表级参数 (pg_class.reloptions),比如关闭了表级自动清理。
/*
* Scan pg_class to determine which tables to vacuum.
*
* We do this in two passes: on the first one we collect the list of plain
* relations and materialized views, and on the second one we collect
* TOAST tables. The reason for doing the second pass is that during it we
* want to use the main relation's pg_class.reloptions entry if the TOAST
* table does not have any, and we cannot obtain it unless we know
* beforehand what's the main table OID.
*
* We need to check TOAST tables separately because in cases with short,
* wide tables there might be proportionally much more activity in the
* TOAST table than in its parent.
Get list of heap tables & materialized views that need vacuum or analyze,第一遍构建普通表和物化视图,同时扫描过程中,如果发现pg_class.reltoastrelid,那么记录他们的对应关系,其次根据表级参数,判断是否需要进行 vacuum 和 analyze。
if (classForm->relkind != RELKIND_RELATION &&
classForm->relkind != RELKIND_MATVIEW)
continue;
...
/* Fetch reloptions and the pgstat entry for this table */
relopts = extract_autovac_opts(tuple, pg_class_desc);
tabentry = pgstat_fetch_stat_tabentry_ext(classForm->relisshared,
relid);
/* Check if it needs vacuum or analyze */
relation_needs_vacanalyze(relid, relopts, classForm, tabentry,
effective_multixact_freeze_max_age,
&dovacuum, &doanalyze, &wraparound);
/* Relations that need work are added to table_oids */
if (dovacuum || doanalyze)
table_oids = lappend_oid(table_oids, relid);
Get list of TOAST tables that need vacuuming,第二遍构建 TOAST,其他原理类似。
两次遍历之后,最终就得到了所有要进行清理的表列表,添加到 table_oids,里面记录了所有要清理的表和 TOAST
/* Relations that need work are added to table_oids */
if (dovacuum || doanalyze)
table_oids = lappend_oid(table_oids, relid);
不过值得注意的是,TOAST 表是不会进行 analyze 的,原因其实也很简单——因为 TOAST 表总是通过索引访问,所以不会进行分析。
/* ignore analyze for toast tables */
if (dovacuum)
table_oids = lappend_oid(table_oids, relid);
简而言之,TOAST 不会自动随父表一起清理,而是单独构建了一个列表,在 autovacuum 的过程中进行清理,当然你可以类似地去配置 toast.autovacuum_enabled = off,关闭自动清理。
小结
在 14 版本之前,普通的 vacuum 总是会自动清理相对应的 TOAST,14 之后,你可以选择 PROCESS_TOAST,跳过对于 TOAST 的处理。 autovacuum,会通过两次扫描,处理 TOAST,类似地,会基于表级参数 toast.autovacuum_enabled
OK,下一期让我们再聊聊物化视图的清理机制,师母已呆。




