PostgreSQL17优化器改进(10)物化CTE优化

原创墨竹 All China Database Union 2024-10-26

740

PostgreSQL17优化器改进(10)物化CTE优化

从PostgreSQL17beta1发布到现在，不知不觉已经写了9篇关于优化器相关功能的优化的介绍，这篇应该是PG17优化器改进的最后一篇，是介绍CTE相关优化功能。我们先来了解一下CTE的概念。

CTE是Common Table Expressions的缩写，官方正式描述是WITH提供了一种编写辅助语句以用于大型查询的方法，通俗点的说法是使用with语句定义一个查询存在的临时表。在这里说的物化CTE，其实就是指通过指定MATERIALIZED来强制对WITH查询进行单独的计算，避免将with查询合并到父查询中。

在PG17中关于物化CTE优化的功能有两点

允许优化器访问物化CTE的统计信息
物化CTE优化避免外层排序

一、准备测试用例及数据

本结测试使用的测试用例仍然为第二节课程所创建的测试用例表，如果已经创建测试用例表，则可以跳过本小节。

创建测试表并插入数据。

create table rank1(
id serial primary key,
one integer,
two integer,
three integer,
address text
);

insert into rank1(one,two,three,address) select round(random()*10000),round(random()*2000),round(random()*100000),md5(random()::text) from generate_series(1,1000000) S(i) ;

analyze rank1;

执行analyze命令，收集统计信息

analyze rank1;

二、允许优化器访问物化CTE的统计信息

在这个测试用例中，我们先创建一个CTE，然后在外部查询的in的子查询中使用该CTE查询的值。

EXPLAIN analyze with tmp as MATERIALIZED (select id from rank1)
select * from rank1 where id in (select id from tmp);

PostgreSQL16.3中的执行计划

在PostgreSQL16.3执行测试的SQL语句，执行计划如下：

testdb=# EXPLAIN analyze with tmp as MATERIALIZED (select id from rank1)
testdb-# select * from rank1 where id in (select id from tmp);
                                                           QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------
 Nested Loop  (cost=42810.43..44207.25 rows=500000 width=49) (actual time=628.348..3482.349 rows=1000000 loops=1)
   CTE tmp
     ->  Seq Scan on rank1 rank1_1  (cost=0.00..20310.00 rows=1000000 width=4) (actual time=0.008..91.096 rows=1000000 loops=1)
   ->  HashAggregate  (cost=22500.00..22502.00 rows=200 width=4) (actual time=628.319..1122.693 rows=1000000 loops=1)
         Group Key: tmp.id
         Batches: 21  Memory Usage: 10305kB  Disk Usage: 23512kB
         ->  CTE Scan on tmp  (cost=0.00..20000.00 rows=1000000 width=4) (actual time=0.009..295.690 rows=1000000 loops=1)
   ->  Index Scan using rank1_pkey on rank1  (cost=0.42..8.28 rows=1 width=49) (actual time=0.002..0.002 rows=1 loops=1000000)
         Index Cond: (id = tmp.id)
 Planning Time: 0.209 ms
 Execution Time: 3511.799 ms
(11 rows)

Time: 3512.665 ms (00:03.513)

在上述的执行计划中，HashAggregate节点之前通过id对CTE进行分组，但是rank1表的主键是id列，因此分组后的行数应该是100万，而不是估计的200行，由于行数估算的不准确导致CTE和外部查询的连接方式选择了错误的连接方式即嵌套循环连接。

PostgreSQL17.0中的执行计划

在PostgreSQL17执行测试的SQL语句，执行计划如下：

testdb=# EXPLAIN analyze with tmp as MATERIALIZED (select id from rank1)
testdb-# select * from rank1 where id in (select id from tmp);
                                                           QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------
 Hash Semi Join  (cost=56717.00..114216.00 rows=1000000 width=49) (actual time=512.994..1187.660 rows=1000000 loops=1)
   Hash Cond: (rank1.id = tmp.id)
   CTE tmp
     ->  Seq Scan on rank1 rank1_1  (cost=0.00..20310.00 rows=1000000 width=4) (actual time=0.035..201.667 rows=1000000 loops=1)
   ->  Seq Scan on rank1  (cost=0.00..20310.00 rows=1000000 width=49) (actual time=0.022..67.460 rows=1000000 loops=1)
   ->  Hash  (cost=20000.00..20000.00 rows=1000000 width=4) (actual time=512.044..512.046 rows=1000000 loops=1)
         Buckets: 262144  Batches: 8  Memory Usage: 6446kB
         ->  CTE Scan on tmp  (cost=0.00..20000.00 rows=1000000 width=4) (actual time=0.039..373.081 rows=1000000 loops=1)
 Planning Time: 0.909 ms
 Execution Time: 1213.739 ms
(10 rows)

Time: 1216.521 ms (00:01.217)

我们再来详细的看看PostgreSQL17的执行计划，由于在PG17中优化器可以访问物化CTE的的列的一些统计信息即CTE结果的估算宽度和行数等信息，这就导致优化器对于rows估算更精确，从而选择了正确的连接方式，从以前的Nested Loop 调整为了现在的Hash Semi Join.

三、物化CTE优化避免外层排序

创建索引，如果已经创建索引则跳过。

create index idx_rank1_one on rank1(one);

在这里，场景有点特殊，在with子句内部我们需要对数据进行一次排序，在父查询中再一次指定了排序键，两次的排序方式和排序键需要一致。

explain analyze WITH tmp AS MATERIALIZED (
    SELECT * FROM rank1 ORDER BY one
)
SELECT * FROM tmp ORDER BY one;

PostgreSQL16.3中的执行计划

在PostgreSQL16.3执行测试的SQL语句，执行计划如下：

testdb=# explain analyze WITH tmp AS MATERIALIZED (
testdb(#     SELECT * FROM rank1 ORDER BY one
testdb(# )
testdb-# SELECT * FROM tmp ORDER BY one;
                                                                   QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------------
 Sort  (cost=240908.16..243408.16 rows=1000000 width=48) (actual time=976.728..1039.110 rows=1000000 loops=1)
   Sort Key: tmp.one
   Sort Method: external merge  Disk: 57744kB
   CTE tmp
     ->  Index Scan using idx_rank1_one on rank1  (cost=0.42..59720.32 rows=1000000 width=49) (actual time=0.017..610.805 rows=1000000 loops=1)
   ->  CTE Scan on tmp  (cost=0.00..20000.00 rows=1000000 width=48) (actual time=0.018..781.014 rows=1000000 loops=1)
 Planning Time: 0.074 ms
 Execution Time: 1071.435 ms
(8 rows)

Time: 1072.262 ms (00:01.072)

通过上述的执行计划可以看到，CTE中使用idx_rank1_one索引对rank1表访问，其实这就意味着该步骤的输出已经有序了，在上述的执行计划中，我们看到还有一个sort节点，很显然优化器不知道CTE的输出是已经是有序的，这就导致了父查询重新进行了一次排序，造成了资源的浪费。

PostgreSQL17.0中的执行计划

在PostgreSQL17执行测试的SQL语句，执行计划如下：

testdb=# explain analyze WITH tmp AS MATERIALIZED (
testdb(#     SELECT * FROM rank1 ORDER BY one
testdb(# )
testdb-# SELECT * FROM tmp ORDER BY one;
                                                                   QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------------
 CTE Scan on tmp  (cost=59724.41..79724.41 rows=1000000 width=48) (actual time=0.059..773.909 rows=1000000 loops=1)
   CTE tmp
     ->  Index Scan using idx_rank1_one on rank1  (cost=0.42..59724.41 rows=1000000 width=49) (actual time=0.058..576.085 rows=1000000 loops=1)
 Planning Time: 0.065 ms
 Execution Time: 805.694 ms
(5 rows)

Time: 806.207 ms

我们再来看一下PG17中的执行计划，在CTE中使用idx_rank1_one索引对rank1表访问，在这里的输出当然也是有序的。但是PG16中的sort节点已经没有了，显然是对该场景进行了优化，避免一些不要的排序。我想再次需要强调一下这个场景，就是物化CTE的排序列和外部的排序列及排序方式是一致或存在前缀，才可以避免排序。这个场景的测试会在视频中进行延展说明。

总结

在PG17中对于物化CTE功能的两点优化，一个是允许优化器访问物化CTE的统计信息，由于获取到了更加精确的统计信息，因此就避免优化器选择一些错误的执行计划；另外一个是物化CTE避免外层排序，减少了资源的浪费，提升了查询性能。

参考

https://postgrespro.com/blog/pgsql/5970508#commit_f7816aec

https://postgrespro.com/blog/pgsql/5971224

https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=a65724dfa

https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=f7816aec2
– / END / –

可以通过下面的方式联系我

微信公众号：@墨竹札记
墨天轮：@墨竹
微信：wshf395062788
PGFans：@墨竹

如果这篇文章为你带来了灵感或启发，就请帮忙点赞、收藏、转发；如果文章中不严谨或者错漏之处，请及时评论指正。非常感谢！

最后修改时间：2025-02-06 10:50:15

「喜欢这篇文章，您的关注和赞赏是给作者最好的鼓励」

关注作者

文章被以下合辑收录

PostgreSQL17新功能（共17篇）

本合集主要介绍PostgreSQL17新功能改进思路、执行计划的改变及新功能如何使用。