PostgreSQL 12 preview - 分区表order by 分区键支持append(ordered scan partition, 避免merge sort)

digoal 2019-04-09

508

作者

digoal

日期

2019-04-09

背景

分区表order by 分区字段时，如果分区是有序的，那么不需要使用merge append （merge sort），避免merge SORT的额外开销。

```
Use Append rather than MergeAppend for scanning ordered partitions.
author Tom Lane tgl@sss.pgh.pa.us
Sat, 6 Apr 2019 07:20:30 +0800 (19:20 -0400)
committer Tom Lane tgl@sss.pgh.pa.us
Sat, 6 Apr 2019 07:20:43 +0800 (19:20 -0400)
commit 959d00e9dbe4cfcf4a63bb655ac2c29a5e579246
tree 4c260f752780d317d1f2ebab8aae553dc4dc8236 tree | snapshot
parent 9f06d79ef831ffa333f908f6d3debdb654292414 commit | diff
Use Append rather than MergeAppend for scanning ordered partitions.

If we need ordered output from a scan of a partitioned table, but
the ordering matches the partition ordering, then we don't need to
use a MergeAppend to combine the pre-ordered per-partition scan
results: a plain Append will produce the same results. This
both saves useless comparison work inside the MergeAppend proper,
and allows us to start returning tuples after istarting up just
the first child node not all of them.

However, all is not peaches and cream, because if some of the
child nodes have high startup costs then there will be big
discontinuities in the tuples-returned-versus-elapsed-time curve.
The planner's cost model cannot handle that (yet, anyway).
If we model the Append's startup cost as being just the first
child's startup cost, we may drastically underestimate the cost
of fetching slightly more tuples than are available from the first
child. Since we've had bad experiences with over-optimistic choices
of "fast start" plans for ORDER BY LIMIT queries, that seems scary.
As a klugy workaround, set the startup cost estimate for an ordered
Append to be the sum of its children's startup costs (as MergeAppend
would). This doesn't really describe reality, but it's less likely
to cause a bad plan choice than an underestimated startup cost would.
In practice, the cases where we really care about this optimization
will have child plans that are IndexScans with zero startup cost,
so that the overly conservative estimate is still just zero.

David Rowley, reviewed by Julien Rouhaud and Antonin Houska

Discussion: https://postgr.es/m/CAKJS1f-hAqhPLRk_RaSFTgYxd=Tz5hA7kQ2h4-DhJufQk8TGuw@mail.gmail.com
```

例子

```
CREATE TABLE cities_ab (
id int primary key,
info text,
crt_time timestamp
) PARTITION BY RANGE (id);

CREATE TABLE cities_ab_1
PARTITION OF cities_ab FOR VALUES FROM (1) TO (100000);

CREATE TABLE cities_ab_2
PARTITION OF cities_ab FOR VALUES FROM (100000) TO (200000);

insert into cities_ab select generate_series(1,199999),md5(random()::text), now();

postgres=# explain select * from cities_ab order by id limit 10;
QUERY PLAN

Limit (cost=0.58..1.10 rows=10 width=45)
-> Append (cost=0.58..10268.57 rows=199999 width=45)
-> Index Scan using cities_ab_1_pkey on cities_ab_1 (cost=0.29..4634.28 rows=99999 width=45)
-> Index Scan using cities_ab_2_pkey on cities_ab_2 (cost=0.29..4634.29 rows=100000 width=45)
(4 rows)
```

PostgreSQL 11，使用了merge append。

```
postgres=# explain select * from cities_ab order by id limit 10;
QUERY PLAN

Limit (cost=0.59..0.97 rows=10 width=44)
-> Merge Append (cost=0.59..7947.55 rows=211310 width=44)
Sort Key: cities_ab_1.id
-> Index Scan using cities_ab_1_pkey on cities_ab_1 (cost=0.29..2917.22 rows=105655 width=44)
-> Index Scan using cities_ab_2_pkey on cities_ab_2 (cost=0.29..2917.22 rows=105655 width=44)
(5 rows)
```

参考

https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=959d00e9dbe4cfcf4a63bb655ac2c29a5e579246

文章转载自digoal，如果涉嫌侵权，请发送邮件至：contact@modb.pro进行举报，并提供相关证据，一经查实，墨天轮将立刻删除相关内容。

PostgreSQL 12 preview - 分区表order by 分区键支持append(ordered scan partition, 避免merge sort)

作者

日期

标签

背景

例子

参考

PostgreSQL 许愿链接

9.9元购买3个月阿里云RDS PostgreSQL实例

PostgreSQL 解决方案集合

德哥 / digoal's github - 公益是一辈子的事.

评论