PostgreSQL select x from tbl where c1 in () order by c2; 优化 - merge sort , top-N heapsort , quick sort

digoal 2020-09-28

935

作者

digoal

日期

2020-09-28

背景

按A字段范围条件过滤, 按B字段排序返回. 由于索引组织的原因, 无法直接通过索引过滤并有序返回.

《PostgreSQL 优化case - where A字段范围 order by B字段排序 limit x》

但是PG支持merge sort, top-N heapsort, quicksort等技术. 可以较好的优化.

实际上在这两个并行计算的场景中, 也有类似情况:

1、greenplum 多节点并行计算, 返回结果需要排序, 实际上也是对每个计算单元已经排序的结果后的再排序, 可以采用merge sort输出.

2、postgresql 自从9.6开始也支持了内置并行计算, 当多个worker返回了有序的结果时, 基于有序结果再汇总sort输出, 也采用了类似greenplum的merge sort返回. - parallel append, merge sort.

例子

PostgreSQL 12为例:

10亿记录, 30万个分组, 选中N个分组, 并按score排序返回.

```
create table a (id int8 primary key, gid int, score int, info text);
insert into a select generate_series(1,1000000000), random()300000, random()10000, md5(random()::text);
create index idx_a_1 on a (gid,score); -- 当前版本, 可能 gid单列反而更好. 目前PG的版本优化器可能没有用到score的有序.

show work_mem;

select * from a where gid in (1,2,3,4) order by score limit 100;

select * from a where gid in (1,2,3,4) order by score;
```

耗时与执行计划如下, 从执行计划可以看出PG已经非常智能, 根据排序集合大小, 返回记录数, 根据参数自动选择了最有的排序算法.

```
postgres=> explain analyze select * from a where gid in (1,2,3,4) order by score limit 100;
QUERY PLAN

Limit (cost=15949.49..15949.74 rows=100 width=49) (actual time=12.484..12.507 rows=100 loops=1)
-> Sort (cost=15949.49..15983.91 rows=13769 width=49) (actual time=12.482..12.491 rows=100 loops=1)
Sort Key: score
Sort Method: top-N heapsort Memory: 39kB
-> Index Scan using idx_a_1 on a (cost=0.57..15423.24 rows=13769 width=49) (actual time=0.025..10.773 rows=13242 loops=1)
Index Cond: (gid = ANY ('{1,2,3,4}'::integer[]))
Planning Time: 0.083 ms
Execution Time: 12.543 ms
(8 rows)

postgres=> explain analyze select * from a where gid in (1,2,3,4) order by score;
QUERY PLAN

Sort (cost=16369.80..16404.23 rows=13769 width=49) (actual time=14.017..15.192 rows=13242 loops=1)
Sort Key: score
Sort Method: quicksort Memory: 2247kB
-> Index Scan using idx_a_1 on a (cost=0.57..15423.24 rows=13769 width=49) (actual time=0.017..10.342 rows=13242 loops=1)
Index Cond: (gid = ANY ('{1,2,3,4}'::integer[]))
Planning Time: 0.059 ms
Execution Time: 16.318 ms
(7 rows)
```

采用union all也可以消除score显示排序. 但是子句必须排序.

```
set enable_sort =on;

select * from(
select * from (select * from a where gid=1 order by score) t -- 子句必须order by, 否则不能消除sort
union all
select * from (select * from a where gid=2 order by score) t
union all
select * from (select * from a where gid=3 order by score) t
union all
select * from (select * from a where gid=4 order by score) t
) t
order by score;

postgres=> explain analyze select * from(
select * from (select * from a where gid=1 order by score) t
union all
select * from (select * from a where gid=2 order by score) t
union all
select * from (select * from a where gid=3 order by score) t
union all
select * from (select * from a where gid=4 order by score) t
) t
order by score;
QUERY PLAN

Gather Merge (cost=9076.08..10652.28 rows=13320 width=49) (actual time=25.432..29.837 rows=13242 loops=1)
Workers Planned: 3
Workers Launched: 0
-> Sort (cost=8076.04..8087.14 rows=4440 width=49) (actual time=22.284..23.889 rows=13242 loops=1)
Sort Key: a.score
Sort Method: quicksort Memory: 2247kB
-> Parallel Append (cost=0.57..7807.06 rows=4440 width=49) (actual time=0.035..17.015 rows=13242 loops=1)
-> Index Scan using idx_a_1 on a (cost=0.57..3858.01 rows=3442 width=49) (actual time=0.014..3.633 rows=3310 loops=1)
Index Cond: (gid = 1)
-> Index Scan using idx_a_1 on a a_1 (cost=0.57..3858.01 rows=3442 width=49) (actual time=0.014..3.613 rows=3372 loops=1)
Index Cond: (gid = 2)
-> Index Scan using idx_a_1 on a a_2 (cost=0.57..3858.01 rows=3442 width=49) (actual time=0.012..3.572 rows=3304 loops=1)
Index Cond: (gid = 3)
-> Index Scan using idx_a_1 on a a_3 (cost=0.57..3858.01 rows=3442 width=49) (actual time=0.034..3.711 rows=3256 loops=1)
Index Cond: (gid = 4)
Planning Time: 0.306 ms
Execution Time: 31.508 ms
(17 rows)
```

子句未使用sort时, 则不支持消除sort.

```
explain analyze select * from(
select * from a where gid=1
union all
select * from a where gid=2
union all
select * from a where gid=3
union all
select * from a where gid=4
) t
order by score;

postgres=> explain analyze select * from(
postgres(> select * from a where gid=1
postgres(> union all
postgres(> select * from a where gid=2
postgres(> union all
postgres(> select * from a where gid=3
postgres(> union all
postgres(> select * from a where gid=4
postgres(> ) t
postgres-> order by score;
QUERY PLAN

Sort (cost=16585.04..16619.46 rows=13768 width=49) (actual time=15.803..16.915 rows=13242 loops=1)
Sort Key: a.score
Sort Method: quicksort Memory: 2247kB
-> Append (cost=0.57..15638.55 rows=13768 width=49) (actual time=0.027..12.116 rows=13242 loops=1)
-> Index Scan using idx_a_1 on a (cost=0.57..3858.01 rows=3442 width=49) (actual time=0.026..2.705 rows=3310 loops=1)
Index Cond: (gid = 1)
-> Index Scan using idx_a_1 on a a_1 (cost=0.57..3858.01 rows=3442 width=49) (actual time=0.008..2.591 rows=3372 loops=1)
Index Cond: (gid = 2)
-> Index Scan using idx_a_1 on a a_2 (cost=0.57..3858.01 rows=3442 width=49) (actual time=0.007..2.504 rows=3304 loops=1)
Index Cond: (gid = 3)
-> Index Scan using idx_a_1 on a a_3 (cost=0.57..3858.01 rows=3442 width=49) (actual time=0.007..2.447 rows=3256 loops=1)
Index Cond: (gid = 4)
Planning Time: 0.189 ms
Execution Time: 18.055 ms
(14 rows)
```

如果返回的结果特别大, 只需要求TOP N, 或者希望尽快有返回, 可以采用游标返回, 响应速度特别快.

``` postgres=> \set FETCH_COUNT 10

create index idx_a_2 on a (mod(gid,10),score);

postgres=> explain select * from ( postgres(> select * from (select * from c where mod(gid,10)=1 order by score) t postgres(> union all postgres(> select * from (select * from c where mod(gid,10)=2 order by score) t postgres(> ) t order by score; QUERY PLAN

Merge Append (cost=1.16..9503278.56 rows=10000000 width=49) Sort Key: c.score -> Index Scan using idx_c_1 on c (cost=0.57..4651639.28 rows=5000000 width=49) Index Cond: (mod(gid, 10) = 1) -> Index Scan using idx_c_1 on c c_1 (cost=0.57..4651639.28 rows=5000000 width=49) Index Cond: (mod(gid, 10) = 2) (6 rows)

开启并行计算, 依旧可以消除sort

explain select * from ( select * from (select * from c where mod(gid,10)=1 order by score) t union all select * from (select * from c where mod(gid,10)=2 order by score) t ) t order by score; QUERY PLAN

Gather Merge (cost=5348440.28..6320730.23 rows=8333332 width=49) Workers Planned: 2 -> Sort (cost=5347440.26..5357856.92 rows=4166666 width=49) Sort Key: c.score -> Parallel Append (cost=0.57..4722472.61 rows=4166666 width=49) -> Index Scan using idx_c_1 on c (cost=0.57..4651639.28 rows=5000000 width=49) Index Cond: (mod(gid, 10) = 1) -> Index Scan using idx_c_1 on c c_1 (cost=0.57..4651639.28 rows=5000000 width=49) Index Cond: (mod(gid, 10) = 2) (9 rows) ```

如上: 当同时出现merge append、gather merge和sort时, 采用的就是merge sort方法.

work_mem = 4MB

enable_sort = on

enable_incremental_sort = on

enable_parallel_append = on

```

小结

1、PostgreSQL优化器还有继续改进的空间, 使用union all可以间接触发数据库的merge sort功能, 但是必须在每个子句中显示的给定order by.

2、PostgreSQL 13版本引入了incremental sort的功能, 针对在执行步骤中已经排序的结果集, 在进入下一步后, 在某些情况下可以利用之前的有序性, 对未来的结果进行更有效的排序.

```
select * from a order by gid,c1;

如果有index (gid), 那么以上sql就能用incremental sort来排序

EXPLAIN SELECT * FROM tenk1 ORDER BY four, ten LIMIT 100;
QUERY PLAN
------------------------------------------------------------------------------------------------------
Limit (cost=521.06..538.05 rows=100 width=244)
-> Incremental Sort (cost=521.06..2220.95 rows=10000 width=244)
Sort Key: four, ten
Presorted Key: four
-> Index Scan using index_tenk1_on_four on tenk1 (cost=0.29..1510.08 rows=10000 width=244)
```