暂无图片
暂无图片
暂无图片
暂无图片
暂无图片

PostgreSQL 12 - btree index 增强(duplicate key sort by ctid)

digoal 2019-11-28
609

作者

digoal

日期

2019-11-28

标签

PostgreSQL , btree index , sort by ctid , duplicate key


背景

Improve performance and space utilization of btree indexes with many duplicates (Peter Geoghegan, Heikki Linnakangas)

Previously, duplicate index entries were stored unordered within their duplicate groups. This caused overhead during index inserts, wasted space due to excessive page splits, and it reduced VACUUM's ability to recycle entire pages. Duplicate index entries are now sorted in heap-storage order.

Indexes pg_upgrade'd from previous releases will not have these benefits.

PG 12 btree索引存储结构增强:

1、duplicate key值按ctid排序存储,减少索引页split。原来是按duplicate group无序存储。

2、vacuum时,先scan table,存储垃圾tuple ctid,存满vacuum memory(maintenance mem)后,接下来scan index一遍,回收垃圾index tuple。所以如果按ctid有序的情况下,如果回收的是整个duplicate keys,vacuum产生的wal更少(涉及的index page更少))。

举证

pg 11

```
postgres=# create table test_bt (id int, c1 int);
CREATE TABLE
postgres=# create index idx_test_bt_1 on test_bt (c1);
CREATE INDEX
postgres=# insert into test_bt select generate_series(1,10000000), random()*10000;
INSERT 0 10000000
Time: 44992.592 ms (00:44.993)
postgres=# \di+ idx_test_bt_1 ;
List of relations
Schema | Name | Type | Owner | Table | Size | Description
--------+---------------+-------+----------+---------+--------+-------------
public | idx_test_bt_1 | index | postgres | test_bt | 281 MB |
(1 row)

postgres=# explain (analyze,verbose,timing,costs,buffers) select * from test_bt where c1=2;
QUERY PLAN


Index Scan using idx_test_bt_1 on public.test_bt (cost=0.38..849.01 rows=50000 width=8) (actual time=0.034..2.011 rows=1062 loops=1)
Output: id, c1
Index Cond: (test_bt.c1 = 2)
Buffers: shared hit=1059
Planning Time: 0.132 ms
Execution Time: 2.083 ms
(6 rows)
```

pg 12

```
postgres=# create table test_bt (id int, c1 int);
CREATE TABLE
postgres=# create index idx_test_bt_1 on test_bt (c1);
CREATE INDEX
postgres=# insert into test_bt select generate_series(1,10000000), random()*10000;
INSERT 0 10000000
Time: 36449.404 ms (00:36.449)

postgres=# \di+ idx_test_bt_1
List of relations
Schema | Name | Type | Owner | Table | Size | Description
--------+---------------+-------+----------+---------+--------+-------------
public | idx_test_bt_1 | index | postgres | test_bt | 235 MB |
(1 row)

postgres=# explain (analyze,verbose,timing,costs,buffers) select * from test_bt where c1=2;
QUERY PLAN


Index Scan using idx_test_bt_1 on public.test_bt (cost=0.43..39395.43 rows=50000 width=8) (actual time=0.033..1.141 rows=1002 loops=1)
Output: id, c1
Index Cond: (test_bt.c1 = 2)
Buffers: shared hit=998
Planning Time: 0.074 ms
Execution Time: 1.226 ms
(6 rows)
```

参考

https://www.postgresql.org/docs/12/release-12.html

https://www.postgresql.org/about/featurematrix/

PostgreSQL 许愿链接

您的愿望将传达给PG kernel hacker、数据库厂商等, 帮助提高数据库产品质量和功能, 说不定下一个PG版本就有您提出的功能点. 针对非常好的提议,奖励限量版PG文化衫、纪念品、贴纸、PG热门书籍等,奖品丰富,快来许愿。开不开森.

9.9元购买3个月阿里云RDS PostgreSQL实例

PostgreSQL 解决方案集合

德哥 / digoal's github - 公益是一辈子的事.

digoal's wechat

文章转载自digoal,如果涉嫌侵权,请发送邮件至:contact@modb.pro进行举报,并提供相关证据,一经查实,墨天轮将立刻删除相关内容。

评论