作者
digoal
日期
2019-11-28
标签
PostgreSQL , btree index , sort by ctid , duplicate key
背景
Improve performance and space utilization of btree indexes with many duplicates (Peter Geoghegan, Heikki Linnakangas)
Previously, duplicate index entries were stored unordered within their duplicate groups. This caused overhead during index inserts, wasted space due to excessive page splits, and it reduced VACUUM's ability to recycle entire pages. Duplicate index entries are now sorted in heap-storage order.
Indexes pg_upgrade'd from previous releases will not have these benefits.
PG 12 btree索引存储结构增强:
1、duplicate key值按ctid排序存储,减少索引页split。原来是按duplicate group无序存储。
2、vacuum时,先scan table,存储垃圾tuple ctid,存满vacuum memory(maintenance mem)后,接下来scan index一遍,回收垃圾index tuple。所以如果按ctid有序的情况下,如果回收的是整个duplicate keys,vacuum产生的wal更少(涉及的index page更少))。
举证
pg 11
```
postgres=# create table test_bt (id int, c1 int);
CREATE TABLE
postgres=# create index idx_test_bt_1 on test_bt (c1);
CREATE INDEX
postgres=# insert into test_bt select generate_series(1,10000000), random()*10000;
INSERT 0 10000000
Time: 44992.592 ms (00:44.993)
postgres=# \di+ idx_test_bt_1 ;
List of relations
Schema | Name | Type | Owner | Table | Size | Description
--------+---------------+-------+----------+---------+--------+-------------
public | idx_test_bt_1 | index | postgres | test_bt | 281 MB |
(1 row)
postgres=# explain (analyze,verbose,timing,costs,buffers) select * from test_bt where c1=2;
QUERY PLAN
Index Scan using idx_test_bt_1 on public.test_bt (cost=0.38..849.01 rows=50000 width=8) (actual time=0.034..2.011 rows=1062 loops=1)
Output: id, c1
Index Cond: (test_bt.c1 = 2)
Buffers: shared hit=1059
Planning Time: 0.132 ms
Execution Time: 2.083 ms
(6 rows)
```
pg 12
```
postgres=# create table test_bt (id int, c1 int);
CREATE TABLE
postgres=# create index idx_test_bt_1 on test_bt (c1);
CREATE INDEX
postgres=# insert into test_bt select generate_series(1,10000000), random()*10000;
INSERT 0 10000000
Time: 36449.404 ms (00:36.449)
postgres=# \di+ idx_test_bt_1
List of relations
Schema | Name | Type | Owner | Table | Size | Description
--------+---------------+-------+----------+---------+--------+-------------
public | idx_test_bt_1 | index | postgres | test_bt | 235 MB |
(1 row)
postgres=# explain (analyze,verbose,timing,costs,buffers) select * from test_bt where c1=2;
QUERY PLAN
Index Scan using idx_test_bt_1 on public.test_bt (cost=0.43..39395.43 rows=50000 width=8) (actual time=0.033..1.141 rows=1002 loops=1)
Output: id, c1
Index Cond: (test_bt.c1 = 2)
Buffers: shared hit=998
Planning Time: 0.074 ms
Execution Time: 1.226 ms
(6 rows)
```
参考
https://www.postgresql.org/docs/12/release-12.html
https://www.postgresql.org/about/featurematrix/
PostgreSQL 许愿链接
您的愿望将传达给PG kernel hacker、数据库厂商等, 帮助提高数据库产品质量和功能, 说不定下一个PG版本就有您提出的功能点. 针对非常好的提议,奖励限量版PG文化衫、纪念品、贴纸、PG热门书籍等,奖品丰富,快来许愿。开不开森.
9.9元购买3个月阿里云RDS PostgreSQL实例
PostgreSQL 解决方案集合
德哥 / digoal's github - 公益是一辈子的事.





