

另一种方法是,我们可以考虑将全局索引关系基于分区键来分开存储,添加逻辑操作允许全局访问这些分开存储的全局索引关系(具有非分区键的惟一性限制)。在这种方法中,我们保持了原始分区表设计的优点。一个是一个关系的大小限制,因为我们有分开存储的全局索引关系,另一个是很容易保持分离操作的性能。对于附加分区,它将取决于它是一个空表还是一个包含数据甚至索引的表。
该方法的主要思想是通过消除必须包含分区键的限制来使用现有功能,并添加逻辑来处理索引构建期间对非分区键和交叉分区排序的全局唯一性检查。
4.1. 设置分区
首先,使用 pgbench 创建 12 个分区并加载规模为 1000 的数据。
$ pgbench -i -s 1000 --partitions=12 --partition-method=range -d postgres
4.2. 设置 全局 索引
其次,创建不带分区键(aid)限制的 全局 索引,但对非分区键(bid)进行全局惟一性检查。
postgres=# create unique index gidx on pgbench_accounts using btree(bid) global;
Here is how the schema looks like after a global index has been created.
postgres=# \d+ pgbench_accounts
Partitioned table "public.pgbench_accounts"
Column | Type | Collation | Nullable | Default | Storage | Compression | Stats target | Description
----------+---------------+-----------+----------+---------+----------+-------------+--------------+-------------
aid | integer | | not null | | plain | | |
bid | integer | | | | plain | | |
abalance | integer | | | | plain | | |
filler | character(84) | | | | extended | | |
Partition key: RANGE (aid)
Indexes:
"pgbench_accounts_pkey" PRIMARY KEY, btree (aid)
"gidx" UNIQUE, btree (bid)
Partitions: pgbench_accounts_1 FOR VALUES FROM (MINVALUE) TO (8333335),
pgbench_accounts_10 FOR VALUES FROM (75000007) TO (83333341),
pgbench_accounts_11 FOR VALUES FROM (83333341) TO (91666675),
pgbench_accounts_12 FOR VALUES FROM (91666675) TO (MAXVALUE),
pgbench_accounts_2 FOR VALUES FROM (8333335) TO (16666669),
...
postgres=# select oid, relname, relnamespace, reltype, reloftype, relam, relfilenode, relpages, reltuples, relhasindex, relkind from pg_class where relnamespace=2200 order by oid;
oid | relname | relnamespace | reltype | reloftype | relam | relfilenode | relpages | reltuples | relhasindex | relkind
-------+-----------------------------+--------------+---------+-----------+-------+-------------+----------+-----------+-------------+---------
16690 | gidx | 2200 | 0 | 0 | 403 | 0 | 0 | 0 | f | I
16691 | pgbench_accounts_1_bid_idx | 2200 | 0 | 0 | 403 | 16691 | 22852 | 8.333334e+06 | f | g
16692 | pgbench_accounts_2_bid_idx | 2200 | 0 | 0 | 403 | 16692 | 22852 | 8.333334e+06 | f | g
16693 | pgbench_accounts_3_bid_idx | 2200 | 0 | 0 | 403 | 16693 | 22852 | 8.333334e+06 | f | g
...
现在,让我们使用非分区键(bid)运行一个简单的查询来对比性能
postgres=# select * from pgbench_accounts where bid=75000007;
aid | bid | abalance | filler
----------+----------+----------+--------------------------------------------------------------------------------------
75000007 | 75000007 | 0 |
(1 row)
Time: 2.243 ms
postgres=# explain select * from pgbench_accounts where bid=75000007;
QUERY PLAN
--------------------------------------------------------------------------------------------------------------
Append (cost=0.43..101.46 rows=12 width=97)
-> Index Scan using pgbench_accounts_1_bid_idx on pgbench_accounts_1 (cost=0.43..8.45 rows=1 width=97)
Index Cond: (bid = 75000007)
-> Index Scan using pgbench_accounts_2_bid_idx on pgbench_accounts_2 (cost=0.43..8.45 rows=1 width=97)
Index Cond: (bid = 75000007)
...
4.4. 不使用索引查询
然后,让我们删除这个全局索引,并再次运行相同的查询,
postgres=# drop index gidx;
DROP INDEX
postgres=# select * from pgbench_accounts where bid=75000007;
aid | bid | abalance | filler
----------+----------+----------+--------------------------------------------------------------------------------------
75000007 | 75000007 | 0 |
(1 row)
Time: 8345.590 ms (00:08.346)
postgres=# explain select * from pgbench_accounts where bid=75000007;
QUERY PLAN
----------------------------------------------------------------------------------------------
Gather (cost=1000.00..2161189.59 rows=12 width=97)
Workers Planned: 2
-> Parallel Append (cost=0.00..2160188.39 rows=12 width=97)
-> Parallel Seq Scan on pgbench_accounts_1 (cost=0.00..180015.78 rows=1 width=97)
Filter: (bid = 75000007)
-> Parallel Seq Scan on pgbench_accounts_2 (cost=0.00..180015.78 rows=1 width=97)
Filter: (bid = 75000007)
...
4.5. 使用具有分区键限制索引的查询
现在,让我们用分区键(aid)构建一个索引,
postgres=# create unique index lidx on pgbench_accounts using btree(aid, bid);
CREATE INDEX
postgres=# select * from pgbench_accounts where bid=75000007;
aid | bid | abalance | filler
----------+----------+----------+--------------------------------------------------------------------------------------
75000007 | 75000007 | 0 |
(1 row)
Time: 3312.177 ms (00:03.312)
postgres=# explain select * from pgbench_accounts where bid=75000007;
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------
Append (cost=0.43..1846949.37 rows=12 width=97)
-> Index Scan using pgbench_accounts_1_aid_bid_idx on pgbench_accounts_1 (cost=0.43..153912.45 rows=1 width=97)
Index Cond: (bid = 75000007)
-> Index Scan using pgbench_accounts_2_aid_bid_idx on pgbench_accounts_2 (cost=0.43..153912.45 rows=1 width=97)
Index Cond: (bid = 75000007)
4.6. 非分区键的唯一性
下面的例子试图说明,如果不对全局索引提供的非分区键进行惟一性检查,则重复的 bid 记录的插入无法被检测到。
postgres=# insert into pgbench_accounts values(100000001, 75000007, 0, '');
INSERT 0 1
postgres=# select * from pgbench_accounts where bid=75000007;
aid | bid | abalance | filler
-----------+----------+----------+--------------------------------------------------------------------------------------
75000007 | 75000007 | 0 |
100000001 | 75000007 | 0 |
(2 rows)
postgres=# insert into pgbench_accounts values(100000001, 75000007, 0, ”);
ERROR: duplicate key value violates unique constraint “pgbench_accounts_10_bid_idx”
DETAIL: Key (bid)=(75000007) already exists.


文章转载自开源软件联盟PostgreSQL分会,如果涉嫌侵权,请发送邮件至:contact@modb.pro进行举报,并提供相关证据,一经查实,墨天轮将立刻删除相关内容。




