暂无图片
暂无图片
暂无图片
暂无图片
暂无图片

PostgreSQL 多字段任意组合搜索的性能

digoal 2017-11-02
297

作者

digoal

日期

2017-11-02

标签

PostgreSQL , 多字段搜索 , 任意字段搜索


背景

PostgreSQL 多字段,任意组合搜索,有三种优化手段:

1、gin索引(支持任意字段组合的查询)

《宝剑赠英雄 - 任意组合字段等效查询, 探探PostgreSQL多列展开式B树 (GIN)》

2、bloom索引(支持任意只读组合的等值查询)

《PostgreSQL 9.6 黑科技 bloom 算法索引,一个索引支撑任意列组合查询》

3、每个单列btree索引(支持任意字段组合的查询)

《PostgreSQL bitmapAnd, bitmapOr, bitmap index scan, bitmap heap scan》

例子

create table test(c1 int, c2 int, c3 int, c4 int, c5 int);

bloom, gin, multi-btree几种索引创建方法

1、bloom

```
postgres=# create extension bloom ;
CREATE EXTENSION
postgres=# create index idx_test12_1 on test12 using bloom (c1,c2,c3,c4,c5);
CREATE INDEX
postgres=# explain select * from test12 where c1=1;
QUERY PLAN


Bitmap Heap Scan on test12 (cost=13.95..20.32 rows=8 width=20)
Recheck Cond: (c1 = 1)
-> Bitmap Index Scan on idx_test12_1 (cost=0.00..13.95 rows=8 width=0)
Index Cond: (c1 = 1)
(4 rows)
postgres=# explain select * from test12 where c1=1 and c2=1;
QUERY PLAN


Bitmap Heap Scan on test12 (cost=18.20..19.42 rows=1 width=20)
Recheck Cond: ((c1 = 1) AND (c2 = 1))
-> Bitmap Index Scan on idx_test12_1 (cost=0.00..18.20 rows=1 width=0)
Index Cond: ((c1 = 1) AND (c2 = 1))
(4 rows)
postgres=# explain select * from test12 where c1=1 or c2=1;
QUERY PLAN


Bitmap Heap Scan on test12 (cost=27.91..38.16 rows=17 width=20)
Recheck Cond: ((c1 = 1) OR (c2 = 1))
-> BitmapOr (cost=27.91..27.91 rows=17 width=0)
-> Bitmap Index Scan on idx_test12_1 (cost=0.00..13.95 rows=8 width=0)
Index Cond: (c1 = 1)
-> Bitmap Index Scan on idx_test12_1 (cost=0.00..13.95 rows=8 width=0)
Index Cond: (c2 = 1)
(7 rows)
```

2、gin

```
postgres=# create extension btree_gin;
CREATE EXTENSION
postgres=# create index idx_test12_1 on test12 using gin (c1,c2,c3,c4,c5);
CREATE INDEX
postgres=# explain select * from test12 where c1=1 or c2=1;
QUERY PLAN


Bitmap Heap Scan on test12 (cost=4.94..15.19 rows=17 width=20)
Recheck Cond: ((c1 = 1) OR (c2 = 1))
-> BitmapOr (cost=4.94..4.94 rows=17 width=0)
-> Bitmap Index Scan on idx_test12_1 (cost=0.00..2.46 rows=8 width=0)
Index Cond: (c1 = 1)
-> Bitmap Index Scan on idx_test12_1 (cost=0.00..2.46 rows=8 width=0)
Index Cond: (c2 = 1)
(7 rows)

postgres=# explain select * from test12 where c1=1 and c2=1;
QUERY PLAN


Bitmap Heap Scan on test12 (cost=3.60..4.82 rows=1 width=20)
Recheck Cond: ((c1 = 1) AND (c2 = 1))
-> Bitmap Index Scan on idx_test12_1 (cost=0.00..3.60 rows=1 width=0)
Index Cond: ((c1 = 1) AND (c2 = 1))
(4 rows)
```

3、multi-btree

```
postgres=# drop index idx_test12_1 ;
DROP INDEX
postgres=# create index idx_test12_1 on test12 using btree(c1);
CREATE INDEX
postgres=# create index idx_test12_2 on test12 using btree(c2);
CREATE INDEX
postgres=# create index idx_test12_3 on test12 using btree(c3);
CREATE INDEX
postgres=# create index idx_test12_4 on test12 using btree(c4);
CREATE INDEX
postgres=# create index idx_test12_5 on test12 using btree(c5);
CREATE INDEX

postgres=# explain select * from test12 where c1=1 and c2=1;
QUERY PLAN


Bitmap Heap Scan on test12 (cost=3.08..4.29 rows=1 width=20)
Recheck Cond: ((c2 = 1) AND (c1 = 1))
-> BitmapAnd (cost=3.08..3.08 rows=1 width=0)
-> Bitmap Index Scan on idx_test12_2 (cost=0.00..1.41 rows=8 width=0)
Index Cond: (c2 = 1)
-> Bitmap Index Scan on idx_test12_1 (cost=0.00..1.41 rows=8 width=0)
Index Cond: (c1 = 1)
(7 rows)

postgres=# explain select * from test12 where c1=1 or c2=1;
QUERY PLAN


Bitmap Heap Scan on test12 (cost=2.83..13.09 rows=17 width=20)
Recheck Cond: ((c1 = 1) OR (c2 = 1))
-> BitmapOr (cost=2.83..2.83 rows=17 width=0)
-> Bitmap Index Scan on idx_test12_1 (cost=0.00..1.41 rows=8 width=0)
Index Cond: (c1 = 1)
-> Bitmap Index Scan on idx_test12_2 (cost=0.00..1.41 rows=8 width=0)
Index Cond: (c2 = 1)
(7 rows)
```

gin, bloom, btree bitmap scan的性能如何呢?

1600个列的宽表,任意字段组合搜索性能

1、建表

```
postgres=# do language plpgsql $$
declare
sql text;
begin
sql := 'create table test1 (';
for i in 1..1600 loop
sql := sql||' c'||i||' int2 default random()*100,';
end loop;
sql := rtrim(sql,',');
sql := sql||')';
execute sql;

for i in 1..1600 loop
execute 'create index idx_test1_'||i||' on test1 (c'||i||')';
end loop;
end;
$$;
DO
```

2、写入测试数据

postgres=# insert into test1 (c1) select generate_series(1,10000); INSERT 0 10000

3、测试脚本

```
vi test.sql

\set c2 random(1,100)
\set c3 random(1,100)
\set c4 random(1,100)
\set c5 random(1,100)
\set c6 random(1,100)
\set c7 random(1,100)
select c1600 from test1 where c2=:c2 and c3=:c3 and c4=:c4 or (c5=:c5 and c6=:c6 and c7=:c7);
```

4、测试

pgbench -M prepared -n -r -P 1 -f ./test.sql -c 64 -j 64 -T 120

5、性能

progress: 33.0 s, 208797.8 tps, lat 0.307 ms stddev 0.016 progress: 34.0 s, 208516.0 tps, lat 0.307 ms stddev 0.032 progress: 35.0 s, 208574.0 tps, lat 0.307 ms stddev 0.050 progress: 36.0 s, 208858.2 tps, lat 0.306 ms stddev 0.013 progress: 37.0 s, 208686.8 tps, lat 0.307 ms stddev 0.043 progress: 38.0 s, 208764.2 tps, lat 0.307 ms stddev 0.013

注意,使用prepared statement,可以减少硬解析,提高性能。

从测试来看,任意字段的搜索,可以达到0.3毫秒的响应。

PostgreSQL 许愿链接

您的愿望将传达给PG kernel hacker、数据库厂商等, 帮助提高数据库产品质量和功能, 说不定下一个PG版本就有您提出的功能点. 针对非常好的提议,奖励限量版PG文化衫、纪念品、贴纸、PG热门书籍等,奖品丰富,快来许愿。开不开森.

9.9元购买3个月阿里云RDS PostgreSQL实例

PostgreSQL 解决方案集合

德哥 / digoal's github - 公益是一辈子的事.

digoal's wechat

文章转载自digoal,如果涉嫌侵权,请发送邮件至:contact@modb.pro进行举报,并提供相关证据,一经查实,墨天轮将立刻删除相关内容。

评论