对于两个分区大表的连接,其性能与hash table的大小直接相关。如果hash table数据量能够被work_mem缓存,那性能是最优的。因此,如果能够利用分区的特点按分区进行hash join,那对于性能是非常大的提升。KingbaseES 有两个参数 enable_partitionwise_join , enable_partitionwise_aggregate 分别用于控制分区智能连接和聚集,默认这两个参数是 off 的。开启这两个参数,对于采用hash join 或 hash aggregate 的执行方式有性能提升。
以下我们举例来看该参数对于执行计划的影响:
一、分区哈希连接
1、准备数据
create table t1(id1 integer,name1 text) partition by hash(id1) partitions 4;
insert into t1 select generate_series(1,10000000),'abc'||generate_series(1,10000000);
create table t2(id2 integer,name1 text) partition by hash(id2) partitions 4;
insert into t2 select generate_series(1,3000000),'abc'||generate_series(1,3000000);2、查看执行计划
不使用智能连接:把所有分区的数据取回,以整个表为单位进行hash 及 hash join。因为innertable hash操作是整表,需要的内存更大。下例红色部分 Batches : 4,说明hash 的部分结果(3个batch)被临时存放在。
test=# show enable_partitionwise_join ;
enable_partitionwise_join
---------------------------
off
(1 行记录)
test=# explain analyze select count(*) from t1,t2 where id1=id2;
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------
Aggregate (cost=5625429721.00..5625429721.01 rows=1 width=8) (actual time=8090.917..8090.922 rows=1 loops=1)
-> Hash Join (cost=110526.00..5250429721.00 rows=150000000000 width=0) (actual time=1032.591..7943.064 rows=3000000 loops=1)
Hash Cond: (t1_p0.id1 = t2_p0.id2)
-> Append (cost=0.00..204350.00 rows=10000000 width=4) (actual time=0.022..1666.767 rows=10000000 loops=1)
-> Seq Scan on t1_p0 (cost=0.00..38575.41 rows=2499241 width=4) (actual time=0.022..284.595 rows=2499241 loops=1)
-> Seq Scan on t1_p1 (cost=0.00..38593.88 rows=2500388 width=4) (actual time=0.010..282.833 rows=2500388 loops=1)
-> Seq Scan on t1_p2 (cost=0.00..38568.65 rows=2498765 width=4) (actual time=0.013..282.186 rows=2498765 loops=1)
-> Seq Scan on t1_p3 (cost=0.00..38612.06 rows=2501606 width=4) (actual time=0.011..275.778 rows=2501606 loops=1)
-> Hash (cost=61307.00..61307.00 rows=3000000 width=4) (actual time=1029.502..1029.504 rows=3000000 loops=1)
Buckets: 2097152 Batches: 4 Memory Usage: 42727kB
-> Append (cost=0.00..61307.00 rows=3000000 width=4) (actual time=0.019..452.904 rows=3000000 loops=1)
-> Seq Scan on t2_p0 (cost=0.00..11560.96 rows=748996 width=4) (actual time=0.018..72.361 rows=748996 loops=1)
-> Seq Scan on t2_p1 (cost=0.00..11591.35 rows=750935 width=4) (actual time=0.014..72.146 rows=750935 loops=1)
-> Seq Scan on t2_p2 (cost=0.00..11560.59 rows=748959 width=4) (actual time=0.017..74.721 rows=748959 loops=1)
-> Seq Scan on t2_p3 (cost=0.00..11594.10 rows=751110 width=4) (actual time=0.010..73.672 rows=751110 loops=1)
Planning Time: 0.107 ms
Execution Time: 8091.972 ms
(17 行记录)使用智能连接:因为两张表的分区模式是一样的(hash 分区,分区列的数据类型与分区数量相同),能够保证相同的数据hash到对应的分区。因此,只要对应的分区进行连接就可以。由于单个分区的数据量更少,更有可能使用内存排序。本例红色部分batches : 1,也就是所有的hash 的结果都在内存。
test=# set enable_partitionwise_join=on;
SET
test=# explain analyze select count(*) from t1,t2 where id1=id2;
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------
Aggregate (cost=328157.00..328157.01 rows=1 width=8) (actual time=5805.663..5805.669 rows=1 loops=1)
-> Append (cost=20923.41..320657.00 rows=3000000 width=0) (actual time=269.617..5665.055 rows=3000000 loops=1)
-> Hash Join (cost=20923.41..76360.93 rows=748996 width=0) (actual time=269.616..1409.434 rows=748996 loops=1)
Hash Cond: (t1_p0.id1 = t2_p0.id2)
-> Seq Scan on t1_p0 (cost=0.00..38575.41 rows=2499241 width=4) (actual time=0.045..294.969 rows=2499241 loops=1)
-> Hash (cost=11560.96..11560.96 rows=748996 width=4) (actual time=265.047..265.048 rows=748996 loops=1)
Buckets: 1048576 Batches: 1 Memory Usage: 34524kB
-> Seq Scan on t2_p0 (cost=0.00..11560.96 rows=748996 width=4) (actual time=0.097..81.051 rows=748996 loops=1)
-> Hash Join (cost=20978.04..76457.72 rows=750935 width=0) (actual time=237.966..1375.976 rows=750935 loops=1)
Hash Cond: (t1_p1.id1 = t2_p1.id2)
-> Seq Scan on t1_p1 (cost=0.00..38593.88 rows=2500388 width=4) (actual time=0.030..286.183 rows=2500388 loops=1)
-> Hash (cost=11591.35..11591.35 rows=750935 width=4) (actual time=235.646..235.646 rows=750935 loops=1)
Buckets: 1048576 Batches: 1 Memory Usage: 34593kB
-> Seq Scan on t2_p1 (cost=0.00..11591.35 rows=750935 width=4) (actual time=0.025..69.625 rows=750935 loops=1)
-> Hash Join (cost=20922.58..76351.19 rows=748959 width=0) (actual time=245.015..1339.569 rows=748959 loops=1)
Hash Cond: (t1_p2.id1 = t2_p2.id2)
-> Seq Scan on t1_p2 (cost=0.00..38568.65 rows=2498765 width=4) (actual time=0.028..281.130 rows=2498765 loops=1)
-> Hash (cost=11560.59..11560.59 rows=748959 width=4) (actual time=242.801..242.801 rows=748959 loops=1)
Buckets: 1048576 Batches: 1 Memory Usage: 34523kB
-> Seq Scan on t2_p2 (cost=0.00..11560.59 rows=748959 width=4) (actual time=0.024..70.538 rows=748959 loops=1)
-> Hash Join (cost=20982.97..76487.16 rows=751110 width=0) (actual time=255.132..1365.909 rows=751110 loops=1)
Hash Cond: (t1_p3.id1 = t2_p3.id2)
-> Seq Scan on t1_p3 (cost=0.00..38612.06 rows=2501606 width=4) (actual time=0.042..288.140 rows=2501606 loops=1)
-> Hash (cost=11594.10..11594.10 rows=751110 width=4) (actual time=252.233..252.233 rows=751110 loops=1)
Buckets: 1048576 Batches: 1 Memory Usage: 34599kB
-> Seq Scan on t2_p3 (cost=0.00..11594.10 rows=751110 width=4) (actual time=0.060..71.905 rows=751110 loops=1)
Planning Time: 0.148 ms
Execution Time: 5812.038 ms
(28 行记录)3、分区哈希连接前提
- 关联表的hash分区模式必须完全相同:列类型(完全相同,integer 与 bigint 实测无法使用哈希智能连接)、长度必须相同、hash 分区数量必须相同。
- 对于hash 分区表,分区数量越多,每个分区数据量越少,在连接时,work_mem 能容纳单个分区数据的概率越高,越可能采用内存hash。
二、智能哈希聚集
1、构建测试数据
create table t1(id1 integer,name1 text) partition by hash(id1) partitions 4;
insert into t1 select generate_series(1,100000),'abc'||generate_series(1,100000);
insert into t1 select * from t1;
insert into t1 select * from t1;
insert into t1 select * from t1;
insert into t1 select * from t1;
insert into t1 select * from t1;
insert into t1 select * from t1;
insert into t1 select * from t1;2、非智能聚集的场景
test=# set enable_partitionwise_aggregate=off;
SET
test=# explain analyze select id1 , count(*) from t1 group by id1;
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------
HashAggregate (cost=325567.00..326568.30 rows=100130 width=12) (actual time=6533.552..6567.385 rows=100000 loops=1)
Group Key: t1_p0.id1
-> Append (cost=0.00..261567.00 rows=12800000 width=4) (actual time=0.358..2707.037 rows=12800000 loops=1)
-> Seq Scan on t1_p0 (cost=0.00..49640.28 rows=3216128 width=4) (actual time=0.357..422.919 rows=3216128 loops=1)
-> Seq Scan on t1_p1 (cost=0.00..49347.84 rows=3197184 width=4) (actual time=0.037..457.075 rows=3197184 loops=1)
-> Seq Scan on t1_p2 (cost=0.00..49334.88 rows=3196288 width=4) (actual time=0.054..430.594 rows=3196288 loops=1)
-> Seq Scan on t1_p3 (cost=0.00..49244.00 rows=3190400 width=4) (actual time=0.056..510.354 rows=3190400 loops=1)
Planning Time: 0.111 ms
Execution Time: 6571.492 ms
(9 行记录)3、智能聚集的场景
test=# set enable_partitionwise_aggregate=on;
SET
test=# explain analyze select id1 , count(*) from t1 group by id1;
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------
Append (cost=65720.92..263069.22 rows=100148 width=12) (actual time=892.267..3502.275 rows=100000 loops=1)
-> HashAggregate (cost=65720.92..65971.54 rows=25062 width=12) (actual time=892.266..898.120 rows=25126 loops=1)
Group Key: t1_p0.id1
-> Seq Scan on t1_p0 (cost=0.00..49640.28 rows=3216128 width=4) (actual time=0.086..343.870 rows=3216128 loops=1)
-> HashAggregate (cost=65333.76..65585.86 rows=25210 width=12) (actual time=828.774..833.766 rows=24978 loops=1)
Group Key: t1_p1.id1
-> Seq Scan on t1_p1 (cost=0.00..49347.84 rows=3197184 width=4) (actual time=0.041..325.084 rows=3197184 loops=1)
-> HashAggregate (cost=65316.32..65566.86 rows=25054 width=12) (actual time=835.283..842.753 rows=24971 loops=1)
Group Key: t1_p2.id1
-> Seq Scan on t1_p2 (cost=0.00..49334.88 rows=3196288 width=4) (actual time=0.089..317.460 rows=3196288 loops=1)
-> HashAggregate (cost=65196.00..65444.22 rows=24822 width=12) (actual time=918.379..922.312 rows=24925 loops=1)
Group Key: t1_p3.id1
-> Seq Scan on t1_p3 (cost=0.00..49244.00 rows=3190400 width=4) (actual time=0.069..348.231 rows=3190400 loops=1)
Planning Time: 0.081 ms
Execution Time: 3505.180 ms
(15 行记录)最后修改时间:2024-08-20 16:45:01
「喜欢这篇文章,您的关注和赞赏是给作者最好的鼓励」
关注作者
【版权声明】本文为墨天轮用户原创内容,转载时必须标注文章的来源(墨天轮),文章链接,文章作者等基本信息,否则作者和墨天轮有权追究责任。如果您发现墨天轮中有涉嫌抄袭或者侵权的内容,欢迎发送邮件至:contact@modb.pro进行举报,并提供相关证据,一经查实,墨天轮将立刻删除相关内容。




