暂无图片
暂无图片
暂无图片
暂无图片
暂无图片

执行日志(3)查看JOIN算子执行

原创 PEWSGGL 2022-11-28
532

3.3查看JOIN算子执行

本节主要介绍Join, Aggregate等算子的trace的详细信息,从中可分析 出算子是否并行执行,采用具体何种方法执行等信息。

Join算子目前有:

SortMerge Join :主要用于非等值连接,如t1. a > t2.b;

Hash Join:主要用于等值连接,内部又细分为RR, One-Pass, Hybrid三 种方法,如t1.a=t2. a;

NestLoop Join :以上其他方法都失效,走嵌套循环连接;

Complex Join:主要用于包含Or的连接,Or的两端可以是Sort Merge也 可以是Hash Join;

3. 3. 1 sort merge join

BEGIN Join

cnd(0):

cnd(0): t1.i < t2. j

prepare to use sort-merge

Joining sorters created fo

inner join (T0 - T1), rows.

cnd(0) Done(time used: 0.0

END Join(time used: 0.003s

3. 3. 2 Hash Join

等值连接走hash join。hash join有三种,一般情况下会走hybrid hash join

  1. 串行 One Pass Hash Join

BEGIN Join

cnd(0):

cnd(0): lineitem.L_ORDERKEY = orders. O_ORDERKEY

prepare          to          use          hash          join

等值连接走hash join

op buffer size: 16777216, tuple width: 16. op buffer can hold 1048576 rows 算子 buffer 大小 16777216,join 字段每行 16 字节,buffer

可以装1048576行

Begin one-pass hash partitioning: divide 1500000 tuples into 6 parts. mat_buf_size = 2796200 看到'one-pass hash partitioning'代表

走 one-pass hash join

mat partition thread: divide 1500000 tuples into 6 parts, min 241835
tuples, max 258974 tuples, avg 250000 tuples.

Finish one-pass hash partitioning: divide 1500000 tuples into 6 parts. mat_buf_size = 2796200. (time used: 2.104s)

Begin one-pass hash partitioning: divide 6001215 tuples into 6 parts. mat_buf_size = 2796200 对 join 的右边划分

Finish one-pass hash partitioning: divide 6001215 tuples into 6 parts. mat_buf_size = 2796200. (time used: 5.589s)

Finish One-Pass Hash Join preparation: divided each side into 6 partitions

Hash tree is used, size: 16777216

Begin     serial    rowid     merge-sorting:      6001215     rows

对join结果排序,这里serial可以sort可以看出join是串行的

Finish serial rowid merge-sorting: 6001215 rows. (time used: 5.806s) inner join(T0 - T1),   using hash join,                   produced 6001215 rows.

cnd(0) Done(time used: 39.754s)

END Join(time used: 39. 755s)

  1. 并行 One Pass Hash Join

BEGIN Join

prepare to use hash join

op buffer size: 16777216, tuple width: 16. op buffer can hold 1048576


rows

Begin one-pass hash partitioning: divide 1500000 tuples into 48 parts. mat_buf_size = 43688

(2)mat partition thread: divide 187500 tuples into 48 parts, min 3463 tuples, max 4331 tuples, avg 3906 tuples.

前面多出的(2)表示子线程2,表示这是并行执行的,如果是串行则只会打 印一行

( 6)mat partition thread:

tuples, max 4191 tuples,

( 3)mat partition thread:

tuples, max 4400 tuples,

( 5)mat partition thread:

tuples, max 4320 tuples,

divide 187500 tuples avg 3906 tuples.

divide 187500 tuples avg 3906 tuples.

divide 187500 tuples avg 3906 tuples.

into

into

into

48 parts,

48 parts,

48 parts,

min

min

min

3680

3413

3553







(7)mat partition thread: tuples, max 4169 tuples,

(0)mat partition thread: tuples, max 4393 tuples,

( 4)mat partition thread: tuples, max 4147 tuples,

( 1)mat partition thread: tuples, max 4408 tuples,

divide 187500 tuples avg 3906 tuples.

divide 187500 tuples avg 3906 tuples.

divide 187500 tuples avg 3906 tuples.

divide 187500 tuples avg 3906 tuples.

into  48 parts, min 3636

into  48 parts, min 3454

into  48 parts, min 3669

into  48 parts, min 3350




 


Finish one-pass hash partitioning: divide 1500000 tuples into 48 parts. mat_buf_size = 43688. (time used: 0.620s)

Begin rowid merge-sorting: 8 inputs, 6001215 rows 如果是串行会是Begin serial rowid merge-sorting

Finish rowid merge-sorting: 8 inputs, 6001215 rows(time used: 1.340s) Parallel join done. Merged tuple: 6001215 rows                                   这里也表示这是并

行执行的

cnd(0) Done(time used: 8.627s)

END Join(time used: 8. 631s)

  1. 串行 Round Robin Hash Join

BEGIN Join

cnd(0):

cnd(0): lineitem.L_ORDERKEY = orders. O_ORDERKEY


join

prepare          to          use          hash


RR 也是 hash join

Hash tree is used, size: 16777216 using multi pass hash join: 4 pass, 1500000 rows join 6001215 rows created hash on 493447/1500000 rows already. disable multi pass hash join scanned 2555904 rows, and produced 1974112 rows already.

created hash on 986894/1500000 rows already.

scanned 4653056 rows, and produced 3947396 rows already.

created hash on 1480341/1500000 rows already.

scanned 3576383 rows, and produced 5922493 rows already.


created hash on

all 1500000 rows already.


scanned 655360 rows,     and produced 6001215 rows already.

skipped 52.44% DCs by smart index.

inner join (T0 - T1),    using hash join,     produced 6001215 rows.

cnd(0) Done(time used: 17.236s)

END Join(time used: 17. 236s)

  1. 并行 Round Robin Hash Join

BEGIN Join

cnd(0):

cnd(0): lineitem.L_ORDERKEY = orders. O_ORDERKEY

prepare to use hash join

divide to 8 blocks(using round-robin): 786432, 720896, 786432, 720896,

786432, 720896, 786432, 692799. 对数据切分,并行join

( 5)Hash tree is used, size:

2097152                        (5)子线

程号,并行join

( 6)Hash tree is used, size:

2097152

( 0)Hash tree is used, size:

2097152

( 3)Hash tree is used, size:

2097152

( 2)Hash tree is used, size:

2097152

( 7)Hash tree is used, size:

2097152

( 1)Hash tree is used, size:

2097152


Parallel join done. Merged tuple: 6001215 rows              并行 join

cnd(0) Done(time used: 19.728s)

END Join(time used: 19.728s)                                  join 结

束,三种hash join中,只有rr hash join不需要排序

  1. 串行 Hybrid Hash Join

prepare to use hash join

traverse dim:     1500000 tuples, match dim:          6001215 tuples

—般取小表为traverse表,大表为match表

traverse tuple: total_size = 3, key_size = 3, tuple_size = 0, other_size


=0, pseudo_size = 0 参与join的宽度为3字节,hash字段3字节,其他 0字节

match tuple: total_size = 3, key_size = 3, tuple_size = 0, other_size = 0, pseudo_size = 0

Hybrid hash join:     data will be split into 3 partitions

‘Hybrid hash join'标识,分3次运行,第一次在内存,后面先落盘

traverse dim: 1500000 tuples, match dim: 6001215 tuples

BEGIN hybrid hash join: partition(-1/3): traverse dim: 1500000 tuples, match dim: 6001215 tuples -1/3代表第一轮,在内存进行

Traverse:     begin      scanning     from     0      to     1500000

创建hash表

HashTable: mem: total 16109940, desired 33000000, block_size 1048576 打印算子buffer,需要的内存,一个内存申请块大小

Sub 0: use 8388608+8008176; buckets: total 2097152, use 341165, avg 1.47, longest 9; rows: add 500511; count only table.

这里只有一^个Sub0标识是串行,use 8388608+8008176代表hash桶占用 8388608字节,hash字段占用8008176。共有2097152个桶,使用了 341165 个,平均冲突链1.47,最长9,共加载了 500511行,其中avg 一般在1~2 之间,如果太大说明数据分布不合理

JoinTables: mem: exp. 16777216; rows: exp. 1500000, cap. 720896, add 500511

Traverse       finished.       (time        used:        1.323s)

创建hash表完毕,进入Match阶段


used:         3.291s)

(time used:    4.619s)

Match         finished.          (time


Match阶段完毕,进入提交阶段

Commit: begin ...

END hybrid hash join:       partition(-1/3): 第一轮(-1/3)完毕

finish                      hash                      partitioning:

由于第一轮join是边join边划分的,所以第一轮join完毕后才能打印其他 2轮的统计信息 partition(0): traverse dim: 0 tuples, match dim: 0 tuples

partition(1):  traverse dim: 499688 tuples, match dim: 1999733 tuples

partition(2):  traverse dim: 499801 tuples, match dim: 1998529 tuples

END hybrid hash join: partition(0/3): (time used: 0.006s)

一般此轮为空,除非第一轮的算子buffer没装下所有的traverse表,此时终 止进hash表,让所有第一轮的数据先落盘,然后其与后面的轮次操作一样操 作

END  hybrid   hash   join:  partition(1/3):    (time  used:   0.825s)

后面一轮结束

END  hybrid   hash   join:  partition(2/3):    (time  used:   0.820s)

最后一轮结束

join: found 6001215 rows. (inner: 6001215, traverse outer: 0, match outer: 0)

inner join(T0 - T1), using hash join, produced 0 rows. cnd(0) Done(time used: 6.294s)

END Join(time used: 6.294s)

3.3.2.6 并行 Hybrid Hash Join

prepare to use hash join

traverse dim: 1500000 tuples, match dim: 6001215 tuples

traverse tuple: total_size = 7, key_size = 3, tuple_size = 4, other_size = 0, pseudo_size = 0

match tuple: total_size = 7, key_size = 3, tuple_size = 4, other_size = 0, pseudo_size = 0

divide to 8 blocks(using round-robin): 786432, 720896, 786432, 720896, 786432, 720896, 786432, 692799.

Hybrid hash join: data will be split into 2 partitions

BEGIN hybrid hash join: partition(-1/2) : traverse dim: 1500000 tuples, match dim: 6001215 tuples

(       5)Traverse:        hash      table       is     full.

hash表满,在-1轮出现意味着将装入第0轮操作

(1)HashTable: mem: total 2009770, desired 3375000, block_size 1572864

(1)Sub 0: use 1048576+1121100; buckets: total 262144, use 78921, avg 1.18, longest 5; rows: add 93425 多个Sub线程,代表这是并行

(1)Sub 1: use 1048576+1123404; buckets: total 262144, use 78908, avg

1.19, longest 5; rows: add 93617 (1)JoinTables: mem: exp. 16777216; rows: exp. 1500000, cap. 6990504, add 655933

  1. Traverse finished. (time used: 1.389s)

END hybrid hash join: partition(T/2): (time used: 4.759s)

finish hash partitioning:

partition(0): traverse dim: 750122 tuples, match dim: 3000792 tuples partition(1): traverse dim: 749878 tuples, match dim: 3000423 tuples

BEGIN hybrid hash join: partition(0/2): traverse dim: 750122 tuples, match dim: 3000792 tuples

( 4)HashTable: mem: total 2009770, desired 3375000, block_size 1572864

(      7)Match      finished.      (time      used:      1.901s)

无论在哪个partition发生多趟,都会多次进行traveser->match->commit

  1. HashTable: mem: total 2009770, desired 3375000, block_size 1572864

(0)Match finished. (time used: 1.529s)

END hybrid hash join: partition(0/2): (time used: 3.800s)

END hybrid hash join: partition(1/2): (time used: 4.211s)

join: found 6001215 rows. (inner: 6001215, traverse outer: 0, match outer: 0)

Begin rowid merge-sorting:           8    inputs,   6001215 rows

join后要排序

Finish rowid merge-sorting: 8 inputs, 6001215 rows(time used: 1.628s)

Parallel    join    done.    Merged     tuple:    6001215    rows

并行join结束

inner join(T0 - T1), using hash join,      produced 6001215 rows.

cnd(0) Done(time used: 14.462s)

3. 3. 3 Nest Loop Join

因为各种原因不能进行上面的join时,会转入最原始的Nest Loop Join 当trace里出现

nested-loop join 时表示进入此join


应尽量避免进入这种模式 下面的trace代表complex join。complex join本身不算一种join算法

cnd(0):

cnd(0):    (tl.a = t2.c) or (tl.b = t2.d)

prepare to use complex join                             存在 or

条件,一般会走complex join,先对or两边分别join,然后再取or

cnd(0):    t1.a = t2. c

prepare to use hash join                                     先做左边

的 hash join

cnd(0) Done(time used: 0.003s)

cnd(0):    t1.b = t2. d

prepare to use hash join                                     再做右

边的 hash join

cnd(0) Done(time used: 0.002s)

BEGIN Or(1 rows Or 1 rows)                                   对左右

两边结果取or

END Or (serial): produced 1 rows (time used: 0.000s)

cnd(0) Done(time used: 0.007s)

END Join(time used: 0. 007s)                                    整个

complex join 结束

3. 3.4 JOIN相关参数

A gbase_buffer_hj

算子 buffer 大小,对应 trace 里 JoinTables: mem: exp.xxx

A _gbase_hybrid_hash_joi n

0:不使用1:使用,此时不会考虑其他方式       2:内部调试,不建议开

A _gbase_one_pass_hash_join

_gbase_hybrid_hash_join 为 0 时,1: 使用 one-pass,       0 :使用 RR

A gbase_parallel_execution

1:并行 0:串行

「喜欢这篇文章,您的关注和赞赏是给作者最好的鼓励」
关注作者
【版权声明】本文为墨天轮用户原创内容,转载时必须标注文章的来源(墨天轮),文章链接,文章作者等基本信息,否则作者和墨天轮有权追究责任。如果您发现墨天轮中有涉嫌抄袭或者侵权的内容,欢迎发送邮件至:contact@modb.pro进行举报,并提供相关证据,一经查实,墨天轮将立刻删除相关内容。

评论