在多表联接的场景中,优化器的一个很重要的任务是决定各个表之间的联接顺序(Join Order),因为不同的联接顺序会影响中间结果集的大小,进而影响到计划整体的执行代价。
为了减少执行计划的搜索空间和计划执行的内存占用,OceanBase 数据库优化器在生成联接顺序时主要考虑左深树的联接形式。下图展示了左深树、右深树和多支树的计划形状。

OceanBase 数据库在生成联接顺序时采用 System-R 的动态规划算法,考虑的因素包括每一个表可能的访问路径、Interesting Order、联接算法(Nested Loop Join、Block Based Nested Loop Join 或者 Sort Merge Join 等)以及不同表之间的联接选择率等。
如果给定 N 个表的联接,OceanBase 数据库生成联接顺序的方法如下:
为每一个基表生成访问路径,保留代价最小的访问路径以及有所有具有 Interesting Order 的路径。如果一个路径具有 Interesting Order,它的序能够被后续的算子使用。
生成所有表集合的大小为
i (1 < i <= N)的计划。 OceanBase 数据库一般只考虑左深树,表集合大小为i的计划可以由其本身的计划和一个基表的计划组成。OceanBase 数据库按照这种策略,考虑了所有的联接算法以及 Interesting Order 的继承等因素生成所有表集合大小为i的计划。这里也只是保留代价最小的计划以及所有具有 Interesting Order 的计划。
同时,OceanBase 数据库提供了 Hint 机制 /*+LEADING(table_name_list)*/ 去控制多表联接的顺序。
如下例所示,开始选择的联接顺序是先做 t1 表、t2 表的 JOIN 联接,然后再和 t3 表做 JOIN 联接;如果用户希望先做 t2 表、t3 表的 JOIN 联接,然后再和 t1 表做 JOIN 联接,则可以使用 Hint /*+LEADING(t2,t3,t1)*/ 去控制;如果用户希望先做 t1 表、t3 表的 JOIN 联接,然后再和 t2 表做 JOIN 联接,则可以使用 Hint /*+LEADING(t1,t3,t2)*/ 去控制。
在多表联接的场景中,优化器的一个很重要的任务是决定各个表之间的联接顺序(Join Order),因为不同的联接顺序会影响中间结果集的大小,进而影响到计划整体的执行代价。
为了减少执行计划的搜索空间和计划执行的内存占用,OceanBase 数据库优化器在生成联接顺序时主要考虑左深树的联接形式。下图展示了左深树、右深树和多支树的计划形状。
OceanBase 数据库在生成联接顺序时采用 System-R 的动态规划算法,考虑的因素包括每一个表可能的访问路径、Interesting Order、联接算法(Nested Loop Join、Block Based Nested Loop Join 或者 Sort Merge Join 等)以及不同表之间的联接选择率等。
如果给定 N 个表的联接,OceanBase 数据库生成联接顺序的方法如下:
为每一个基表生成访问路径,保留代价最小的访问路径以及有所有具有 Interesting Order 的路径。如果一个路径具有 Interesting Order,它的序能够被后续的算子使用。
生成所有表集合的大小为
i (1 < i <= N)的计划。 OceanBase 数据库一般只考虑左深树,表集合大小为i的计划可以由其本身的计划和一个基表的计划组成。OceanBase 数据库按照这种策略,考虑了所有的联接算法以及 Interesting Order 的继承等因素生成所有表集合大小为i的计划。这里也只是保留代价最小的计划以及所有具有 Interesting Order 的计划。同时,OceanBase 数据库提供了 Hint 机制
/*+LEADING(table_name_list)*/去控制多表联接的顺序。如下例所示,开始选择的联接顺序是先做
t1表、t2表的JOIN联接,然后再和t3表做JOIN联接;如果用户希望先做t2表、t3表的JOIN联接,然后再和t1表做JOIN联接,则可以使用 Hint/*+LEADING(t2,t3,t1)*/去控制;如果用户希望先做t1表、t3表的JOIN联接,然后再和t2表做JOIN联接,则可以使用 Hint/*+LEADING(t1,t3,t2)*/去控制。obclient> CREATE TABLE t1(c1 INT, c2 INT, PRIMARY KEY(c1)); Query OK, 0 rows affected obclient> CREATE TABLE t2(c1 INT, c2 INT, PRIMARY KEY(c1)); Query OK, 0 rows affected obclient> CREATE TABLE t3(c1 INT, c2 INT, PRIMARY KEY(c1)); Query OK, 0 rows affected obclient> EXPLAIN SELECT * FROM t1,t2,t3 WHERE t1.c1 = t2.c2 AND t2.c1 = t3.c2; +-----------------------------------------------------------------+ | Query Plan | +-----------------------------------------------------------------+ | ======================================= |ID|OPERATOR |NAME|EST. ROWS|COST | --------------------------------------- |0 |HASH JOIN | |98010 |926122| |1 | TABLE SCAN |T3 |100000 |61860 | |2 | HASH JOIN | |99000 |494503| |3 | TABLE SCAN|T1 |100000 |61860 | |4 | TABLE SCAN|T2 |100000 |61860 | ======================================= Outputs & filters: ------------------------------------- 0 - output([T1.C1], [T1.C2], [T2.C1], [T2.C2], [T3.C1], [T3.C2]), filter(nil), equal_conds([T2.C1 = T3.C2]), other_conds(nil) 1 - output([T3.C2], [T3.C1]), filter(nil), access([T3.C2], [T3.C1]), partitions(p0) 2 - output([T1.C1], [T1.C2], [T2.C1], [T2.C2]), filter(nil), equal_conds([T1.C1 = T2.C2]), other_conds(nil) 3 - output([T1.C1], [T1.C2]), filter(nil), access([T1.C1], [T1.C2]), partitions(p0) 4 - output([T2.C2], [T2.C1]), filter(nil), access([T2.C2], [T2.C1]), partitions(p0) obclient> EXPLAIN SELECT /*+LEADING(t2,t3,t1)*/* FROM t1,t2,t3 WHERE t1.c1 = t2.c2 AND t2.c1 = t3.c2; +-----------------------------------------------------------------+ | Query Plan | +-----------------------------------------------------------------+ | ======================================== |ID|OPERATOR |NAME|EST. ROWS|COST | ---------------------------------------- |0 |HASH JOIN | |98010 |1096613| |1 | HASH JOIN | |99000 |494503 | |2 | TABLE SCAN|T2 |100000 |61860 | |3 | TABLE SCAN|T3 |100000 |61860 | |4 | TABLE SCAN |T1 |100000 |61860 | ======================================== Outputs & filters: ------------------------------------- 0 - output([T1.C1], [T1.C2], [T2.C1], [T2.C2], [T3.C1], [T3.C2]), filter(nil), equal_conds([T1.C1 = T2.C2]), other_conds(nil) 1 - output([T2.C1], [T2.C2], [T3.C1], [T3.C2]), filter(nil), equal_conds([T2.C1 = T3.C2]), other_conds(nil) 2 - output([T2.C2], [T2.C1]), filter(nil), access([T2.C2], [T2.C1]), partitions(p0) 3 - output([T3.C2], [T3.C1]), filter(nil), access([T3.C2], [T3.C1]), partitions(p0) 4 - output([T1.C1], [T1.C2]), filter(nil), access([T1.C1], [T1.C2]), partitions(p0) obclient> EXPLAIN SELECT /*+LEADING(t1,t3,t2)*/* FROM t1,t2,t3 WHERE t1.c1 = t2.c2 AND t2.c1 = t3.c2; +-----------------------------------------------------------------+ | Query Plan | +-----------------------------------------------------------------+ | ============================================================= |ID|OPERATOR |NAME|EST. ROWS |COST | ------------------------------------------------------------- |0 |HASH JOIN | |98010 |53098071243| |1 | NESTED-LOOP JOIN CARTESIAN| |10000000000|7964490204 | |2 | TABLE SCAN |T1 |100000 |61860 | |3 | MATERIAL | |100000 |236426 | |4 | TABLE SCAN |T3 |100000 |61860 | |5 | TABLE SCAN |T2 |100000 |61860 | ============================================================= Outputs & filters: ------------------------------------- 0 - output([T1.C1], [T1.C2], [T2.C1], [T2.C2], [T3.C1], [T3.C2]), filter(nil), equal_conds([T1.C1 = T2.C2], [T2.C1 = T3.C2]), other_conds(nil) 1 - output([T1.C1], [T1.C2], [T3.C1], [T3.C2]), filter(nil), conds(nil), nl_params_(nil) 2 - output([T1.C1], [T1.C2]), filter(nil), access([T1.C1], [T1.C2]), partitions(p0) 3 - output([T3.C1], [T3.C2]), filter(nil) 4 - output([T3.C2], [T3.C1]), filter(nil), access([T3.C2], [T3.C1]), partitions(p0) 5 - output([T2.C2], [T2.C1]), filter(nil), access([T2.C2], [T2.C1]), partitions(p0)obclient> CREATE TABLE t1(c1 INT, c2 INT, PRIMARY KEY(c1)); Query OK, 0 rows affected obclient> CREATE TABLE t2(c1 INT, c2 INT, PRIMARY KEY(c1)); Query OK, 0 rows affected obclient> CREATE TABLE t3(c1 INT, c2 INT, PRIMARY KEY(c1)); Query OK, 0 rows affected obclient> EXPLAIN SELECT * FROM t1,t2,t3 WHERE t1.c1 = t2.c2 AND t2.c1 = t3.c2; +-----------------------------------------------------------------+ | Query Plan | +-----------------------------------------------------------------+ | ======================================= |ID|OPERATOR |NAME|EST. ROWS|COST | --------------------------------------- |0 |HASH JOIN | |98010 |926122| |1 | TABLE SCAN |T3 |100000 |61860 | |2 | HASH JOIN | |99000 |494503| |3 | TABLE SCAN|T1 |100000 |61860 | |4 | TABLE SCAN|T2 |100000 |61860 | ======================================= Outputs & filters: ------------------------------------- 0 - output([T1.C1], [T1.C2], [T2.C1], [T2.C2], [T3.C1], [T3.C2]), filter(nil), equal_conds([T2.C1 = T3.C2]), other_conds(nil) 1 - output([T3.C2], [T3.C1]), filter(nil), access([T3.C2], [T3.C1]), partitions(p0) 2 - output([T1.C1], [T1.C2], [T2.C1], [T2.C2]), filter(nil), equal_conds([T1.C1 = T2.C2]), other_conds(nil) 3 - output([T1.C1], [T1.C2]), filter(nil), access([T1.C1], [T1.C2]), partitions(p0) 4 - output([T2.C2], [T2.C1]), filter(nil), access([T2.C2], [T2.C1]), partitions(p0) obclient> EXPLAIN SELECT /*+LEADING(t2,t3,t1)*/* FROM t1,t2,t3 WHERE t1.c1 = t2.c2 AND t2.c1 = t3.c2; +-----------------------------------------------------------------+ | Query Plan | +-----------------------------------------------------------------+ | ======================================== |ID|OPERATOR |NAME|EST. ROWS|COST | ---------------------------------------- |0 |HASH JOIN | |98010 |1096613| |1 | HASH JOIN | |99000 |494503 | |2 | TABLE SCAN|T2 |100000 |61860 | |3 | TABLE SCAN|T3 |100000 |61860 | |4 | TABLE SCAN |T1 |100000 |61860 | ======================================== Outputs & filters: ------------------------------------- 0 - output([T1.C1], [T1.C2], [T2.C1], [T2.C2], [T3.C1], [T3.C2]), filter(nil), equal_conds([T1.C1 = T2.C2]), other_conds(nil) 1 - output([T2.C1], [T2.C2], [T3.C1], [T3.C2]), filter(nil), equal_conds([T2.C1 = T3.C2]), other_conds(nil) 2 - output([T2.C2], [T2.C1]), filter(nil), access([T2.C2], [T2.C1]), partitions(p0) 3 - output([T3.C2], [T3.C1]), filter(nil), access([T3.C2], [T3.C1]), partitions(p0) 4 - output([T1.C1], [T1.C2]), filter(nil), access([T1.C1], [T1.C2]), partitions(p0) obclient> EXPLAIN SELECT /*+LEADING(t1,t3,t2)*/* FROM t1,t2,t3 WHERE t1.c1 = t2.c2 AND t2.c1 = t3.c2; +-----------------------------------------------------------------+ | Query Plan | +-----------------------------------------------------------------+ | ============================================================= |ID|OPERATOR |NAME|EST. ROWS |COST | ------------------------------------------------------------- |0 |HASH JOIN | |98010 |53098071243| |1 | NESTED-LOOP JOIN CARTESIAN| |10000000000|7964490204 | |2 | TABLE SCAN |T1 |100000 |61860 | |3 | MATERIAL | |100000 |236426 | |4 | TABLE SCAN |T3 |100000 |61860 | |5 | TABLE SCAN |T2 |100000 |61860 | ============================================================= Outputs & filters: ------------------------------------- 0 - output([T1.C1], [T1.C2], [T2.C1], [T2.C2], [T3.C1], [T3.C2]), filter(nil), equal_conds([T1.C1 = T2.C2], [T2.C1 = T3.C2]), other_conds(nil) 1 - output([T1.C1], [T1.C2], [T3.C1], [T3.C2]), filter(nil), conds(nil), nl_params_(nil) 2 - output([T1.C1], [T1.C2]), filter(nil), access([T1.C1], [T1.C2]), partitions(p0) 3 - output([T3.C1], [T3.C2]), filter(nil) 4 - output([T3.C2], [T3.C1]), filter(nil), access([T3.C2], [T3.C1]), partitions(p0) 5 - output([T2.C2], [T2.C1]), filter(nil), access([T2.C2], [T2.C1]), partitions(p0)




