SQL 标准使用left or right outer join 表示外连接,而Oracle 支持在nullable site 加 "(+)" 表示外连接。KingbaseES 语法上同时支持二者。在外连接的使用过程中,经常发现开发人员因为不了解语法而导致原本希望是outer join,而实际却是inner join(也就是发生了外连接消除)。以下举例说明。
一、构建例子
create table t1(id1 integer,name1 text);
create table t2(id2 integer,name2 text);
insert into t1 values(1,'a'),(2,'b'),(3,'c');
insert into t2 values(1,'aa'),(3,'cc');二、Outer Join 语法
外连接结果:
test=# select id1,id2 from t1 left join t2 on t1.id1=t2.id2;
id1 | id2
-----+-----
1 | 1
2 |
3 | 3
(3 行记录)1、Where 子句 Nullable-Side 条件,外连接可消除
以下SQL及结果、执行计划差异:
test=# select id1,id2 from t1 left join t2 on t1.id1=t2.id2 where name2='cc';
id1 | id2
-----+-----
3 | 3
(1 行记录)
test=# explain select id1,id2 from t1 left join t2 on t1.id1=t2.id2 where name2='cc';
QUERY PLAN
---------------------------------------------------------------
Hash Join (cost=25.95..53.79 rows=38 width=8)
Hash Cond: (t1.id1 = t2.id2)
-> Seq Scan on t1 (cost=0.00..22.70 rows=1270 width=4)
-> Hash (cost=25.88..25.88 rows=6 width=4)
-> Seq Scan on t2 (cost=0.00..25.88 rows=6 width=4)
Filter: (name2 = 'cc'::text)
(6 行记录)可以看到当where 条件包含nullable-side 的条件时(过滤条件),实际是inner join(外连接消除),而非outer join。这与用户的本意可能是不相符的。DBA 必须注意必须避免这种非本意的外连接消除的场景。
2、On 子句 Nullable-Side 条件,外连接不可消除
如果nullable-side条件放在on 部分,则实际是先进行过滤,再进行outer join,这是符合DBA预期的。如以下例子,实际还是outer join。
test=# select id1,id2 from t1 left join t2 on t1.id1=t2.id2 and name2='cc';
id1 | id2
-----+-----
1 |
2 |
3 | 3
(3 行记录)
test=# explain select id1,id2 from t1 left join t2 on t1.id1=t2.id2 and name2='cc';
QUERY PLAN
---------------------------------------------------------------
Hash Left Join (cost=25.95..53.79 rows=1270 width=8)
Hash Cond: (t1.id1 = t2.id2)
-> Seq Scan on t1 (cost=0.00..22.70 rows=1270 width=4)
-> Hash (cost=25.88..25.88 rows=6 width=4)
-> Seq Scan on t2 (cost=0.00..25.88 rows=6 width=4)
Filter: (name2 = 'cc'::text)
(6 行记录)3、Where 子句 Nullable-Side 条件,外连接不可消除的场景
对于where 子句包含nullable-side 条件,有一种情况,也不能进行外连接消除,就是 colx is null 情况,如:
test=# select id1,id2 from t1 left join t2 on t1.id1=t2.id2 where name2 is null;
id1 | id2
-----+-----
2 |
(1 行记录)
test=# explain select id1,id2 from t1 left join t2 on t1.id1=t2.id2 where name2 is null;
QUERY PLAN
-------------------------------------------------------------------
Merge Left Join (cost=176.34..303.67 rows=40 width=8)
Merge Cond: (t1.id1 = t2.id2)
Filter: (t2.name2 IS NULL)
-> Sort (cost=88.17..91.35 rows=1270 width=4)
Sort Key: t1.id1
-> Seq Scan on t1 (cost=0.00..22.70 rows=1270 width=4)
-> Sort (cost=88.17..91.35 rows=1270 width=36)
Sort Key: t2.id2
-> Seq Scan on t2 (cost=0.00..22.70 rows=1270 width=36)
(9 行记录)这种场景为什么无法进行外连接消除了?我们知道,where条件理论是在最后执行,具体到本例,也就是对连接的结果进行过滤。对于inner join,过滤与连接的先后对于结果没有影响。而对于外连接,先后顺序是会影响结果的。外连接能否消除,首先必须保证外连接消除后不影响结果。 对于例子where name2='cc',我们可以知道,保持outerjoin(最后过滤)与转成innerjoin(提前过滤)结果是一样的。而对于where name2 is null条件, 因为,nullable-side 返回的记录本来就是null(当nullable-side缺失对应记录时),保持outerjoin(最后过滤)与转成innerjoin(提前过滤)结果是完全不同的,因此,无法进行外连接消除。
4、NonNullable-Side 条件
test=# select * from t1 left join t2 on t1.id1=t2.id2 and name1='a'; --这个结果集需注意
id1 | name1 | id2 | name2
-----+-------+-----+-------
1 | a | 1 | aa
2 | b | |
3 | c | |
(3 行记录)
test=# select * from t1 left join t2 on t1.id1=t2.id2 where name1='a';
id1 | name1 | id2 | name2
-----+-------+-----+-------
1 | a | 1 | aa
(1 行记录)与nullable_side连接条件结果类似。这里 name1='a' 作为 non nullable_side 连接条件(on 部分)时,表示只有符合条件的记录才进行连接,其他不满足的,不进行连接,但outertable所有数据返回。
三、(+) 语法
1、相当于outerjoin的where部分条件
外连接是可以消除的场景:
test=# select id1,id2 from t1,t2 where t1.id1=t2.id2(+) and name2='cc';
id1 | id2
-----+-----
3 | 3
(1 行记录)
test=# explain select id1,id2 from t1,t2 where t1.id1=t2.id2(+) and name2='cc';
QUERY PLAN
---------------------------------------------------------------
Hash Join (cost=25.95..53.79 rows=38 width=8)
Hash Cond: (t1.id1 = t2.id2)
-> Seq Scan on t1 (cost=0.00..22.70 rows=1270 width=4)
-> Hash (cost=25.88..25.88 rows=6 width=4)
-> Seq Scan on t2 (cost=0.00..25.88 rows=6 width=4)
Filter: (name2 = 'cc'::text)
(6 行记录)
2、相当于outerjoin的on部分条件
外连接不可消除的场景:
test=# select id1,id2 from t1,t2 where t1.id1=t2.id2(+) and name2(+)='cc';
id1 | id2
-----+-----
1 |
2 |
3 | 3
(3 行记录)
test=# explain select id1,id2 from t1,t2 where t1.id1=t2.id2(+) and name2(+)='cc';
QUERY PLAN
---------------------------------------------------------------
Hash Left Join (cost=25.95..53.79 rows=1270 width=8)
Hash Cond: (t1.id1 = t2.id2)
-> Seq Scan on t1 (cost=0.00..22.70 rows=1270 width=4)
-> Hash (cost=25.88..25.88 rows=6 width=4)
-> Seq Scan on t2 (cost=0.00..25.88 rows=6 width=4)
Filter: (name2 = 'cc'::text)
(6 行记录)2、针对is null
where name2 is null 同样不能进行外连接消除。
test=# explain select id1,id2 from t1 ,t2 where t1.id1=t2.id2(+) and name2='cc';
QUERY PLAN
---------------------------------------------------------------
Hash Join (cost=25.95..53.79 rows=38 width=8)
Hash Cond: (t1.id1 = t2.id2)
-> Seq Scan on t1 (cost=0.00..22.70 rows=1270 width=4)
-> Hash (cost=25.88..25.88 rows=6 width=4)
-> Seq Scan on t2 (cost=0.00..25.88 rows=6 width=4)
Filter: (name2 = 'cc'::text)
(6 行记录)
test=# explain select id1,id2 from t1 ,t2 where t1.id1=t2.id2(+) and name2 is null;
QUERY PLAN
-------------------------------------------------------------------
Merge Left Join (cost=176.34..303.67 rows=40 width=8)
Merge Cond: (t1.id1 = t2.id2)
Filter: (t2.name2 IS NULL)
-> Sort (cost=88.17..91.35 rows=1270 width=4)
Sort Key: t1.id1
-> Seq Scan on t1 (cost=0.00..22.70 rows=1270 width=4)
-> Sort (cost=88.17..91.35 rows=1270 width=36)
Sort Key: t2.id2
-> Seq Scan on t2 (cost=0.00..22.70 rows=1270 width=36)




