暂无图片
暂无图片
1
暂无图片
暂无图片
暂无图片

一个普通的左连接给自己的总结

原创 杜伟 2023-08-10
204

生产上一个不是特别复杂的语句,相当于一个视图跟一个大表做左连接,大表的关联字段过滤性非常好,但是生产上这个语句的执行计划非常不稳,看能不能稳下。

语句类似这样的:select a.xxx,b.y from view a left join b on a.id=b.id;经分析,view处的结果集差异比较大,导致了整个sql执行计划不稳,

性能差的时候是对b表做全表扫描进行hash join outer连接,

性能高效时是结果集较小的时候,使用nested loops outer;

此时只能建议业务能不能增加条件来减小结果集。

是不是可以想到什么?针对上面的想法,那我们拿测试环境来试下:

---创建测试表收集统计信息,创建索引
create table dw0810 as select * from dba_objects; 
exec dbms_stats.gather_table_stats('SCOTT','DW0810'); 

create table dw0810a as select * from dba_objects; 
exec dbms_stats.gather_table_stats('SCOTT','DW0810A'); 

create index idx_dw0810 on dw0810(object_id); --表统计信息未锁定时,创建索引会自动收集统计信息 
create index idx_dw0810a on dw0810a(object_id);

select /*+ cardinality(a 100) */ * from dw0810 a left join dw0810a b on a.object_id=b.object_id ----此处的hint是让优化器认为a表的结果集是100,----此时F5的执行计划如下 
---------------------------------------------------------------------------------------
| Id  | Operation                      | Name        | Rows | Bytes | Cost | Time     |
---------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT               |             |  100 | 19600 |  544 | 00:00:07 |
|   1 |   NESTED LOOPS OUTER           |             |  100 | 19600 |  544 | 00:00:07 |
|   2 |    TABLE ACCESS FULL           | DW0810      |  100 |  9800 |  344 | 00:00:05 |
|   3 |    TABLE ACCESS BY INDEX ROWID | DW0810A     |    1 |    98 |    2 | 00:00:01 |
| * 4 |     INDEX RANGE SCAN           | IDX_DW0810A |    1 |       |    1 | 00:00:01 |
---------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
------------------------------------------
* 4 - access("A"."OBJECT_ID"="B"."OBJECT_ID"(+))


当结果集为200时就是HASH JOIN OUTER,因为我们的被驱动表不够大,所以差距不是很明显

select /*+ cardinality(a 200) */ * from dw0810 a left join dw0810a b on a.object_id=b.object_id 

 Plan Hash Value  : 1468256116 

----------------------------------------------------------------------------
| Id  | Operation            | Name    | Rows  | Bytes   | Cost | Time     |
----------------------------------------------------------------------------
|   0 | SELECT STATEMENT     |         |   200 |   39200 |  688 | 00:00:09 |
| * 1 |   HASH JOIN OUTER    |         |   200 |   39200 |  688 | 00:00:09 |
|   2 |    TABLE ACCESS FULL | DW0810  |   200 |   19600 |  344 | 00:00:05 |
|   3 |    TABLE ACCESS FULL | DW0810A | 86314 | 8458772 |  344 | 00:00:05 |
----------------------------------------------------------------------------

Predicate Information (identified by operation id):
------------------------------------------
* 1 - access("A"."OBJECT_ID"="B"."OBJECT_ID"(+))



我们对分别对两个表加条件试试先对a表加条件:

select * from dw0810 a left join dw0810a b on a.object_id=b.object_id where a.object_id<100 

 Plan Hash Value  : 465372911 

---------------------------------------------------------------------------------------
| Id  | Operation                      | Name        | Rows | Bytes | Cost | Time     |
---------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT               |             |   95 | 18620 |    8 | 00:00:01 |
| * 1 |   HASH JOIN OUTER              |             |   95 | 18620 |    8 | 00:00:01 |
|   2 |    TABLE ACCESS BY INDEX ROWID | DW0810      |   95 |  9310 |    4 | 00:00:01 |
| * 3 |     INDEX RANGE SCAN           | IDX_DW0810  |   95 |       |    2 | 00:00:01 |
|   4 |    TABLE ACCESS BY INDEX ROWID | DW0810A     |   95 |  9310 |    4 | 00:00:01 |
| * 5 |     INDEX RANGE SCAN           | IDX_DW0810A |   95 |       |    2 | 00:00:01 |
---------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
------------------------------------------
* 1 - access("A"."OBJECT_ID"="B"."OBJECT_ID"(+))
* 3 - access("A"."OBJECT_ID"<100)
* 5 - access("B"."OBJECT_ID"(+)<100)



对b表加条件
select * from dw0810 a left join dw0810a b on a.object_id=b.object_id where b.object_id<100 

 Plan Hash Value  : 2159939037 

---------------------------------------------------------------------------------------
| Id  | Operation                      | Name        | Rows | Bytes | Cost | Time     |
---------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT               |             |   95 | 18620 |    8 | 00:00:01 |
| * 1 |   HASH JOIN                    |             |   95 | 18620 |    8 | 00:00:01 |
|   2 |    TABLE ACCESS BY INDEX ROWID | DW0810      |   95 |  9310 |    4 | 00:00:01 |
| * 3 |     INDEX RANGE SCAN           | IDX_DW0810  |   95 |       |    2 | 00:00:01 |
|   4 |    TABLE ACCESS BY INDEX ROWID | DW0810A     |   95 |  9310 |    4 | 00:00:01 |
| * 5 |     INDEX RANGE SCAN           | IDX_DW0810A |   95 |       |    2 | 00:00:01 |
---------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
------------------------------------------
* 1 - access("A"."OBJECT_ID"="B"."OBJECT_ID")
* 3 - access("A"."OBJECT_ID"<100)
* 5 - access("B"."OBJECT_ID"<100)


左连接时,如果能够对两表进行高效索引过滤时,最高效,如果主表无索引过滤,其结果集较小时,也可能使用上nl 让被驱动表走上索引,进行高效访问。左连接还有其他方式的表关联方式,就不多展示了。面对生产实际情况还是要多尝试,多总结,就如远邦老师说的,通过量化思维的方式,进行优化!

最后修改时间:2023-08-11 19:32:39
「喜欢这篇文章,您的关注和赞赏是给作者最好的鼓励」
关注作者
【版权声明】本文为墨天轮用户原创内容,转载时必须标注文章的来源(墨天轮),文章链接,文章作者等基本信息,否则作者和墨天轮有权追究责任。如果您发现墨天轮中有涉嫌抄袭或者侵权的内容,欢迎发送邮件至:contact@modb.pro进行举报,并提供相关证据,一经查实,墨天轮将立刻删除相关内容。

评论