暂无图片
暂无图片
暂无图片
暂无图片
暂无图片

Oracle 某行XXX系统SQL优化案例(一)

IT小Chen 2021-04-13
530

问题说明:

    数据库迁移到新服务器后,某一张视图查询缓慢,原库查询耗时3分钟,新库查询5小时以上。

    环境说明:

    新库:

    DB:Oracle 11.2.0.4.0 RAC

    OS:Redhat 7.6

    原库:

    DB:Oracle 11.2.0.4.0 单机

    OS:AIX 5.1

    问题分析:

      处理此类问题,常见思路如下:
      1 分别对比原库和新库该SQL对应的执行计划。
      2 对比两个执行计划,找到差异部分或最耗时的部分。
      3 分析执行计划差异原因,根据原因尝试解决问题。

      次案例视图引用的SQL使用了28个union all组成,关联了30多张表。

      在执行如下SQL时,新库会卡住很久,超过5小时。

        select count(*) from v_cjc_01;

        视图内SQL类似如下:

          create or replace view v_cjc_01 as
          select col1,col2,col3,...col10
          from t1
          where xxx=yyy
          union all
          select col1,col2,col3,...col10
          from t2
          where zzz=yyy
          union all
          ......
          union all
          select col1,col2,col3,...col10
          from t3
          where xxx=yyy and col2 not in (select col2 from t3 a where col1=aaa col3=bbb)
          union all
          select col1,col2,col3,...col10
          from t5
          where ttt=yyy;

          查看执行计划,发现主要有两处不同

          一:

          原执行计划 

            |* 40 |     HASH JOIN                  |                             | 72408 |  2616K|  2264K| 28026   (1)| 00:05:37 |
            | 41 | VIEW | VW_JF_SET$5EF83EA7 | 72408 | 1414K| | 3028 (1)| 00:00:37 |
            | 42 | UNION-ALL | | | | | | |
            |* 43 | TABLE ACCESS FULL | CJC_T1_00000000000000000 | 72407 | 2050K| | 1514 (1)| 00:00:19 |
            |* 44 | TABLE ACCESS FULL | CJC_T1_00000000000000000 | 1 | 29 | | 1514 (1)| 00:00:19 |
            | 45 | INDEX FAST FULL SCAN | IDX_MB_ACCINF_ACCNO | 9179K| 148M| | 12212 (1)| 00:02:27 |
            ----------------------------------------------------------------------------------------------------------------------

            新执行计划

              |  13 |     NESTED LOOPS               |                             |   236 |  9676 |  2003   (1)| 00:00:25 |
              |* 14 | TABLE ACCESS FULL | CJC_T1_00000000000000000 | 235 | 5640 | 1533 (1)| 00:00:19 |
              |* 15 | INDEX RANGE SCAN | IDX_MB_ACCINF_ACCNO | 1 | 17 | 2 (0)| 00:00:01 |
              | 16 | NESTED LOOPS | | 1 | 41 | 1535 (1)| 00:00:19 |
              |* 17 | TABLE ACCESS FULL | CJC_T1_00000000000000000 | 1 | 24 | 1533 (1)| 00:00:19 |
              |* 18 | INDEX RANGE SCAN | IDX_MB_ACCINF_ACCNO | 1 | 17 | 2 (0)| 00:00:01 |

              二:

              原执行计划

                |* 16 |     HASH JOIN RIGHT ANTI NA    |                             |  6647 |  1174K|    12M| 26593   (1)| 00:05:20 |
                |* 17 | TABLE ACCESS FULL | CJC_T2_00000000000000000 | 130K| 10M| | 11309 (1)| 00:02:16 |
                |* 18 | TABLE ACCESS FULL | CJC_T2_00000000000000000 | 664K| 60M| | 11309 (1)| 00:02:16 |

                新执行计划

                  |* 21 |     FILTER                     |                             |       |       |            |          |
                  |* 22 | TABLE ACCESS FULL | CJC_T2_00000000000000000 | 4928 | 202K| 11126 (1)| 00:02:14 |
                  |* 23 | TABLE ACCESS FULL | CJC_T2_00000000000000000 | 1 | 27 | 11124 (1)| 00:02:14 |

                    根据第一处不同,可以看到:
                    原执行计划,出现了VW_JF_SET$5EF83EA7,说明原库使用到了Join Factorization特性,
                    在该特性下,union all连接各个分支查询SQL时,会将各个分支部分中公共部分单独提出作为一个单独的结果集,
                    然后在和原union all剩下的部分做关联,从而减少了公共部分表重复扫描的问题,性能会有提升。

                    为什么新库没有使用到Join Factorization特性呢?

                      对比两个库的_OPTIMIZER_JOIN_FACTORIZATION参数,都是使用的默认值true
                      说明优化器认为新库不适合使用Join Factorization特性。
                      但是从执行计划看两种表数据量都比较大,结果集也很大,新库使用的NESTED LOOPS明显比原库HASH JOIN性能差。

                      可以通过10053事件对比查看原库和新库对同一SQL是如何进行改写,如何计算cost的?

                      原库和新库分别执行:

                        alter session set tracefile_identifier='10053';
                        ALTER SESSION SET EVENTS='10053 trace name context forever, level 1';
                        select count(*) from v_cjc_01;
                        ALTER SESSION SET EVENTS '10053 trace name context off';
                        select value from v$diag_info where name='Default Trace File';

                        查看生成的trace文件:

                          搜索关键字Final query after transformations

                          新库SQL改写部分:

                            UNION ALL
                            (SELECT '0' "KERNAL_INTER4"
                            FROM "CHEN"."CJC_T1_00000000000000000" "A",
                            CHENPMB."MB_ACCINF" "B"
                            WHERE "B"."AIF_ACCNO" = "A"."CJC_T3_000000"
                            AND "A"."MPB_BUSINESSTYPE" = '10'
                            AND "A"."MPB_TRANTIME" IS NOT NULL
                            AND LENGTH("A"."MPB_TRANTIME") = 14
                            AND "A"."MPB_TRANTIME" > '20200501000000') UNION ALL
                            (SELECT '0' "KERNAL_INTER4"
                            FROM "CHEN"."CJC_T1_00000000000000000" "A",
                            CHENPMB."MB_ACCINF" "B"
                            WHERE "B"."AIF_ACCNO" = "A"."CJC_T3_000000"
                            AND "A"."MPB_BUSINESSTYPE" = '11'
                            AND "A"."MPB_TRANTIME" IS NOT NULL
                            AND LENGTH("A"."MPB_TRANTIME") = 14
                            AND "A"."MPB_TRANTIME" > '20200501000000')

                            原库SQL改写部分:

                              UNION ALL
                              (SELECT '0' "KERNAL_INTER4"
                              FROM CHENPMB."EA_EACCT_INFO_S" "EA_EACCT_INFO_S"
                              WHERE "EA_EACCT_INFO_S"."EA_ACCOUNT_OPEN_RESULT" = '1') UNION ALL
                              (SELECT "VW_JF_SET$5EF83EA7"."ITEM_2" "KERNAL_INTER4"
                              FROM ((SELECT "A"."CJC_T3_000000" "ITEM_1", '0' "ITEM_2"
                              FROM "CHEN"."CJC_T1_00000000000000000" "A"
                              WHERE "A"."MPB_TRANTIME" > '20200501000000'
                              AND LENGTH("A"."MPB_TRANTIME") = 14
                              AND "A"."MPB_TRANTIME" IS NOT NULL
                              AND "A"."MPB_BUSINESSTYPE" = '10') UNION ALL
                              (SELECT "A"."CJC_T3_000000" "ITEM_1", '0' "ITEM_2"
                              FROM "CHEN"."CJC_T1_00000000000000000" "A"
                              WHERE "A"."MPB_TRANTIME" > '20200501000000'
                              AND LENGTH("A"."MPB_TRANTIME") = 14
                              AND "A"."MPB_TRANTIME" IS NOT NULL
                              AND "A"."MPB_BUSINESSTYPE" = '11')) "VW_JF_SET$5EF83EA7",
                              CHENPMB."MB_ACCINF" "B"
                              WHERE "B"."AIF_ACCNO" = "VW_JF_SET$5EF83EA7"."ITEM_1")) "MB_YINGXIAO_COUNT1"

                              尝试按照原库10053转换的SQL改写新库SQL后,执行速度恢复正常,但是SQL改动量较大。

                              根据第二处不同,可以看到:

                              改写前原SQL部分:

                                select CUSTOMERID as CSTNO,
                                TRANTIME as TRANTIME,
                                '10' as TRANTYPE,
                                to_date(TRANTIME, 'yyyymmddhh24miss') as TRANTIME1,
                                createtime,
                                1 as cnt,
                                0 as amt,
                                '0' as kernal_inter1,
                                '0' as kernal_inter2,
                                '0' as kernal_inter3,
                                '0' as kernal_inter4
                                from CJC_T2_00000000000000000 c
                                 where c.tranname = 'XX绑定'
                                and c.retcode = '0000'
                                and c.TRANTIME is not null
                                and length(c.TRANTIME) = 14
                                and c.TRANTIME > '20200501000000'
                                and c.medicalcardno not in
                                (select a.medicalcardno
                                from CJC_T2_00000000000000000 a
                                         where a.tranname = 'XX解绑'
                                and a.retcode = '0000')

                                新库SQL改写部分: 

                                  SELECT '0' "KERNAL_INTER4"
                                  FROM CHENPMB."CJC_T2_00000000000000000" "A",
                                  CHENPMB."CJC_T2_00000000000000000" "C"
                                            WHERE "C"."TRANNAME" = 'XX绑定'
                                  AND "C"."RETCODE" = '0000'
                                  AND "C"."TRANTIME" IS NOT NULL
                                  AND LENGTH("C"."TRANTIME") = 14
                                  AND "C"."TRANTIME" > '20200501000000'
                                  AND "C"."MEDICALCARDNO" = "A"."MEDICALCARDNO"
                                              AND "A"."TRANNAME" = 'XX解绑'
                                  AND "A"."RETCODE" = '0000'

                                  原库SQL改写部分:

                                    (SELECT '0' "KERNAL_INTER4"
                                    FROM CHENPMB."CJC_T2_00000000000000000" "C"
                                              WHERE "C"."TRANNAME" = 'XX绑定'
                                    AND "C"."RETCODE" = '0000'
                                    AND "C"."TRANTIME" IS NOT NULL
                                    AND LENGTH("C"."TRANTIME") = 14
                                    AND "C"."TRANTIME" > '20200501000000'
                                    AND NOT EXISTS
                                    (SELECT 0
                                    FROM CHENPMB."CJC_T2_00000000000000000" "A"
                                    WHERE LNNVL("A"."MEDICALCARDNO" <> "C"."MEDICALCARDNO")
                                                        AND "A"."TRANNAME" = 'XX解绑'
                                    AND "A"."RETCODE" = '0000'))

                                    根据执行计划可知,

                                      原库将not in改写成NOT EXISTS,表关联走的 HASH JOIN RIGHT ANTI NA,效率较高。
                                      尝试将新库SQL中not in改写成NOT EXISTS后,效率有明显提升,不到5分钟可以执行完成。

                                      SQL是否还有优化空间呢?

                                      可以看看Oracle给出的建议sql tuning advisor:

                                        SQL>
                                        DECLARE
                                        my_task_name VARCHAR2(30);
                                        BEGIN
                                        my_task_name := DBMS_SQLTUNE.CREATE_TUNING_TASK(sql_id => 'gzjy5nh7g0sfs',
                                        scope => 'COMPREHENSIVE',
                                        time_limit => 3600,
                                        task_name => 'cjc_sql_tuing_task',
                                        description => 'SQL TUNE TEST');
                                        DBMS_SQLTUNE.EXECUTE_TUNING_TASK(task_name => 'cjc_sql_tuing_task');
                                        END;
                                        /

                                          SELECT status FROM  DBA_ADVISOR_TASKS WHERE  task_name = 'cjc_sql_tuing_task';
                                          SET LONG 10000
                                          SET LONGCHUNKSIZE 1000
                                          SET LINESIZE 300
                                          SELECT DBMS_SQLTUNE.REPORT_TUNING_TASK('cjc_sql_tuing_task') FROM DUAL;

                                          建议如下:

                                          1- SQL Profile Finding (see explain plans section below)

                                            --------------------------------------------------------
                                            为此语句找到了性能更好的执行计划 2。选择以下 SQL 概要文件之一进行实施。
                                            Recommendation (estimated benefit: 95.14%)
                                            ------------------------------------------
                                            - 考虑接受推荐的 SQL 概要文件。
                                            execute dbms_sqltune.accept_sql_profile(task_name =>
                                            'cjc_sql_tuing_task', task_owner => 'CHENPMB', replace => TRUE);


                                            Recommendation (estimated benefit: 99.95%)
                                            ------------------------------------------
                                            - 考虑接受建议的 SQL 概要文件, 以便对此语句使用并行执行。
                                            execute dbms_sqltune.accept_sql_profile(task_name =>
                                            'cjc_sql_tuing_task', task_owner => 'CHENPMB', replace => TRUE,
                                            profile_type => DBMS_SQLTUNE.PX_PROFILE);


                                            与 DOP 128 并行执行此查询会使 SQL 概要文件计划上的响应时间缩短 99.13%。但是, 启用并行执行时要付出一些代价。它将增加语句的资源消耗
                                            (预计为 11.07%), 这会导致系统吞吐量降低。此外, 由于在非常短的持续时间内消耗了这些资源, 因此如果没有足够可用的硬件容量,
                                            并发语句的响应时间将受到负面影响。

                                            2- Index Finding (see explain plans section below)

                                              --------------------------------------------------
                                              通过创建一个或多个索引可以改进此语句的执行计划。


                                              Recommendation (estimated benefit: 99.7%)
                                              -----------------------------------------
                                              - 考虑运行可以改进物理方案设计的访问指导或者创建推荐的索引。
                                              create index CHENPMB.IDX$$_04000001 on
                                              CHENPMB.MB_PAY_LOG("MPL_TRANCODE","MPL_STT",LENGTH("MPL_TIME"),"MPL_TIME");


                                              - 考虑运行可以改进物理方案设计的访问指导或者创建推荐的索引。
                                              create index CHENPMB.IDX$$_04000002 on
                                              CHENPMB.MB_FINANCE_TRANS_INCREMENT("MFT_TRAN_STATUS",LENGTH("MFT_TRAN_TIME"
                                              ),"MFT_TRAN_CODE","MFT_TRAN_TIME");


                                              - 考虑运行可以改进物理方案设计的访问指导或者创建推荐的索引。
                                              create index CHEN.IDX$$_04000003 on CHEN.CJC_T1_00000000000000000("MPB_BUSI
                                              NESSTYPE",LENGTH("MPB_TRANTIME"),"MPB_TRANTIME","CJC_T3_000000");


                                              - 考虑运行可以改进物理方案设计的访问指导或者创建推荐的索引。
                                              create index CHENPMB.IDX$$_04000004 on
                                              CHENPMB.MB_LOANBYFINANCE_RECORD("MLR_TRANCODE","MLR_LOANTYPEFLAG","MLR_STAT
                                              US",LENGTH("MLR_TRANTIME"),"MLR_TRANTIME");


                                              - 考虑运行可以改进物理方案设计的访问指导或者创建推荐的索引。
                                              create index CHENPMB.IDX$$_04000006 on
                                              CHENPMB.MB_TRANFLOW_INCREMENT("TRF_BSNCODE","TRF_STT",LENGTH("TRF_SUBTIME")
                                              ,"TRF_SUBTIME","TRF_FLOWNO");


                                              Rationale
                                              ---------
                                              创建推荐的索引可以显著地改进此语句的执行计划。但是, 使用典型的 SQL 工作量运行 "访问指导"
                                              可能比单个语句更可取。通过这种方法可以获得全面的索引建议案, 包括计算索引维护的开销和附加的空间消耗。

                                              最终建议:

                                                1 将SQL中not in部分改写成not exists
                                                2 创建组合索引。

                                                优化效果:

                                                  SQL由原来5小时查不出结果,优化到12秒完成查询。

                                                  ###2021-03-31 22:00 chenjuchao###



                                                  文章转载自IT小Chen,如果涉嫌侵权,请发送邮件至:contact@modb.pro进行举报,并提供相关证据,一经查实,墨天轮将立刻删除相关内容。

                                                  评论