Oracle 论基数偏差

ASKTOM 2020-06-11

504

问题描述

团队:

请参见下面的测试用例，当我们对谓词中的值进行硬编码时，估计基数的优化器是正确的。

但是，当我们使用另一个表中的值进行联接时，基数的估计值与实际值有很大差异

问题:
1) 为什么我们加入时基数会偏离？优化器不能提取t2.start _ time和t2.end _ time列统计信息并优化sql产生正确的基数以执行吗？(询问的原因是我们在应用程序中获得了类似的sql，其中估计的基数变得具有误导性，并且优化器正在获取不正确的索引以执行)

2) 我可以在此处为优化器提供哪些其他信息，以提供以下联接的正确基数？

drop table t1 purge;
drop table t2 purge;

create table t1 as select * from all_objects;
create index t1_idx on t1( created );
create table t2 ( start_time date, end_time date );
insert into t2 values( to_date('13-nov-2019','dd-mon-yyyy'), to_date('13-nov-2019 22:00','dd-mon-yyyy hh24:mi'));
commit;

exec dbms_stats.gather_table_stats(user,'T2');
exec dbms_stats.gather_table_stats(user,'T1',method_opt=>'for columns created size 2048');

demo@PDB1> set autotrace traceonly explain statistics
demo@PDB1> select *
  2  from t1
  3  where created between to_date('13-nov-2019','dd-mon-yyyy')
  4  and to_date('13-nov-2019 22:00','dd-mon-yyyy hh24:mi');


Execution Plan
----------------------------------------------------------
Plan hash value: 1775246573

----------------------------------------------------------------------------------------------
| Id  | Operation                           | Name   | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                    |        |     1 |   135 |     3   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID BATCHED| T1     |     1 |   135 |     3   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN                  | T1_IDX |     1 |       |     2   (0)| 00:00:01 |
----------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("CREATED">=TO_DATE(' 2019-11-13 00:00:00', 'syyyy-mm-dd hh24:mi:ss')
              AND "CREATED"<=TO_DATE(' 2019-11-13 22:00:00', 'syyyy-mm-dd hh24:mi:ss'))


Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
          4  consistent gets
          0  physical reads
          0  redo size
       2472  bytes sent via SQL*Net to client
        608  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed

demo@PDB1> select *
  2  from t1, t2
  3  where t1.created between t2.start_time and t2.end_time;


Execution Plan
----------------------------------------------------------
Plan hash value: 2010545385

---------------------------------------------------------------------------------------
| Id  | Operation                    | Name   | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |        | 10060 |  1483K|     6   (0)| 00:00:01 |
|   1 |  NESTED LOOPS                |        | 10060 |  1483K|     6   (0)| 00:00:01 |
|   2 |   NESTED LOOPS               |        | 10060 |  1483K|     6   (0)| 00:00:01 |
|   3 |    TABLE ACCESS FULL         | T2     |     1 |    16 |     3   (0)| 00:00:01 |
|*  4 |    INDEX RANGE SCAN          | T1_IDX |    92 |       |     1   (0)| 00:00:01 |
|   5 |   TABLE ACCESS BY INDEX ROWID| T1     | 10060 |  1326K|     3   (0)| 00:00:01 |
---------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   4 - access("T1"."CREATED">="T2"."START_TIME" AND
              "T1"."CREATED"<="T2"."END_TIME")


Statistics
----------------------------------------------------------
         59  recursive calls
          0  db block gets
         71  consistent gets
          0  physical reads
          0  redo size
       2630  bytes sent via SQL*Net to client
        608  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          5  sorts (memory)
          0  sorts (disk)
          1  rows processed

demo@PDB1>

专家解答

联接 (如上面的联接) 更类似于绑定变量成本计算，即，我们事先不知道传入的 “参数” 的值。你可以看到成本与绑定相当

SQL> drop table t1 purge;

Table dropped.

SQL> drop table t2 purge;

Table dropped.

SQL>
SQL> create table t1 as select * from dba_objects;

Table created.

SQL> create index t1_idx on t1( created );

Index created.

SQL> select max(created) from t1;

MAX(CREAT
---------
12-JUN-20

1 row selected.

SQL> select min(created) from t1;

MIN(CREAT
---------
30-MAY-19

1 row selected.

SQL>
SQL> create table t2 ( start_time date, end_time date );

Table created.

SQL> insert into t2 values( to_date('26-may-2020','dd-mon-yyyy'), to_date('26-may-2020 22:00','dd-mon-yyyy hh24:mi'));

1 row created.

SQL> commit;

Commit complete.

SQL>
SQL> exec dbms_stats.gather_table_stats(user,'T2');

PL/SQL procedure successfully completed.

SQL> exec dbms_stats.gather_table_stats(user,'T1',method_opt=>'for columns created size 2048');

PL/SQL procedure successfully completed.

SQL>
SQL> set autotrace traceonly explain
SQL> select *
  2      from t1
  3      where created between to_date('26-may-2020','dd-mon-yyyy')
  4      and to_date('26-may-2020 22:00','dd-mon-yyyy hh24:mi');

Execution Plan
----------------------------------------------------------
Plan hash value: 1775246573

----------------------------------------------------------------------------------------------
| Id  | Operation                           | Name   | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                    |        |     5 |   660 |     3   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID BATCHED| T1     |     5 |   660 |     3   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN                  | T1_IDX |     5 |       |     2   (0)| 00:00:01 |
----------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("CREATED">=TO_DATE(' 2020-05-26 00:00:00', 'syyyy-mm-dd hh24:mi:ss')
              AND "CREATED"<=TO_DATE(' 2020-05-26 22:00:00', 'syyyy-mm-dd hh24:mi:ss'))

SQL>
SQL> select *
  2      from t1, t2
  3      where t1.created between t2.start_time and t2.end_time;

Execution Plan
----------------------------------------------------------
Plan hash value: 2010545385

---------------------------------------------------------------------------------------
| Id  | Operation                    | Name   | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |        |  3665 |   529K|    10   (0)| 00:00:01 |
|   1 |  NESTED LOOPS                |        |  3665 |   529K|    10   (0)| 00:00:01 |
|   2 |   NESTED LOOPS               |        |  3665 |   529K|    10   (0)| 00:00:01 |
|   3 |    TABLE ACCESS FULL         | T2     |     1 |    16 |     3   (0)| 00:00:01 |
|*  4 |    INDEX RANGE SCAN          | T1_IDX |   248 |       |     1   (0)| 00:00:01 |
|   5 |   TABLE ACCESS BY INDEX ROWID| T1     |  3665 |   472K|     7   (0)| 00:00:01 |
---------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   4 - access("T1"."CREATED">="T2"."START_TIME" AND
              "T1"."CREATED"<="T2"."END_TIME")

SQL>
SQL> set autotrace off
SQL>
SQL> explain plan for select *
  2      from t1
  3      where created between :1 and :2;

Explained.

SQL>
SQL>
SQL> select dbms_xplan.display_plan() from dual;

DBMS_XPLAN.DISPLAY_PLAN()
----------------------------------------------------------------------------------------------------
 Plan Hash Value  : 4096079515

------------------------------------------------------------------------------------------
| Id  | Operation                              | Name   | Rows | Bytes | Cost | Time     |
------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                       |        |  206 | 27192 |   10 | 00:00:01 |
| * 1 |   FILTER                               |        |      |       |      |          |
|   2 |    TABLE ACCESS BY INDEX ROWID BATCHED | T1     |  206 | 27192 |   10 | 00:00:01 |
| * 3 |     INDEX RANGE SCAN                   | T1_IDX |  370 |       |    2 | 00:00:01 |
------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
------------------------------------------
* 1 - filter(TO_DATE(:2)>=TO_DATE(:1))
* 3 - access("CREATED">=:1 AND "CREATED"<=:2)


1 row selected.

如果您希望优化器了解更多...然后，您通常必须要求它更深入地挖掘数据，例如


SQL>
SQL> set feedback only
SQL> select /*+ dynamic_sampling(11) gather_plan_statistics */ *
  2      from t1, t2
  3      where t1.created between t2.start_time and t2.end_time;

5 rows selected.

SQL> set feedback on
SQL> select * from table(dbms_xplan.display_cursor(null,null,'ALLSTATS LAST'));

PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------
SQL_ID  06f9basp2rfdt, child number 0
-------------------------------------
select /*+ dynamic_sampling(11) gather_plan_statistics */ *     from
t1, t2     where t1.created between t2.start_time and t2.end_time

Plan hash value: 2010545385

-------------------------------------------------------------------------------------------------
| Id  | Operation                    | Name   | Starts | E-Rows | A-Rows |   A-Time   | Buffers |
-------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |        |      1 |        |      5 |00:00:00.01 |      11 |
|   1 |  NESTED LOOPS                |        |      1 |      5 |      5 |00:00:00.01 |      11 |
|   2 |   NESTED LOOPS               |        |      1 |    248 |      5 |00:00:00.01 |       9 |
|   3 |    TABLE ACCESS FULL         | T2     |      1 |      1 |      1 |00:00:00.01 |       6 |
|*  4 |    INDEX RANGE SCAN          | T1_IDX |      1 |    248 |      5 |00:00:00.01 |       3 |
|   5 |   TABLE ACCESS BY INDEX ROWID| T1     |      5 |      5 |      5 |00:00:00.01 |       2 |
-------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   4 - access("T1"."CREATED">="T2"."START_TIME" AND "T1"."CREATED"<="T2"."END_TIME")

Note
-----
   - dynamic statistics used: dynamic sampling (level=0)


27 rows selected.

SQL>
SQL>
SQL>

oracle asktom

文章转载自ASKTOM，如果涉嫌侵权，请发送邮件至：contact@modb.pro进行举报，并提供相关证据，一经查实，墨天轮将立刻删除相关内容。

Oracle 论基数偏差

问题描述

专家解答

评论