原文作者:Jonathan lewis
原文地址:https://jonathanlewis.wordpress.com/2020/04/23/date_to_date/
date_to_date
— Jonathan Lewis @ 12:00 pm BST Apr 23,2020
有时会有人在Oracle开发者论坛上发布一条SQL语句,其中包含一个带有to_date(date_column)表达式的谓词。这是一个由各种原因导致的问题——不仅仅在几年前的(刚刚被重新提起的)一篇文章中出现的性能问题。
在我调查性能详情之前,这里有一个简单的“错误数据”问题的演示,它可能被忽视,从一个SQL*Plus的12.2.0.1会话中剪切粘贴:
SQL> create table t1 (d1 date);
Table created.
SQL> insert into t1 values(sysdate);
1 row created.
SQL> select * from t1 where d1 = to_date(d1);
no rows selected
SQL> alter session set nls_date_format = 'dd-mon-yyyy hh24:mi:ss';
Session altered.
SQL> select * from t1 where d1 = to_date(d1);
D1
--------------------
22-apr-2020 15:12:36
1 row selected.
请特别注意,更改nls_date_format会如何更改查询结果!(在引用的博客注释中还有另一个简单的例子。)
这是怎么回事呢?to_date(date_col)等价于to_date(to_char(date_col)),它使用nls_date_format来完成这两个转换,最常见的默认格式是将date列截断为date-only。因此,在许多系统to_date(date_col)几乎与trunc(date_col)一样(尽管这样做的代价很高)。
但让我们更进一步,看看我们如何破坏索引带来的好处,即使我们已经确保我们仍然得到正确的结果。以下是一个最低限度的通用计费需求模型:货币之间的转换:
rem
rem Script: date_to_date.sql
rem Author: Jonathan Lewis
rem Dated: Apr 2020
rem
create table t1
as
with generator as (
select
rownum id
from dual
connect by
level <= 1e4 -- > comment to avoid WordPress format issue
)
select
rownum id,
'GBP' from_currency,
'USD' to_currency,
trunc(sysdate - 1000) + rownum conversion_date,
'Corporate' conversion_type,
round(
1.25 + dbms_random.value/4,
6
) conversion_rate,
lpad(rownum,10,'0') v1,
lpad('x',100,'x') padding
from
generator v1
where
rownum <= 1000 -- > comment to avoid WordPress format issue
;
create unique index t1_i1 on t1(from_currency, to_currency, conversion_date, conversion_type)
/
alter table t1 add constraint t1_pk primary key(
from_currency, to_currency, conversion_date, conversion_type
)
/
create table driver(
invoice_currency varchar2(3),
billing_currency varchar2(3),
client_type varchar2(10),
invoice_date date
);
insert into driver values(
'GBP', 'USD','Corporate',trunc(sysdate)
);
commit;
execute dbms_stats.gather_table_stats(null,'driver');
我创建了表t1来模拟以“今天”结束的大约三年时间内美元和英镑之间的汇率。在这个表的唯一键中还有一个conversion_type列,它允许我们有多个原因进行交换,允许在同一天使用多个汇率。我确实应该在这个表上有一个类似于check (conversion_date = trunc(conversion_date))的检查约束。
我还创建了一个“driver”表,其中包含的数据可能正是提取单个发票的汇率所需的数据。因此,让我们运行SQL来获得这张发票的适当汇率:
set serveroutput off
alter session set statistics_level = all;
prompt =====================================
prompt First run with simple date comparison
prompt =====================================
select
/*+
leading(driver t1)
use_nl_with_index(t1)
*/
driver.*,
t1.conversion_rate
from
driver, t1
where
t1.from_currency = driver.invoice_currency
and t1.to_currency = driver.billing_currency
and t1.conversion_type = driver.client_type
and t1.conversion_date = driver.invoice_date
;
select * from table(dbms_xplan.display_cursor(null,null,'allstats last'));
prompt =================================
prompt Now run with to_date(date_column)
prompt =================================
select
/*+
leading(driver t1)
use_nl_with_index(t1)
*/
driver.*,
t1.conversion_rate
from
driver, t1
where
t1.from_currency = driver.invoice_currency
and t1.to_currency = driver.billing_currency
and t1.conversion_type = driver.client_type
and to_date(t1.conversion_date) = to_date(driver.invoice_date)
;
select * from table(dbms_xplan.display_cursor(null,null,'allstats last'));
在第一种情况中,我在这些表之间使用了正确的连接谓词;在第二步中,我在谓词的两端放入了一个冗余的to_date()函数调用。(如果您认为这是不现实的—它与我在上面引用的博客注释中报告的生产代码完全匹配)。
以下是两种执行计划——以及它们的执行消耗统计数据:
----------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers | Reads |
----------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 1 |00:00:00.01 | 10 | 8 |
| 1 | NESTED LOOPS | | 1 | 1 | 1 |00:00:00.01 | 10 | 8 |
| 2 | NESTED LOOPS | | 1 | 1 | 1 |00:00:00.01 | 9 | 8 |
| 3 | TABLE ACCESS FULL | DRIVER | 1 | 1 | 1 |00:00:00.01 | 7 | 0 |
|* 4 | INDEX UNIQUE SCAN | T1_I1 | 1 | 1 | 1 |00:00:00.01 | 2 | 8 |
| 5 | TABLE ACCESS BY INDEX ROWID| T1 | 1 | 1 | 1 |00:00:00.01 | 1 | 0 |
----------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
4 - access("T1"."FROM_CURRENCY"="DRIVER"."INVOICE_CURRENCY" AND
"T1"."TO_CURRENCY"="DRIVER"."BILLING_CURRENCY" AND "T1"."CONVERSION_DATE"="DRIVER"."INVOICE_DATE"
AND "T1"."CONVERSION_TYPE"="DRIVER"."CLIENT_TYPE")
----------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers | Reads |
----------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 1 |00:00:00.01 | 15 | 4 |
| 1 | NESTED LOOPS | | 1 | 1 | 1 |00:00:00.01 | 15 | 4 |
| 2 | NESTED LOOPS | | 1 | 1 | 1 |00:00:00.01 | 14 | 4 |
| 3 | TABLE ACCESS FULL | DRIVER | 1 | 1 | 1 |00:00:00.01 | 7 | 0 |
|* 4 | INDEX RANGE SCAN | T1_I1 | 1 | 1 | 1 |00:00:00.01 | 7 | 4 |
| 5 | TABLE ACCESS BY INDEX ROWID| T1 | 1 | 1 | 1 |00:00:00.01 | 1 | 0 |
----------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
4 - access("T1"."FROM_CURRENCY"="DRIVER"."INVOICE_CURRENCY" AND
"T1"."TO_CURRENCY"="DRIVER"."BILLING_CURRENCY" AND "T1"."CONVERSION_TYPE"="DRIVER"."CLIENT_TYPE")
filter((TO_DATE(INTERNAL_FUNCTION("T1"."CONVERSION_DATE"))=TO_DATE(INTERNAL_FUNCTION("DRIVE
R"."INVOICE_DATE")) AND "T1"."CONVERSION_TYPE"="DRIVER"."CLIENT_TYPE"))
有三件事需要注意:
1.操作4的index unique scan 已更改为index range scan。
2.操作4的谓词信息已经从一个纯access谓词更改为一个access谓词加上一个非常混乱的filter谓词
3.现在在操作4中出现的范围扫描获得7个buffer(一个根块和6个叶块)来找到我们需要的一个rowid,并将对所有1,000个索引条目应用凌乱的过滤谓词,其中两个货币代码为GBP/USD。
在上一篇文章中,有问题的查询几乎把所有时间都花在了像这样的连接上,执行59次逻辑读操作,为提交的每个发票找到一个rowid——可能在执行过程中进行了数千次比较。




