timestamp/date类型字段,谓词越界后,估算值有一定的容忍度
在目标列上没有直方图时,用目标列做范围查询的可选择率计算公式
1)VAL 大于 HIGH_VALUE,并且 VAL 小于 (2HIGH_VALUE-LOW_VALUE):
Selectivity = (1/NUM_DISTINCT)NULL_Adjust(1-((VAL–HIGH_VALUE)/(HIGH_VALUE–LOW_VALUE)))
2)VAL 小于 LOW_VALUE,并且 VAL 大于 (2LOW_VALUE-HIGH_VALUE):
Selectivity = (1/NUM_DISTINCT)NULL_Adjust(1-((LOW_VALUE–VAL)/(HIGH_VALUE–LOW_VALUE)))
–查看表情况
SQL> set linesize 300 pagesize 1000
SQL> select count(*),count(distinct s.createdate) from BS_SYSTEMLOG s;

select count(*) from BS_SYSTEMLOG s where s.createdate is null;

SQL> COLUMN low_value FORMAT A19
SQL> COLUMN high_value FORMAT A19
SQL> alter session set NLS_DATE_FORMAT=‘YYYY-MM-DD HH24:MI:SS’;
SQL> select low_value,high_value,num_distinct,num_nulls,LAST_ANALYZED from dba_tab_col_statistics where table_name=‘BS_SYSTEMLOG’ and column_name=‘CREATEDATE’;

–查询日期最大/最小转换值
set serveroutput on
DECLARE
rv RAW(32) := ‘7878081E131F10’;
dt DATE := NULL;
BEGIN
dbms_stats.convert_raw_value(rv, dt);
dbms_output.put_line('HIGH_DATE: '||TO_CHAR(dt, ‘YYYY-MM-DD hh24:mi:ss’));
END;
/
HIGH_DATE: 2020-08-30 18:30:15
set serveroutput on
DECLARE
rv RAW(32) := ‘78740113161720’;
dt DATE := NULL;
BEGIN
dbms_stats.convert_raw_value(rv, dt);
dbms_output.put_line('HIGH_DATE: '||TO_CHAR(dt, ‘YYYY-MM-DD hh24:mi:ss’));
END;
/
HIGH_DATE: 2016-01-19 21:22:31
SQL> alter session set NLS_DATE_FORMAT=‘YYYY-MM-DD HH24:MI:SS’;
SQL> select t.TABLE_NAME,t.NUM_ROWS,t.BLOCKS,t.LAST_ANALYZED from user_tables t where table_name=‘BS_SYSTEMLOG’;

–综上所述
NUM_ROWS:229754 #以user_tables查到为准
NUM_NULLS:0
NUM_DISTINCT:71200
HIGH_VALUE:1598783415
LOW_VALUE:1453209751
HIGH_VALUE 和 LOW_VALUE 都是根据上面匿名块查到的日期转换成秒的值:
https://tool.lu/timestamp/

容忍度范围为:
(2HIGH_VALUE-LOW_VALUE)=(21598783415-1453209751)=1744357079=2025-04-11 15:37:59 2020-08-30 18:30:15<VAL<2025-04-11 15:37:59
(2LOW_VALUE-HIGH_VALUE)=(21453209751-1598783415)=1307636087=2011-06-10 00:14:47 2011-06-10 00:14:47<VAL<2016-01-19 21:22:31
SQL> set autotrace traceonly
SQL> select count(*) from BS_SYSTEMLOG where CREATEDATE>to_date(‘2020-10-01’,‘yyyy-mm-dd’);

Rows = NUM_ROWS * Selectivity
= NUM_ROWS * (1 / NUM_DISTINCT) * NULL_Adjust * (1 - ((VAL – HIGH_VALUE) / (HIGH_VALUE – LOW_VALUE)))
= NUM_ROWS * (1 / NUM_DISTINCT) * ((NUM_ROWS – NUM_NULLS) / NUM_ROWS) * (1 - ((VAL – HIGH_VALUE) / (HIGH_VALUE – LOW_VALUE)))
= round(229754*(1/71200)1(1-(1601481600-1598783415)/(1598783415-1453209751)))
= 3
SQL> select count(*) from BS_SYSTEMLOG where CREATEDATE<to_date(‘2013-04-12’,‘yyyy-mm-dd’);

Rows = NUM_ROWS * Selectivity
= NUM_ROWS * (1 / NUM_DISTINCT) * NULL_Adjust * (1 - ((VAL – HIGH_VALUE) / (HIGH_VALUE – LOW_VALUE)))
= NUM_ROWS * (1 / NUM_DISTINCT) * ((NUM_ROWS – NUM_NULLS) / NUM_ROWS) * (1 - ((LOW_VALUE – VAL) / (HIGH_VALUE – LOW_VALUE)))
= round(229754*(1/71200)1(1-(1453209751-1365696000)/(1598783415-1453209751)))
= 1
上面是把date/timestamp字段,统计信息里的最大值和最小值转换成秒进行的计算。如果有更便捷,或不当之处,请指正。




