暂无图片
暂无图片
2
暂无图片
暂无图片
暂无图片

OB 分区表多分区扫描COST估算错误问题

原创 范计杰 2024-06-04
498

概述

某些场景下,OB 分区表分区表多分区扫描COST估算错误,导致生成低效执行计划,经过各种猜测并模拟测试,找到了原因,本文通过模拟重现并分析原因,为类似场景有个警示,避免发生严重的性能问题。

版本:V3.2.4 ORACLE租户

起因

在生产库上一条SQL的条件选择性很好,也有索引,统计信息也没问题,确走了全表扫描

select * from test.TESTTABX_HISTORY t where region=111 and TTTNUM=‘xxxxxxxxxxxxx’;

test.TESTTABX_HISTORY partition by range(“REGION”,“INTIME”)

表很大,TTTNUM列选择性很好,也有索引(第一列),为什么不走这个索引,走全表扫?


obclient [SYS]> explain extended
    -> select * from TEST.TESTTABX_history t where region=101 and TTTNUM='xxxxxxxxxxxxx'\G
*************************** 1. row ***************************
Query Plan: ====================================================
|ID|OPERATOR               |NAME    |EST. ROWS|COST|
----------------------------------------------------
|0 |PX COORDINATOR         |        |1        |50  |
|1 | EXCHANGE OUT DISTR    |:EX10000|1        |46  |
|2 |  PX PARTITION ITERATOR|        |1        |46  |
|3 |   TABLE SCAN          |T       |1        |46  |
====================================================

Outputs & filters: 
-------------------------------------
 ......
  3 - ......, 
      is_index_back=false, filter_before_indexback[false], 
      range_key([T.REGION(0x7f83d0b33f40)], [T.INTIME(0x7f83d0b34240)], [T.__pk_increment(0x7f83d0bae870)]), range(101,MIN,MIN ; 101,MAX,MAX), 
      range_cond([T.REGION(0x7f83d0b33f40) = 101(0x7f83d0b35f20)])

索引信息
+-------------+------------------+-------+--------------------------------+------------+-------------+-----------------------+
| TABLE_OWNER | TABLE_NAME       | OWNER | INDEX_NAME                     | UNIQUENESS | PARTITIONED | COLUMN_NAME           |
+-------------+------------------+-------+--------------------------------+------------+-------------+-----------------------+
| TEST        | TESTTABX_HISTORY | TEST  | INX_TESTTABX_HIS_TTTNUM        | NONUNIQUE  | YES         | TTTNUM,REGION         |
+-------------+------------------+-------+--------------------------------+------------+-------------+-----------------------+

进一步分区发现全表扫的COST很低,而且使用REGION=‘111’,有的一个值,COST很低。

select * from test.testtabx_history t where region=111\G
*************************** 1. row ***************************
Query Plan: ====================================================
|ID|OPERATOR               |NAME    |EST. ROWS|COST|
----------------------------------------------------
|0 |PX COORDINATOR         |        |1        |50  |
|1 | EXCHANGE OUT DISTR    |:EX10000|1        |46  |
|2 |  PX PARTITION ITERATOR|        |1        |46  |
|3 |   TABLE SCAN          |T       |1        |46  |
====================================================

Outputs & filters: 
-------------------------------------
  0 - .....
  1 - .....
  2 - .....
      force partition granule, asc.
  3 - ....., 
      is_index_back=false, 
      range_key([T.REGION(0x7fa7677f3740)], [T.INTIME(0x7fa7677f3a40)], [T.__pk_increment(0x7fa767867bd0)]), range(111,MIN,MIN ; 111,MAX,MAX), 
      range_cond([T.REGION(0x7fa7677f3740) = 101(0x7fa7677f5720)])

模拟重现

  1. 测试表准备
drop table  t11;
create table t11(region number,id1 VARCHAR2(30),c2 number,c3 VARCHAR2(30))
partition by range(region,C2)
(
partition p10_1 values less than(10,500000),
partition p10_2 values less than(10,1000000),
partition p20_1 values less than(20,500000),
partition p20_2 values less than(20,1000000),
partition p30_1 values less than(30,500000),
partition p30_2 values less than(30,1000000)
);
 
insert into t11 select 10,rownum,rownum,rownum from dual connect by rownum<=300000;
insert into t11 select 20,rownum,rownum,rownum from dual connect by rownum<=400000;
insert into t11 select 30,rownum,rownum,rownum from dual connect by rownum<=300000;

create index idx_t11_c1 on t11(id1,c2) local;

构造完数据库,发起一次合并,合并会收集统计信息。
alter system major freeze;
select * from __all_zone where name like '%merge%';




  1. 索引信息与统计信息

    [root@ocp ~]# eoba ind sys.t11
    Display indexes where table or index name matches sys.t11 ....
    +-------------+------------+------------+------+-------------+------+
    | TABLE_OWNER | TABLE_NAME | INDEX_NAME | POS# | COLUMN_NAME | DSC  |
    +-------------+------------+------------+------+-------------+------+
    | SYS         | T11        | IDX_T11_C1 |    1 | ID1         | NULL |
    | SYS         | T11        | IDX_T11_C1 |    2 | C2          | NULL |
    +-------------+------------+------------+------+-------------+------+
    +-------------+------------+------------+---------+------+--------+------+--------+------------+
    | INDEX_OWNER | TABLE_NAME | INDEX_NAME | IDXTYPE | UNIQ | STATUS | PART | DEGREE | VISIBILITY |
    +-------------+------------+------------+---------+------+--------+------+--------+------------+
    | SYS         | T11        | IDX_T11_C1 | NORMAL  | NO   | VALID  | YES  | 1      | VISIBLE    |
    +-------------+------------+------------+---------+------+--------+------+--------+------------+
    
    
    ---表分区统计信息
    select a.table_id,
           a.partition_id,
           b.part_name,
           a.svr_ip,
           a.row_count,
           a.data_size
      from __all_virtual_meta_table a, __all_virtual_part b
     where a.table_id = (select table_id
                           from __all_virtual_table
                          where table_name = 'T11')
       and role = 1
       and a.table_id = b.table_id
       and a.partition_id = b.part_id;
       
    +------------------+--------------+-----------+---------------+-----------+-----------+
    | table_id         | partition_id | part_name | svr_ip        | row_count | data_size |
    +------------------+--------------+-----------+---------------+-----------+-----------+
    | 1100611139454041 |            0 | P10_1     | 133.96.123.36 |    300000 |   3872803 |
    | 1100611139454041 |            1 | P10_2     | 133.96.123.36 |         0 |         0 |
    | 1100611139454041 |            2 | P20_1     | 133.96.123.36 |    400000 |   5196464 |
    | 1100611139454041 |            3 | P20_2     | 133.96.123.36 |         0 |         0 |
    | 1100611139454041 |            4 | P30_1     | 133.96.123.36 |    300000 |   3878592 |
    | 1100611139454041 |            5 | P30_2     | 133.96.123.36 |         0 |         0 |
    +------------------+--------------+-----------+---------------+-----------+-----------+
    6 rows in set (0.038 sec)
    
    MySQL [oceanbase]>
    
    
    
    
    ---列统计信息
    select /*+LEADING(((c b) a) t) use_nl(b) use_nl(a) use_nl(t) */
     b.table_name,
     t.partition_id,
     a.column_name,
     num_distinct,
     t.num_null,
     t.min_value,
     t.max_value
      from __all_virtual_column_statistic t,
           __all_virtual_column         a,
           __all_virtual_table          b,
           __all_virtual_database       c
     where t.tenant_id = a.tenant_id
       and t.table_id = a.table_id
       and t.column_id = a.column_id
       and a.tenant_id = b.tenant_id
       and a.table_id = b.table_id
       and b.tenant_id = c.tenant_id
       and b.database_id = c.database_id
       and c.database_name = 'SYS'
       and b.table_name = 'T11'
       and a.column_name in ('REGION','ID1');
    
    +------------+--------------+-------------+--------------+----------+------------------------------+------------------------------+
    | table_name | partition_id | column_name | num_distinct | num_null | min_value                    | max_value                    |
    +------------+--------------+-------------+--------------+----------+------------------------------+------------------------------+
    | T11        |            0 | REGION      |            1 |        0 | 0F053FFFC00000010A000000     | 0F053FFFC00000010A000000     |
    | T11        |            1 | REGION      |            0 |        0 | 19070000FDFFFFFFFFFFFFFFFF01 | 19070000FEFFFFFFFFFFFFFFFF01 |
    | T11        |            2 | REGION      |            1 |        0 | 0F053FFFC000000114000000     | 0F053FFFC000000114000000     |
    | T11        |            3 | REGION      |            0 |        0 | 19070000FDFFFFFFFFFFFFFFFF01 | 19070000FEFFFFFFFFFFFFFFFF01 |
    | T11        |            4 | REGION      |            1 |        0 | 0F053FFFC00000011E000000     | 0F053FFFC00000011E000000     |
    | T11        |            5 | REGION      |            0 |        0 | 19070000FDFFFFFFFFFFFFFFFF01 | 19070000FEFFFFFFFFFFFFFFFF01 |
    | T11        |            0 | ID1         |       312549 |        0 | 16022EFF013100               | 16022EFF05393939393900       |
    | T11        |            1 | ID1         |            0 |        0 | 19070000FDFFFFFFFFFFFFFFFF01 | 19070000FEFFFFFFFFFFFFFFFF01 |
    | T11        |            2 | ID1         |       400848 |        0 | 16022EFF013100               | 16022EFF05393939393900       |
    | T11        |            3 | ID1         |            0 |        0 | 19070000FDFFFFFFFFFFFFFFFF01 | 19070000FEFFFFFFFFFFFFFFFF01 |
    | T11        |            4 | ID1         |       312549 |        0 | 16022EFF013100               | 16022EFF05393939393900       |
    | T11        |            5 | ID1         |            0 |        0 | 19070000FDFFFFFFFFFFFFFFFF01 | 19070000FEFFFFFFFFFFFFFFFF01 |
    +------------+--------------+-------------+--------------+----------+------------------------------+------------------------------+
    12 rows in set (0.279 sec)
    
    MySQL [oceanbase]>
    
    
  2. 观察对比执行计划,可以看到会走全表描,COST很低

    测试SQL

    plan1 explain extended select * from t11 where region=10 and id1=‘10’\G

    plan2 explain extended select * from t11 where region=10\G

    plan3 explain extended select * from t11 where region=20\G

    plan1

obclient [SYS]> explain extended select * from t11 where region=10 and id1='10'\G
*************************** 1. row ***************************
Query Plan: ====================================================
|ID|OPERATOR               |NAME    |EST. ROWS|COST|
----------------------------------------------------
|0 |PX COORDINATOR         |        |1        |46  |
|1 | EXCHANGE OUT DISTR    |:EX10000|1        |46  |
|2 |  PX PARTITION ITERATOR|        |1        |46  |
|3 |   TABLE SCAN          |T11     |1        |46  |
====================================================

Outputs & filters:
-------------------------------------
  0 - output([INTERNAL_FUNCTION(T11.REGION(0x7fa00e65dc60), T11.ID1(0x7fa00e65fd90), T11.C2(0x7fa00e65df60), T11.C3(0x7fa00e660190))(0x7fa00e6cced0)]), filter(nil)
  1 - output([INTERNAL_FUNCTION(T11.REGION(0x7fa00e65dc60), T11.ID1(0x7fa00e65fd90), T11.C2(0x7fa00e65df60), T11.C3(0x7fa00e660190))(0x7fa00e6cced0)]), filter(nil), dop=1
  2 - output([T11.REGION(0x7fa00e65dc60)], [T11.C2(0x7fa00e65df60)], [T11.ID1(0x7fa00e65fd90)], [T11.C3(0x7fa00e660190)]), filter(nil),
      force partition granule, asc.
  3 - output([T11.REGION(0x7fa00e65dc60)], [T11.C2(0x7fa00e65df60)], [T11.ID1(0x7fa00e65fd90)], [T11.C3(0x7fa00e660190)]), filter([T11.ID1(0x7fa00e65fd90) = ?(0x7fa00e65f670)]),
      access([T11.REGION(0x7fa00e65dc60)], [T11.C2(0x7fa00e65df60)], [T11.ID1(0x7fa00e65fd90)], [T11.C3(0x7fa00e660190)]), partitions(p[0-2]),
      is_index_back=false, filter_before_indexback[false],
      range_key([T11.REGION(0x7fa00e65dc60)], [T11.C2(0x7fa00e65df60)], [T11.__pk_increment(0x7fa00e6b3530)]), range(10,MIN,MIN ; 10,MAX,MAX),
      range_cond([T11.REGION(0x7fa00e65dc60) = 10(0x7fa00e65e880)])

Used Hint:
-------------------------------------
  /*+
  */

Outline Data:
-------------------------------------
  /*+
      BEGIN_OUTLINE_DATA
      FULL(@"SEL$1" "SYS.T11"@"SEL$1")
      END_OUTLINE_DATA
  */

Plan Type:
-------------------------------------
DISTRIBUTED

Optimization Info:
-------------------------------------
T11:table_rows:700000, physical_range_rows:1, logical_range_rows:1, index_back_rows:0, output_rows:0, est_method:local_storage, optimization_method=cost_based, avaiable_index_name[IDX_T11_C1,T11], estimation info[table_id:1100611139454039, (table_type:1, version:0-1713750141509457-1713750141509457, logical_rc:0, physical_rc:0), (table_type:7, version:1713750131657425-1713750131657425-1713750158466912, logical_rc:0, physical_rc:0), (table_type:5, version:1713750131657425-1713750131657425-1713750158466912, logical_rc:0, physical_rc:0), (table_type:0, version:1713750158466912-1713750158466912-9223372036854775807, logical_rc:0, physical_rc:0)]


Parameters:
-------------------------------------
{obj:{"VARCHAR2":"10", collation:"utf8mb4_bin", coercibility:"INVALID"}, accuracy:{length:-1, precision:1, scale:-1}, flag:0, raw_text_pos:-1, raw_text_len:-1, param_meta:{type:"VARCHAR2", collation:"utf8mb4_bin", coercibility:"INVALID"}}

1 row in set (0.014 sec)



plan2

obclient [SYS]> explain extended select * from t11 where region=10\G
*************************** 1. row ***************************
Query Plan: ====================================================
|ID|OPERATOR               |NAME    |EST. ROWS|COST|
----------------------------------------------------
|0 |PX COORDINATOR         |        |1        |47  |
|1 | EXCHANGE OUT DISTR    |:EX10000|1        |46  |
|2 |  PX PARTITION ITERATOR|        |1        |46  |
|3 |   TABLE SCAN          |T11     |1        |46  |
====================================================

Outputs & filters:
-------------------------------------
  0 - output([INTERNAL_FUNCTION(T11.REGION(0x7f7451c5d520), T11.ID1(0x7f7451c5e960), T11.C2(0x7f7451c5d820), T11.C3(0x7f7451c5ec60))(0x7f7451cc1530)]), filter(nil)
  1 - output([INTERNAL_FUNCTION(T11.REGION(0x7f7451c5d520), T11.ID1(0x7f7451c5e960), T11.C2(0x7f7451c5d820), T11.C3(0x7f7451c5ec60))(0x7f7451cc1530)]), filter(nil), dop=1
  2 - output([T11.REGION(0x7f7451c5d520)], [T11.C2(0x7f7451c5d820)], [T11.ID1(0x7f7451c5e960)], [T11.C3(0x7f7451c5ec60)]), filter(nil),
      force partition granule, asc.
  3 - output([T11.REGION(0x7f7451c5d520)], [T11.C2(0x7f7451c5d820)], [T11.ID1(0x7f7451c5e960)], [T11.C3(0x7f7451c5ec60)]), filter(nil),
      access([T11.REGION(0x7f7451c5d520)], [T11.C2(0x7f7451c5d820)], [T11.ID1(0x7f7451c5e960)], [T11.C3(0x7f7451c5ec60)]), partitions(p[0-2]),
      is_index_back=false,
      range_key([T11.REGION(0x7f7451c5d520)], [T11.C2(0x7f7451c5d820)], [T11.__pk_increment(0x7f7451cac920)]), range(10,MIN,MIN ; 10,MAX,MAX),
      range_cond([T11.REGION(0x7f7451c5d520) = 10(0x7f7451c5e140)])

Used Hint:
-------------------------------------
  /*+
  */

Outline Data:
-------------------------------------
  /*+
      BEGIN_OUTLINE_DATA
      FULL(@"SEL$1" "SYS.T11"@"SEL$1")
      END_OUTLINE_DATA
  */

Plan Type:
-------------------------------------
DISTRIBUTED

Optimization Info:
-------------------------------------
T11:table_rows:700000, physical_range_rows:1, logical_range_rows:1, index_back_rows:0, output_rows:1, est_method:local_storage, optimization_method=cost_based, avaiable_index_name[T11], pruned_index_name[IDX_T11_C1], estimation info[table_id:1100611139454039, (table_type:1, version:0-1713750141509457-1713750141509457, logical_rc:0, physical_rc:0), (table_type:7, version:1713750131657425-1713750131657425-1713750158466912, logical_rc:0, physical_rc:0), (table_type:5, version:1713750131657425-1713750131657425-1713750158466912, logical_rc:0, physical_rc:0), (table_type:0, version:1713750158466912-1713750158466912-9223372036854775807, logical_rc:0, physical_rc:0)]


Parameters:
-------------------------------------


1 row in set (0.004 sec)

plan3

obclient [SYS]> explain extended select * from t11 where region=20\G
*************************** 1. row ***************************
Query Plan: ======================================================
|ID|OPERATOR               |NAME    |EST. ROWS|COST  |
------------------------------------------------------
|0 |PX COORDINATOR         |        |1193226  |972282|
|1 | EXCHANGE OUT DISTR    |:EX10000|1193226  |461546|
|2 |  PX PARTITION ITERATOR|        |1193226  |461546|
|3 |   TABLE SCAN          |T11     |1193226  |461546|
======================================================

Outputs & filters:
-------------------------------------
  0 - output([INTERNAL_FUNCTION(T11.REGION(0x7f88f8ddd520), T11.ID1(0x7f88f8dde960), T11.C2(0x7f88f8ddd820), T11.C3(0x7f88f8ddec60))(0x7f88f8e41550)]), filter(nil)
  1 - output([INTERNAL_FUNCTION(T11.REGION(0x7f88f8ddd520), T11.ID1(0x7f88f8dde960), T11.C2(0x7f88f8ddd820), T11.C3(0x7f88f8ddec60))(0x7f88f8e41550)]), filter(nil), dop=1
  2 - output([T11.REGION(0x7f88f8ddd520)], [T11.C2(0x7f88f8ddd820)], [T11.ID1(0x7f88f8dde960)], [T11.C3(0x7f88f8ddec60)]), filter(nil),
      force partition granule, asc.
  3 - output([T11.REGION(0x7f88f8ddd520)], [T11.C2(0x7f88f8ddd820)], [T11.ID1(0x7f88f8dde960)], [T11.C3(0x7f88f8ddec60)]), filter(nil),
      access([T11.REGION(0x7f88f8ddd520)], [T11.C2(0x7f88f8ddd820)], [T11.ID1(0x7f88f8dde960)], [T11.C3(0x7f88f8ddec60)]), partitions(p[2-4]),
      is_index_back=false,
      range_key([T11.REGION(0x7f88f8ddd520)], [T11.C2(0x7f88f8ddd820)], [T11.__pk_increment(0x7f88f8e2c940)]), range(20,MIN,MIN ; 20,MAX,MAX),
      range_cond([T11.REGION(0x7f88f8ddd520) = 20(0x7f88f8dde140)])

Used Hint:
-------------------------------------
  /*+
  */

Outline Data:
-------------------------------------
  /*+
      BEGIN_OUTLINE_DATA
      FULL(@"SEL$1" "SYS.T11"@"SEL$1")
      END_OUTLINE_DATA
  */

Plan Type:
-------------------------------------
DISTRIBUTED

Optimization Info:
-------------------------------------
T11:table_rows:700000, physical_range_rows:1193226, logical_range_rows:1193226, index_back_rows:0, output_rows:1193226, est_method:local_storage, optimization_method=cost_based, avaiable_index_name[T11], pruned_index_name[IDX_T11_C1], estimation info[table_id:1100611139454041, (table_type:1, version:0-1713751671270815-1713751671270815, logical_rc:397742, physical_rc:397742), (table_type:7, version:1713751661441605-1713751661441605-1713751690070397, logical_rc:0, physical_rc:0), (table_type:5, version:1713751661441605-1713751661441605-1713751690070397, logical_rc:0, physical_rc:0), (table_type:0, version:1713751690070397-1713751690070397-9223372036854775807, logical_rc:0, physical_rc:0)]


Parameters:
-------------------------------------


1 row in set (0.011 sec)

原因分析

  1. 生成的执行计划估行对比
条件 估行 分区裁剪选择的分区
region=10 and id1=‘10’ 1 p[0-2]
region=10 1 p[0-2]
region=20 1193226 p[2-4]
partition p10_1 values less than(10,500000),
partition p10_2 values less than(10,1000000),
partition p20_1 values less than(20,500000),
partition p20_2 values less than(20,1000000),
partition p30_1 values less than(30,500000),
partition p30_2 values less than(30,1000000)

+------------------+--------------+-----------+---------------+-----------+-----------+
| table_id         | partition_id | part_name | svr_ip        | row_count | data_size |
+------------------+--------------+-----------+---------------+-----------+-----------+
| 1100611139454041 |            0 | P10_1     | 133.96.123.36 |    300000 |   3872803 |
| 1100611139454041 |            1 | P10_2     | 133.96.123.36 |         0 |         0 |
| 1100611139454041 |            2 | P20_1     | 133.96.123.36 |    400000 |   5196464 |
| 1100611139454041 |            3 | P20_2     | 133.96.123.36 |         0 |         0 |
| 1100611139454041 |            4 | P30_1     | 133.96.123.36 |    300000 |   3878592 |
| 1100611139454041 |            5 | P30_2     | 133.96.123.36 |         0 |         0 |
+------------------+--------------+-----------+---------------+-----------+-----------+
  1. OB优化器估行规则
  • OceanBase 数据库的优化器估行时依赖的统计信息可以从以下 3 个地方获取,其优先级为:存储层估行 > 直方图统计信息(V3.2 版本手工搜集)> 基础统计信息(合并时搜集)
  • 存储层对表估行估出的物理行数,优化器算 cost 参考的行数,当多分区表全表扫描时,优化器会根据数据量最大的分区的行数*分区数来估算。
  • 只有当谓词条件是索引首列时,存储层才能进行估行
  1. 通过以上规则,基本可以搞清楚原因

    1. OB的表实际为一种特殊的索引,与ORACLE索引组织表,MYSQL的表类似。没有指定主键的分区表,索引键值为分区键+__pk_increment,这个示例中的表,键值为 REGION,C2,__pk_increment

​ 2) region=10 分区裁剪为 p[0-2],为多个分区,会进行存储层估行,p[0-2]最大的分区为 P2(P20_1 values less than(20,500000)),由于条件为REGION=10,但P20_1 这个分区中没有REGION=10的数据,所以估行为1.

​ 3) region=20 分区裁剪为 p[2-4],为多个分区,会进行存储层估行,p[0-2]最大的分区为 P2(P20_1 values less than(20,500000)),P20_1中的数据REGION全为20,则这个分区的估计约为400000,扫描3个分区,则估行为400000*3=1200000,与OB的估行为1193226,非常接近了,则于优化器的算法复杂,未公开,这些差异不好确定,但总体差异不大。

​ 4) region=10分区裁剪为什么是p[0-2],而实际是P[0-1],P2 (P20_1 values less than(20,500000))为什么也在选择列表

​ 由于这个分区键为多列,但查询条件只有REGION=10,所以要满足这个范围 range(10,MIN; 10,MAX),p10_2< 10,MAX<p20_1,所以p20_1也在读取范围。

总结

多个分区键的分区表,如果查询条件只有分区键的第一个列时,可能会导致使用错误的分区进行估行,生成性能不好的执行计划?

对于使用前导列为等值的查询,在分区裁剪时是否可以排除掉,至少在代价估算的时侯不使用这个分区进行估行?

OB是否会考虑全局统计信息

最后修改时间:2024-06-04 20:08:43
「喜欢这篇文章,您的关注和赞赏是给作者最好的鼓励」
关注作者
【版权声明】本文为墨天轮用户原创内容,转载时必须标注文章的来源(墨天轮),文章链接,文章作者等基本信息,否则作者和墨天轮有权追究责任。如果您发现墨天轮中有涉嫌抄袭或者侵权的内容,欢迎发送邮件至:contact@modb.pro进行举报,并提供相关证据,一经查实,墨天轮将立刻删除相关内容。

评论