概述
某些场景下,OB 分区表分区表多分区扫描COST估算错误,导致生成低效执行计划,经过各种猜测并模拟测试,找到了原因,本文通过模拟重现并分析原因,为类似场景有个警示,避免发生严重的性能问题。
版本:V3.2.4 ORACLE租户
起因
在生产库上一条SQL的条件选择性很好,也有索引,统计信息也没问题,确走了全表扫描
select * from test.TESTTABX_HISTORY t where region=111 and TTTNUM=‘xxxxxxxxxxxxx’;
test.TESTTABX_HISTORY partition by range(“REGION”,“INTIME”)
表很大,TTTNUM列选择性很好,也有索引(第一列),为什么不走这个索引,走全表扫?
obclient [SYS]> explain extended
-> select * from TEST.TESTTABX_history t where region=101 and TTTNUM='xxxxxxxxxxxxx'\G
*************************** 1. row ***************************
Query Plan: ====================================================
|ID|OPERATOR |NAME |EST. ROWS|COST|
----------------------------------------------------
|0 |PX COORDINATOR | |1 |50 |
|1 | EXCHANGE OUT DISTR |:EX10000|1 |46 |
|2 | PX PARTITION ITERATOR| |1 |46 |
|3 | TABLE SCAN |T |1 |46 |
====================================================
Outputs & filters:
-------------------------------------
......
3 - ......,
is_index_back=false, filter_before_indexback[false],
range_key([T.REGION(0x7f83d0b33f40)], [T.INTIME(0x7f83d0b34240)], [T.__pk_increment(0x7f83d0bae870)]), range(101,MIN,MIN ; 101,MAX,MAX),
range_cond([T.REGION(0x7f83d0b33f40) = 101(0x7f83d0b35f20)])
索引信息
+-------------+------------------+-------+--------------------------------+------------+-------------+-----------------------+
| TABLE_OWNER | TABLE_NAME | OWNER | INDEX_NAME | UNIQUENESS | PARTITIONED | COLUMN_NAME |
+-------------+------------------+-------+--------------------------------+------------+-------------+-----------------------+
| TEST | TESTTABX_HISTORY | TEST | INX_TESTTABX_HIS_TTTNUM | NONUNIQUE | YES | TTTNUM,REGION |
+-------------+------------------+-------+--------------------------------+------------+-------------+-----------------------+
进一步分区发现全表扫的COST很低,而且使用REGION=‘111’,有的一个值,COST很低。
select * from test.testtabx_history t where region=111\G
*************************** 1. row ***************************
Query Plan: ====================================================
|ID|OPERATOR |NAME |EST. ROWS|COST|
----------------------------------------------------
|0 |PX COORDINATOR | |1 |50 |
|1 | EXCHANGE OUT DISTR |:EX10000|1 |46 |
|2 | PX PARTITION ITERATOR| |1 |46 |
|3 | TABLE SCAN |T |1 |46 |
====================================================
Outputs & filters:
-------------------------------------
0 - .....
1 - .....
2 - .....
force partition granule, asc.
3 - .....,
is_index_back=false,
range_key([T.REGION(0x7fa7677f3740)], [T.INTIME(0x7fa7677f3a40)], [T.__pk_increment(0x7fa767867bd0)]), range(111,MIN,MIN ; 111,MAX,MAX),
range_cond([T.REGION(0x7fa7677f3740) = 101(0x7fa7677f5720)])
模拟重现
- 测试表准备
drop table t11;
create table t11(region number,id1 VARCHAR2(30),c2 number,c3 VARCHAR2(30))
partition by range(region,C2)
(
partition p10_1 values less than(10,500000),
partition p10_2 values less than(10,1000000),
partition p20_1 values less than(20,500000),
partition p20_2 values less than(20,1000000),
partition p30_1 values less than(30,500000),
partition p30_2 values less than(30,1000000)
);
insert into t11 select 10,rownum,rownum,rownum from dual connect by rownum<=300000;
insert into t11 select 20,rownum,rownum,rownum from dual connect by rownum<=400000;
insert into t11 select 30,rownum,rownum,rownum from dual connect by rownum<=300000;
create index idx_t11_c1 on t11(id1,c2) local;
构造完数据库,发起一次合并,合并会收集统计信息。
alter system major freeze;
select * from __all_zone where name like '%merge%';
-
索引信息与统计信息
[root@ocp ~]# eoba ind sys.t11 Display indexes where table or index name matches sys.t11 .... +-------------+------------+------------+------+-------------+------+ | TABLE_OWNER | TABLE_NAME | INDEX_NAME | POS# | COLUMN_NAME | DSC | +-------------+------------+------------+------+-------------+------+ | SYS | T11 | IDX_T11_C1 | 1 | ID1 | NULL | | SYS | T11 | IDX_T11_C1 | 2 | C2 | NULL | +-------------+------------+------------+------+-------------+------+ +-------------+------------+------------+---------+------+--------+------+--------+------------+ | INDEX_OWNER | TABLE_NAME | INDEX_NAME | IDXTYPE | UNIQ | STATUS | PART | DEGREE | VISIBILITY | +-------------+------------+------------+---------+------+--------+------+--------+------------+ | SYS | T11 | IDX_T11_C1 | NORMAL | NO | VALID | YES | 1 | VISIBLE | +-------------+------------+------------+---------+------+--------+------+--------+------------+ ---表分区统计信息 select a.table_id, a.partition_id, b.part_name, a.svr_ip, a.row_count, a.data_size from __all_virtual_meta_table a, __all_virtual_part b where a.table_id = (select table_id from __all_virtual_table where table_name = 'T11') and role = 1 and a.table_id = b.table_id and a.partition_id = b.part_id; +------------------+--------------+-----------+---------------+-----------+-----------+ | table_id | partition_id | part_name | svr_ip | row_count | data_size | +------------------+--------------+-----------+---------------+-----------+-----------+ | 1100611139454041 | 0 | P10_1 | 133.96.123.36 | 300000 | 3872803 | | 1100611139454041 | 1 | P10_2 | 133.96.123.36 | 0 | 0 | | 1100611139454041 | 2 | P20_1 | 133.96.123.36 | 400000 | 5196464 | | 1100611139454041 | 3 | P20_2 | 133.96.123.36 | 0 | 0 | | 1100611139454041 | 4 | P30_1 | 133.96.123.36 | 300000 | 3878592 | | 1100611139454041 | 5 | P30_2 | 133.96.123.36 | 0 | 0 | +------------------+--------------+-----------+---------------+-----------+-----------+ 6 rows in set (0.038 sec) MySQL [oceanbase]> ---列统计信息 select /*+LEADING(((c b) a) t) use_nl(b) use_nl(a) use_nl(t) */ b.table_name, t.partition_id, a.column_name, num_distinct, t.num_null, t.min_value, t.max_value from __all_virtual_column_statistic t, __all_virtual_column a, __all_virtual_table b, __all_virtual_database c where t.tenant_id = a.tenant_id and t.table_id = a.table_id and t.column_id = a.column_id and a.tenant_id = b.tenant_id and a.table_id = b.table_id and b.tenant_id = c.tenant_id and b.database_id = c.database_id and c.database_name = 'SYS' and b.table_name = 'T11' and a.column_name in ('REGION','ID1'); +------------+--------------+-------------+--------------+----------+------------------------------+------------------------------+ | table_name | partition_id | column_name | num_distinct | num_null | min_value | max_value | +------------+--------------+-------------+--------------+----------+------------------------------+------------------------------+ | T11 | 0 | REGION | 1 | 0 | 0F053FFFC00000010A000000 | 0F053FFFC00000010A000000 | | T11 | 1 | REGION | 0 | 0 | 19070000FDFFFFFFFFFFFFFFFF01 | 19070000FEFFFFFFFFFFFFFFFF01 | | T11 | 2 | REGION | 1 | 0 | 0F053FFFC000000114000000 | 0F053FFFC000000114000000 | | T11 | 3 | REGION | 0 | 0 | 19070000FDFFFFFFFFFFFFFFFF01 | 19070000FEFFFFFFFFFFFFFFFF01 | | T11 | 4 | REGION | 1 | 0 | 0F053FFFC00000011E000000 | 0F053FFFC00000011E000000 | | T11 | 5 | REGION | 0 | 0 | 19070000FDFFFFFFFFFFFFFFFF01 | 19070000FEFFFFFFFFFFFFFFFF01 | | T11 | 0 | ID1 | 312549 | 0 | 16022EFF013100 | 16022EFF05393939393900 | | T11 | 1 | ID1 | 0 | 0 | 19070000FDFFFFFFFFFFFFFFFF01 | 19070000FEFFFFFFFFFFFFFFFF01 | | T11 | 2 | ID1 | 400848 | 0 | 16022EFF013100 | 16022EFF05393939393900 | | T11 | 3 | ID1 | 0 | 0 | 19070000FDFFFFFFFFFFFFFFFF01 | 19070000FEFFFFFFFFFFFFFFFF01 | | T11 | 4 | ID1 | 312549 | 0 | 16022EFF013100 | 16022EFF05393939393900 | | T11 | 5 | ID1 | 0 | 0 | 19070000FDFFFFFFFFFFFFFFFF01 | 19070000FEFFFFFFFFFFFFFFFF01 | +------------+--------------+-------------+--------------+----------+------------------------------+------------------------------+ 12 rows in set (0.279 sec) MySQL [oceanbase]> -
观察对比执行计划,可以看到会走全表描,COST很低
测试SQL
plan1 explain extended select * from t11 where region=10 and id1=‘10’\G
plan2 explain extended select * from t11 where region=10\G
plan3 explain extended select * from t11 where region=20\G
plan1
obclient [SYS]> explain extended select * from t11 where region=10 and id1='10'\G
*************************** 1. row ***************************
Query Plan: ====================================================
|ID|OPERATOR |NAME |EST. ROWS|COST|
----------------------------------------------------
|0 |PX COORDINATOR | |1 |46 |
|1 | EXCHANGE OUT DISTR |:EX10000|1 |46 |
|2 | PX PARTITION ITERATOR| |1 |46 |
|3 | TABLE SCAN |T11 |1 |46 |
====================================================
Outputs & filters:
-------------------------------------
0 - output([INTERNAL_FUNCTION(T11.REGION(0x7fa00e65dc60), T11.ID1(0x7fa00e65fd90), T11.C2(0x7fa00e65df60), T11.C3(0x7fa00e660190))(0x7fa00e6cced0)]), filter(nil)
1 - output([INTERNAL_FUNCTION(T11.REGION(0x7fa00e65dc60), T11.ID1(0x7fa00e65fd90), T11.C2(0x7fa00e65df60), T11.C3(0x7fa00e660190))(0x7fa00e6cced0)]), filter(nil), dop=1
2 - output([T11.REGION(0x7fa00e65dc60)], [T11.C2(0x7fa00e65df60)], [T11.ID1(0x7fa00e65fd90)], [T11.C3(0x7fa00e660190)]), filter(nil),
force partition granule, asc.
3 - output([T11.REGION(0x7fa00e65dc60)], [T11.C2(0x7fa00e65df60)], [T11.ID1(0x7fa00e65fd90)], [T11.C3(0x7fa00e660190)]), filter([T11.ID1(0x7fa00e65fd90) = ?(0x7fa00e65f670)]),
access([T11.REGION(0x7fa00e65dc60)], [T11.C2(0x7fa00e65df60)], [T11.ID1(0x7fa00e65fd90)], [T11.C3(0x7fa00e660190)]), partitions(p[0-2]),
is_index_back=false, filter_before_indexback[false],
range_key([T11.REGION(0x7fa00e65dc60)], [T11.C2(0x7fa00e65df60)], [T11.__pk_increment(0x7fa00e6b3530)]), range(10,MIN,MIN ; 10,MAX,MAX),
range_cond([T11.REGION(0x7fa00e65dc60) = 10(0x7fa00e65e880)])
Used Hint:
-------------------------------------
/*+
*/
Outline Data:
-------------------------------------
/*+
BEGIN_OUTLINE_DATA
FULL(@"SEL$1" "SYS.T11"@"SEL$1")
END_OUTLINE_DATA
*/
Plan Type:
-------------------------------------
DISTRIBUTED
Optimization Info:
-------------------------------------
T11:table_rows:700000, physical_range_rows:1, logical_range_rows:1, index_back_rows:0, output_rows:0, est_method:local_storage, optimization_method=cost_based, avaiable_index_name[IDX_T11_C1,T11], estimation info[table_id:1100611139454039, (table_type:1, version:0-1713750141509457-1713750141509457, logical_rc:0, physical_rc:0), (table_type:7, version:1713750131657425-1713750131657425-1713750158466912, logical_rc:0, physical_rc:0), (table_type:5, version:1713750131657425-1713750131657425-1713750158466912, logical_rc:0, physical_rc:0), (table_type:0, version:1713750158466912-1713750158466912-9223372036854775807, logical_rc:0, physical_rc:0)]
Parameters:
-------------------------------------
{obj:{"VARCHAR2":"10", collation:"utf8mb4_bin", coercibility:"INVALID"}, accuracy:{length:-1, precision:1, scale:-1}, flag:0, raw_text_pos:-1, raw_text_len:-1, param_meta:{type:"VARCHAR2", collation:"utf8mb4_bin", coercibility:"INVALID"}}
1 row in set (0.014 sec)
plan2
obclient [SYS]> explain extended select * from t11 where region=10\G
*************************** 1. row ***************************
Query Plan: ====================================================
|ID|OPERATOR |NAME |EST. ROWS|COST|
----------------------------------------------------
|0 |PX COORDINATOR | |1 |47 |
|1 | EXCHANGE OUT DISTR |:EX10000|1 |46 |
|2 | PX PARTITION ITERATOR| |1 |46 |
|3 | TABLE SCAN |T11 |1 |46 |
====================================================
Outputs & filters:
-------------------------------------
0 - output([INTERNAL_FUNCTION(T11.REGION(0x7f7451c5d520), T11.ID1(0x7f7451c5e960), T11.C2(0x7f7451c5d820), T11.C3(0x7f7451c5ec60))(0x7f7451cc1530)]), filter(nil)
1 - output([INTERNAL_FUNCTION(T11.REGION(0x7f7451c5d520), T11.ID1(0x7f7451c5e960), T11.C2(0x7f7451c5d820), T11.C3(0x7f7451c5ec60))(0x7f7451cc1530)]), filter(nil), dop=1
2 - output([T11.REGION(0x7f7451c5d520)], [T11.C2(0x7f7451c5d820)], [T11.ID1(0x7f7451c5e960)], [T11.C3(0x7f7451c5ec60)]), filter(nil),
force partition granule, asc.
3 - output([T11.REGION(0x7f7451c5d520)], [T11.C2(0x7f7451c5d820)], [T11.ID1(0x7f7451c5e960)], [T11.C3(0x7f7451c5ec60)]), filter(nil),
access([T11.REGION(0x7f7451c5d520)], [T11.C2(0x7f7451c5d820)], [T11.ID1(0x7f7451c5e960)], [T11.C3(0x7f7451c5ec60)]), partitions(p[0-2]),
is_index_back=false,
range_key([T11.REGION(0x7f7451c5d520)], [T11.C2(0x7f7451c5d820)], [T11.__pk_increment(0x7f7451cac920)]), range(10,MIN,MIN ; 10,MAX,MAX),
range_cond([T11.REGION(0x7f7451c5d520) = 10(0x7f7451c5e140)])
Used Hint:
-------------------------------------
/*+
*/
Outline Data:
-------------------------------------
/*+
BEGIN_OUTLINE_DATA
FULL(@"SEL$1" "SYS.T11"@"SEL$1")
END_OUTLINE_DATA
*/
Plan Type:
-------------------------------------
DISTRIBUTED
Optimization Info:
-------------------------------------
T11:table_rows:700000, physical_range_rows:1, logical_range_rows:1, index_back_rows:0, output_rows:1, est_method:local_storage, optimization_method=cost_based, avaiable_index_name[T11], pruned_index_name[IDX_T11_C1], estimation info[table_id:1100611139454039, (table_type:1, version:0-1713750141509457-1713750141509457, logical_rc:0, physical_rc:0), (table_type:7, version:1713750131657425-1713750131657425-1713750158466912, logical_rc:0, physical_rc:0), (table_type:5, version:1713750131657425-1713750131657425-1713750158466912, logical_rc:0, physical_rc:0), (table_type:0, version:1713750158466912-1713750158466912-9223372036854775807, logical_rc:0, physical_rc:0)]
Parameters:
-------------------------------------
1 row in set (0.004 sec)
plan3
obclient [SYS]> explain extended select * from t11 where region=20\G
*************************** 1. row ***************************
Query Plan: ======================================================
|ID|OPERATOR |NAME |EST. ROWS|COST |
------------------------------------------------------
|0 |PX COORDINATOR | |1193226 |972282|
|1 | EXCHANGE OUT DISTR |:EX10000|1193226 |461546|
|2 | PX PARTITION ITERATOR| |1193226 |461546|
|3 | TABLE SCAN |T11 |1193226 |461546|
======================================================
Outputs & filters:
-------------------------------------
0 - output([INTERNAL_FUNCTION(T11.REGION(0x7f88f8ddd520), T11.ID1(0x7f88f8dde960), T11.C2(0x7f88f8ddd820), T11.C3(0x7f88f8ddec60))(0x7f88f8e41550)]), filter(nil)
1 - output([INTERNAL_FUNCTION(T11.REGION(0x7f88f8ddd520), T11.ID1(0x7f88f8dde960), T11.C2(0x7f88f8ddd820), T11.C3(0x7f88f8ddec60))(0x7f88f8e41550)]), filter(nil), dop=1
2 - output([T11.REGION(0x7f88f8ddd520)], [T11.C2(0x7f88f8ddd820)], [T11.ID1(0x7f88f8dde960)], [T11.C3(0x7f88f8ddec60)]), filter(nil),
force partition granule, asc.
3 - output([T11.REGION(0x7f88f8ddd520)], [T11.C2(0x7f88f8ddd820)], [T11.ID1(0x7f88f8dde960)], [T11.C3(0x7f88f8ddec60)]), filter(nil),
access([T11.REGION(0x7f88f8ddd520)], [T11.C2(0x7f88f8ddd820)], [T11.ID1(0x7f88f8dde960)], [T11.C3(0x7f88f8ddec60)]), partitions(p[2-4]),
is_index_back=false,
range_key([T11.REGION(0x7f88f8ddd520)], [T11.C2(0x7f88f8ddd820)], [T11.__pk_increment(0x7f88f8e2c940)]), range(20,MIN,MIN ; 20,MAX,MAX),
range_cond([T11.REGION(0x7f88f8ddd520) = 20(0x7f88f8dde140)])
Used Hint:
-------------------------------------
/*+
*/
Outline Data:
-------------------------------------
/*+
BEGIN_OUTLINE_DATA
FULL(@"SEL$1" "SYS.T11"@"SEL$1")
END_OUTLINE_DATA
*/
Plan Type:
-------------------------------------
DISTRIBUTED
Optimization Info:
-------------------------------------
T11:table_rows:700000, physical_range_rows:1193226, logical_range_rows:1193226, index_back_rows:0, output_rows:1193226, est_method:local_storage, optimization_method=cost_based, avaiable_index_name[T11], pruned_index_name[IDX_T11_C1], estimation info[table_id:1100611139454041, (table_type:1, version:0-1713751671270815-1713751671270815, logical_rc:397742, physical_rc:397742), (table_type:7, version:1713751661441605-1713751661441605-1713751690070397, logical_rc:0, physical_rc:0), (table_type:5, version:1713751661441605-1713751661441605-1713751690070397, logical_rc:0, physical_rc:0), (table_type:0, version:1713751690070397-1713751690070397-9223372036854775807, logical_rc:0, physical_rc:0)]
Parameters:
-------------------------------------
1 row in set (0.011 sec)
原因分析
- 生成的执行计划估行对比
| 条件 | 估行 | 分区裁剪选择的分区 |
|---|---|---|
| region=10 and id1=‘10’ | 1 | p[0-2] |
| region=10 | 1 | p[0-2] |
| region=20 | 1193226 | p[2-4] |
partition p10_1 values less than(10,500000),
partition p10_2 values less than(10,1000000),
partition p20_1 values less than(20,500000),
partition p20_2 values less than(20,1000000),
partition p30_1 values less than(30,500000),
partition p30_2 values less than(30,1000000)
+------------------+--------------+-----------+---------------+-----------+-----------+
| table_id | partition_id | part_name | svr_ip | row_count | data_size |
+------------------+--------------+-----------+---------------+-----------+-----------+
| 1100611139454041 | 0 | P10_1 | 133.96.123.36 | 300000 | 3872803 |
| 1100611139454041 | 1 | P10_2 | 133.96.123.36 | 0 | 0 |
| 1100611139454041 | 2 | P20_1 | 133.96.123.36 | 400000 | 5196464 |
| 1100611139454041 | 3 | P20_2 | 133.96.123.36 | 0 | 0 |
| 1100611139454041 | 4 | P30_1 | 133.96.123.36 | 300000 | 3878592 |
| 1100611139454041 | 5 | P30_2 | 133.96.123.36 | 0 | 0 |
+------------------+--------------+-----------+---------------+-----------+-----------+
- OB优化器估行规则
- OceanBase 数据库的优化器估行时依赖的统计信息可以从以下 3 个地方获取,其优先级为:存储层估行 > 直方图统计信息(V3.2 版本手工搜集)> 基础统计信息(合并时搜集)
- 存储层对表估行估出的物理行数,优化器算 cost 参考的行数,当多分区表全表扫描时,优化器会根据数据量最大的分区的行数*分区数来估算。
- 只有当谓词条件是索引首列时,存储层才能进行估行
-
通过以上规则,基本可以搞清楚原因
- OB的表实际为一种特殊的索引,与ORACLE索引组织表,MYSQL的表类似。没有指定主键的分区表,索引键值为分区键+__pk_increment,这个示例中的表,键值为
REGION,C2,__pk_increment
- OB的表实际为一种特殊的索引,与ORACLE索引组织表,MYSQL的表类似。没有指定主键的分区表,索引键值为分区键+__pk_increment,这个示例中的表,键值为
2) region=10 分区裁剪为 p[0-2],为多个分区,会进行存储层估行,p[0-2]最大的分区为 P2(P20_1 values less than(20,500000)),由于条件为REGION=10,但P20_1 这个分区中没有REGION=10的数据,所以估行为1.
3) region=20 分区裁剪为 p[2-4],为多个分区,会进行存储层估行,p[0-2]最大的分区为 P2(P20_1 values less than(20,500000)),P20_1中的数据REGION全为20,则这个分区的估计约为400000,扫描3个分区,则估行为400000*3=1200000,与OB的估行为1193226,非常接近了,则于优化器的算法复杂,未公开,这些差异不好确定,但总体差异不大。
4) region=10分区裁剪为什么是p[0-2],而实际是P[0-1],P2 (P20_1 values less than(20,500000))为什么也在选择列表
由于这个分区键为多列,但查询条件只有REGION=10,所以要满足这个范围 range(10,MIN; 10,MAX),p10_2< 10,MAX<p20_1,所以p20_1也在读取范围。
总结
多个分区键的分区表,如果查询条件只有分区键的第一个列时,可能会导致使用错误的分区进行估行,生成性能不好的执行计划?
对于使用前导列为等值的查询,在分区裁剪时是否可以排除掉,至少在代价估算的时侯不使用这个分区进行估行?
OB是否会考虑全局统计信息




