暂无图片
暂无图片
1
暂无图片
暂无图片
暂无图片

oracle 索引集群因子的学习

原创 小张同学 2023-07-14
302

集群因子 cluster_factor:用于判断索引回表所需要消耗的物理IO次数。

 select owner,index_name,clustering_factor from dba_indexes where owner='ZWY' and index_name='IND_ID';

OWNER                                                        INDEX_NAME                                                   CLUSTERING_FACTOR
------------------------------------------------------------ ------------------------------------------------------------ -----------------
ZWY                                                          IND_ID                                                                    1361

select * from (select object_id,rowid from zwy_0714 where object_id is not null order by object_id) where rownum<=10;
 OBJECT_ID ROWID
---------- ------------------
         2 AAAWqYAAEAAAHEbAAw
         3 AAAWqYAAEAAAHEbAAF
         4 AAAWqYAAEAAAHEbAAx
         5 AAAWqYAAEAAAHEbAAa
         6 AAAWqYAAEAAAHEbAAV
         7 AAAWqYAAEAAAHEbAAR
         8 AAAWqYAAEAAAHEbAAk
         9 AAAWqYAAEAAAHEbAAN
        10 AAAWqYAAEAAAHEbAAl
        11 AAAWqYAAEAAAHEbAA4

对于数据中的每一行,rowid伪列返回行的地址。rowid值主要包含以下信息:

对象的数据对象编号
该行所在的数据文件中的数据块
该行中数据块的位置(第一行是0)
数据行所在的数据文件(第一个文件是1)。该文件编号是相对于表空间。

AAAWqY       AAE     AAAHEb       AAw
数据对象编号   数据块   数据块的位置   数据文件编号

索引的叶子块中有序地存储了索引键值和键值所在的rowid;
clustering_factor的算法:
比较索引键如object_id所在rowid是否在同一个数据块,如果在,则clsutering_factor不增加,反之则自增1,可以看到以上11行都在AAE的同一个数据块中,所以clustering_factor不增加,依次一直比较到最后索引键最后一个键值。

1.索引的集群因子介于表的数据块block以及表的行数之间
2.索引的集群因子越小,回表所需要访问的数据块越少,反之越多

假设根据索引扫描返回了1000行数据,根据rowid来进行回表访问
第一种情况:1000行数据都在同一个block中,对应集群因子小,那么回表消耗仅为1个IO,因为仅需读取一个block
第二种情况:1000行数据都各自分布在不同的block中,对应集群因子大,那么回表需要1000个IO,因为需要读取1000个block

索引扫描类型:
1.索引范围扫描(index range scan) – 受集群因子影响,因为需要回表
2.索引全扫描 (index full scan) – 受集群因子影响,因为需要回表
3.索引唯一扫描 (index unique scan) – 不受集群因子影响,因为只需要回表1次
4.索引快扫 (index fast full scan) – 不受集群因子影响,因为不需要回表

解决方案:
1、重建索引(无效,没有改变索引键和表数据对应的顺序关系)
2、根据索引列排序来重建表 create table new_table as select * from old_table order by index_column_name(不现实,生产上大表往往不止一个索引,小表通常不需要建立索引)
3、减少回表次数,如创建组合索引避免或者减少回表

案例:

SQL>  select owner,index_name,clustering_factor from dba_indexes where owner='ZWY' and index_name='IND_ID';

OWNER                                                        INDEX_NAME                                                   CLUSTERING_FACTOR
------------------------------------------------------------ ------------------------------------------------------------ -----------------
ZWY                                                          IND_ID                                                                    1361


 SQL> select count(distinct dbms_rowid.rowid_block_number(rowid)) blocks from zwy.zwy_0714;

    BLOCKS
----------
      1249

可以看到以上ind_id索引的clustering_factor为1361与表的block数很接近

SQL> set arraysize 500
SQL> set arraysize 5000
SQL> set autot trace
SQL> select * from zwy_0714 where object_id<1000;

997 rows selected.


Execution Plan
----------------------------------------------------------
Plan hash value: 1726413974

----------------------------------------------------------------------------------------
| Id  | Operation                   | Name     | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |          |   938 | 91924 |    19   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID| ZWY_0714 |   938 | 91924 |    19   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN          | IND_ID   |   938 |       |     4   (0)| 00:00:01 |
----------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("OBJECT_ID"<1000)


Statistics
----------------------------------------------------------
          1  recursive calls
          0  db block gets
         21  consistent gets 
          0  physical reads
          0  redo size
      92748  bytes sent via SQL*Net to client
        523  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
        997  rows processed



SQL> create table zwy_0714_2 as select * from zwy_0714 order by dbms_random.value;

Table created.

SQL> create index ind_id2 on zwy_0714_2(object_id);

Index created.

select owner,index_name,clustering_factor from dba_indexes where owner='ZWY' and index_name='IND_ID2';
SQL> select owner,index_name,clustering_factor from dba_indexes where owner='ZWY' and index_name='IND_ID2';

OWNER                                                        INDEX_NAME                                                   CLUSTERING_FACTOR
------------------------------------------------------------ ------------------------------------------------------------ -----------------
ZWY                                                          IND_ID2                                                                  87220
 select count(distinct dbms_rowid.rowid_block_number(rowid)) blocks from zwy.zwy_0714_2;
    BLOCKS
----------
      1249

 SQL> select count(*) from zwy_0714_2;

  COUNT(*)
----------
     87276


可以看到以上ind_id索引的clustering_factor为1361与表的行数很接近

select /*+index(zwy_0714_2) */ * from zwy_0714_2 where object_id<1000; --因为集群因子很大,cbo由此计算走索引的成本很高,所以需要指定hint
Execution Plan
----------------------------------------------------------
Plan hash value: 2953394467

------------------------------------------------------------------------------------------
| Id  | Operation                   | Name       | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |            |   997 |   201K|   961   (0)| 00:00:12 |
|   1 |  TABLE ACCESS BY INDEX ROWID| ZWY_0714_2 |   997 |   201K|   961   (0)| 00:00:12 |
|*  2 |   INDEX RANGE SCAN          | IND_ID2    |   997 |       |     4   (0)| 00:00:01 |
------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("OBJECT_ID"<1000)

Note
-----
   - dynamic sampling used for this statement (level=2)


Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
       1002  consistent gets
          0  physical reads
          0  redo size
      92748  bytes sent via SQL*Net to client
        523  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
        997  rows processed

再看看索引快扫以及唯一扫描的情况下,集群因子有没有影响

SQL> select count(object_id) from zwy_0714_2;


Execution Plan
----------------------------------------------------------
Plan hash value: 1266430892

---------------------------------------------------------------------------------
| Id  | Operation             | Name    | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------
|   0 | SELECT STATEMENT      |         |     1 |    13 |    58   (0)| 00:00:01 |
|   1 |  SORT AGGREGATE       |         |     1 |    13 |            |          |
|   2 |   INDEX FAST FULL SCAN| IND_ID2 | 90870 |  1153K|    58   (0)| 00:00:01 |
---------------------------------------------------------------------------------

Note
-----
   - dynamic sampling used for this statement (level=2)


Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
        201  consistent gets
          0  physical reads
          0  redo size
        536  bytes sent via SQL*Net to client
        523  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed



可以看到,只访问1次表或者不回表的情况下,集群因子对执行计划的选择无影响,执行计划选择走索引。

最后修改时间:2023-07-14 18:03:59
「喜欢这篇文章,您的关注和赞赏是给作者最好的鼓励」
关注作者
【版权声明】本文为墨天轮用户原创内容,转载时必须标注文章的来源(墨天轮),文章链接,文章作者等基本信息,否则作者和墨天轮有权追究责任。如果您发现墨天轮中有涉嫌抄袭或者侵权的内容,欢迎发送邮件至:contact@modb.pro进行举报,并提供相关证据,一经查实,墨天轮将立刻删除相关内容。

评论