JL Computer Consultancy
Index Efficiency |
Aug 2004 |
There are endless arguments on the internet about how often you should rebuild your indexes, and how much benefit you might get from doing index so. I don’t plan to address the question of how you work out the cost/risk/benefit equation for your site, but you may be interested in the following script because it does far better than anything else on the internet at identifying indexes that might (note the emphasis) benefit from a rebuild.
The script works only for version 9i and above, and depends on the undocumented function sys_op_lbid() which can be used to acquire the information in index leaf blocks directly.
In this case, we start by scanning an index segment as efficiently as possible and working out how many entries there are in each block. By using an in-line view we take this intermediate result one step further and re-aggregate the result to produce a report showing how many blocks have “N” entries.
As an example of the output, consider the following:
ROWS_PER_BLOCK BLOCKS
-------------- ----------
24 1
66 1
68 4
69 5
70 4
71 5
72 4
73 5
74 4
75 4
76 5
77 5
78 2
79 3
80 2
82 1
-------------- ----------
Sum 55
In this example, most of the leaf blocks have about 75 rows, plus or minus a few, and there is just one extreme block (which we can safely assume is the last block of the index) that is half empty. If you have recent statistics for the index, you can compare the total number of blocks in the report with the statistic leaf_blocks from the user_indexes and can check (roughly) how many blocks below the high water mark are empty - some of the difference will be due to branch blocks.
Note: if you don’t have recent statistics, then the dbms_space package gives you a cheap way of finding out how many blocks there are below the high water mark.
Update Oct 2007:
A
recent discussion on the comp.databases.oracle.server newsgroup drew my attention
to this page, and reading it made me realize there is an error in the comment
about recent statistics and leaf_block counts.
Here are a couple of possible variations on the content of the report that may give you some hints about the need to rebuild or, perhaps more appropriately, coalesce, the index.
1) If you can see that a large number of blocks have a very small number of rows_per_block, whereas a small number of blocks have a much larger number of rows_per_block, this may be an example of a degenerate index being used to handle a FIFO (first in, first out) queue – it could be a candidate for a coalesce.
2) If you see that a relatively small number of blocks have far more rows – in the order of double – than the rest of the index, it is possible that you are suffering from an index that is splitting badly and not refilling in the middle. It is possible that a rebuild may pack your index into half the space and not lead to lots of splitting immediately afterwards. (It may be that within hours of the rebuild, every block has split again, in which case the rebuild was probably a mistake).
In any case, you should probably take a little time to think about indexes where the pattern is NOT “mostly the same usage but a few blocks with significantly less usage”. Of course, there are always exceptions – the last bit of advice will miss the extreme case where “most of the index has gone bad, but bits of it are a disaster”, but I’ll tell you how to spot that case in another article.
Update April 2009:
For a practical example of using this code, there is a case-study on my blog; and there’s a friendlier version of the code, on my blog here.
rem
rem Script: index_efficiency.sql
rem Author: Jonathan Lewis
rem Dated: Sept 2003
rem Purpose: Example
of how to check leaf block packing
rem
rem Notes
rem Last tested 9.2.0.4
rem
rem Example of analyzing index entries
per leaf block.
rem The code examines index T1_I1 on table T1.
rem
rem The index is on (v1, small_pad).
Both columns appear
rem the where clause with a not null test to avoid issues
rem relating to indexes with completely nullable entries.
rem
rem For a simple b-tree index, the first parameter to the
rem sys_op_lbid() function has to be the object_id of the
rem index.
rem
rem The query will work with a sample clause
rem
rem Check that the execution path is an index fast full scan
rem
column ind_id new_value m_ind_id
select
object_id ind_id
from
user_objects
where
object_name = 'T1_I1'
;
break on report skip 1
compute sum of blocks on report
select
rows_per_block,
count(*) blocks
from (
select
/*+
cursor_sharing_exact
dynamic_sampling(0)
no_monitoring
no_expand
index_ffs(t1,t1_i1)
noparallel_index(t,t1_i1)
*/
sys_op_lbid( &m_ind_id ,'L',t1.rowid) as block_id,
count(*) as rows_per_block
from
t1
-- t1 sample block (100)
where
v1 is not null
or small_pad is not null
group by
sys_op_lbid( &m_ind_id ,'L',t1.rowid)
)
group by rows_per_block
order by rows_per_block
;




