In Index Quiz 1 and Index Quiz 2 I demonstrated a couple of details of how the ITL (interested transaction list) behaves in indexes. In this note I'm going to explain how these details can result in a nasty waste of space in indexes.
The two points I made in the previous posts were:
1. At high levels of concurrency you can "lose" a lot of space in an index leaf block to the ITL.
2. When a leaf block splits, the ITL of the new block is a copy of the ITL of the old block.
One consequence of point 2 is that a single brief burst of activity, making one or two ITLs grow to an uncharacteristic size, is enough to ensure that every leaf block subsequently split off in that portion of the index will carry a large ITL even if it doesn't need it.
Combine this with something that I think is probably a bug (although it may actually have been deliberately designed in as a feature, with an unexpected side effect, for concurrency reasons) and surprises appear.
In Index Quiz 1 I showed you an index with the following statistics and pointed out that it had lost about 50% of its available space because its ITLs had grown to the maximum of 169 entries (for an 8KB block), a side effect of very high concurrency. In fact, when I ran that test case I was running just eight concurrent processes on a machine with two CPUs.
Here's some code if you want to try the test on your own system:
rem
rem Script: order_create.sql
rem Author: Jonathan Lewis
rem Dated: June 2009
rem
create sequence s1 cache 250000;
create table orders (
    date_placed    date not null,
    order_id       number(8),
    time_stamp     timestamp,
    id_owner       number(2),
    padding        varchar2(168)
)
tablespace test_8k_assm
initrans 8
;
create index ord_placed on orders(date_placed)
tablespace test_8k
initrans 9
;
The tablespace test_8k_assm uses 8KB blocks, 1MB uniform extents, and automatic segment space management (ASSM). The tablespace test_8k is similar but uses freelist (manual) segment space management. I used initrans 8 on the table and initrans 9 on the index because I was testing with eight concurrent processes (the setting on the table was probably redundant, since I was using ASSM and fewer than 16 concurrent sessions in the test).
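If you want to reproduce the environment, here is a minimal sketch (my own, not from the original post) of how the two tablespaces might be created. It assumes db_block_size = 8k and that db_create_file_dest is set, so the datafile clauses can be omitted.
create tablespace test_8k_assm
    extent management local uniform size 1m    -- 1MB uniform extents
    segment space management auto              -- ASSM
;
create tablespace test_8k
    extent management local uniform size 1m
    segment space management manual            -- freelist management
;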
After creating the sequence, table, and index, you can run the following PL/SQL from as many different sessions as you like (adjusting the limit on the loop if necessary):
rem
rem Script: order_entry.sql
rem Author: Jonathan Lewis
rem Dated: June 2009
rem
declare
    m_ord number(8) := 0;
begin
    while m_ord <= 100000 loop
        insert into orders (
            date_placed, order_id, padding
        )
        values (
            sysdate - 1000 + s1.nextval/1000,
            s1.nextval,
            rpad('x',168)
        )
        returning order_id into m_ord
        ;
        commit write immediate wait;    -- 10g feature
    end loop;
end;
/
The model this code generates is an order entry system taking 1,000 orders per day over the last 1,000 days, although the rate of data entry is, of course, accelerated to extremes.
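As an illustrative aside (not part of the original scripts): the expression sysdate - 1000 + s1.nextval/1000 maps sequence value n to a date n/1000 days after "1,000 days ago", i.e. 1,000 orders per day, so the 100,000-row loop covers roughly the first 100 days of that range.
select
    sysdate - 1000 + 1/1000         as first_order,    -- roughly 1,000 days ago
    sysdate - 1000 + 100000/1000    as last_order      -- about 100 days later
from dual;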
If you run just one copy of the PL/SQL and validate the index afterwards, you should get results like this (running 10.2.0.3):
HEIGHT : 2
BLOCKS : 256
LF_ROWS : 100001
LF_BLKS : 238
LF_ROWS_LEN : 1900019
LF_BLK_LEN : 8000
BR_ROWS : 237
BR_BLKS : 1
BR_ROWS_LEN : 3312
BR_BLK_LEN : 8032
BTREE_SPACE : 1912032
USED_SPACE : 1903331
PCT_USED : 100
The index space is perfectly (100%) used; this is the natural consequence of the sequential nature of the inserts. And although we specified initrans 9 (which Oracle "ignored"), we can see that the lf_blk_len is 8,000, which means every leaf block still has only the minimum two entries in its ITL.
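For reference, figures like these come from the index_stats view, which is populated when you validate the index; a minimal sketch (my query, trimmed to the columns shown above):
analyze index ord_placed validate structure;    -- note: blocks DML against the table while it runs
select
    height, blocks,
    lf_rows, lf_blks, lf_rows_len, lf_blk_len,
    br_rows, br_blks, br_rows_len, br_blk_len,
    btree_space, used_space, pct_used
from
    index_stats
;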
If you run two copies of the script, you may see results more like this:
HEIGHT : 2
BLOCKS : 384
LF_ROWS : 100002
LF_BLKS : 255
LF_ROWS_LEN : 1900038
LF_BLK_LEN : 7976
BR_ROWS : 254
BR_BLKS : 1
BR_ROWS_LEN : 3544
BR_BLK_LEN : 8032
BTREE_SPACE : 2041912
USED_SPACE : 1903582
PCT_USED : 94
Only a few percent of the space has been "lost", and the lf_blk_len shows that some blocks have acquired a third entry in the ITL, hence the 24-byte drop from 8,000 to 7,976.
If you run something like my "index efficiency" code to check how well each block in the index is used, you may see something like this (a sketch of the kind of query involved follows the output):
ROWS_PER_BLOCK BLOCKS TOTAL_ROWS RUNNING_TOTAL
-------------- ---------- ---------- -------------
206 12 2472 12
208 1 208 13
209 1 209 14
211 2 422 16
212 1 212 17
214 15 3210 32
260 1 260 33
410 1 410 34
419 221 92599 255
---------- ----------
sum 255 100002
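The original "index efficiency" script isn't reproduced here, but a sketch of the kind of query that can produce this output, based on the undocumented sys_op_lbid() function, might look like the following. The &m_ind_id substitution variable is a placeholder for the object_id of the ord_placed index (select object_id from user_objects where object_name = 'ORD_PLACED').
select
    rows_per_block,
    count(*)                                        blocks,
    rows_per_block * count(*)                       total_rows,
    sum(count(*)) over (order by rows_per_block)    running_total
from    (
    select
        /*+ index_ffs(o ord_placed) */
        sys_op_lbid(&m_ind_id, 'L', o.rowid)        block_id,
        count(*)                                    rows_per_block
    from
        orders o
    where
        date_placed is not null
    group by
        sys_op_lbid(&m_ind_id, 'L', o.rowid)
    )
group by
    rows_per_block
order by
    rows_per_block
;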
Most blocks are full (about 420 entries) following a "90/10 leaf node split", but a few have done a 50/50 split (about 210 entries). The 50/50 splits are what you get on a multi-user system with multiple CPUs: occasionally a session will get its sequence number and then be pre-empted by the operating system, allowing another session to get and insert a higher value; with a little bad luck this happens just as a leaf block fills.
Important note: the laptop I used to generate these results has two CPUs; if you try running this test on a machine with a single CPU the concurrent test may give dramatically different results.
Now repeat with more copies of the PL/SQL. As the number of concurrent sessions grows the space requirement will climb, the lf_blk_len will eventually drop to 3,992, and the number of leaf blocks (lf_blks) will probably be two or three times what you got with serial execution (in the worst case it could be four times as large, but that would only happen if every block had lost half its space to ITLs and then did a 50/50 split).
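As a consistency check (my arithmetic, using figures quoted earlier in this post): a leaf block with the minimum two ITL entries reported lf_blk_len = 8,000, each additional ITL entry costs 24 bytes, and the maximum for an 8KB block is 169 entries, so:
select 8000 - (169 - 2) * 24  as lf_blk_len_at_max_itl from dual;    -- = 3992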
If you have N CPUs (N > 1) then I would expect the problem to start appearing somewhere between N+1 and 2N concurrent sessions, but if you want to short-cut the testing just go for 4N sessions and see what happens. If the CPUs are already heavily loaded before you start, the problem will appear with fewer concurrent sessions (I got some dramatic results by running just a couple of sessions while a full machine virus scan was running at maximum speed).
The test isn't deterministic; the results will depend on things like the version of Oracle, the size of the redo log buffer, the size of the redo log files, the speed of the I/O subsystem, the number of CPUs, the operating system, the process ids of the sessions you happen to connect with (a side effect of ASSM), and whatever other workload is going on at the time. But the bottom line is this: if you have hot spots in an index that are subject to a lot of concurrent DML, you can find yourself wasting space unnecessarily in that area of the index.
In the next installment of the series I'll make some comments about what I think is happening and discuss how to address the issue. But before I finish, here's how odd you can make index_stats look if you engineer a bizarre accident; it must be a really efficient index, I've used 174% of the available space!
HEIGHT : 2
BLOCKS : 384
NAME : ORD_PLACED
LF_ROWS : 110008
LF_BLKS : 301
LF_ROWS_LEN : 2090152
LF_BLK_LEN : 3992
BR_ROWS : 300
BR_BLKS : 1
BR_ROWS_LEN : 4191
BR_BLK_LEN : 8032
DISTINCT_KEYS : 110008
MOST_REPEATED_KEY : 1
BTREE_SPACE : 1209624
USED_SPACE : 2094343
PCT_USED : 174
ROWS_PER_KEY : 1
BLKS_GETS_PER_ACCESS : 3
Update, September 2018
I've just re-run the test code on 18.3.0.0, on a (virtual) machine with 4 CPUs, running 4 concurrent insert statements.
By the time I had inserted 6,000 orders the ITL count (the itc value) on the last few index leaf blocks had jumped to 39, so there still seems to be some room for improvement in this area. I've tried increasing the degree of concurrency and the duration of the test, but so far I've not gone above 39 ITL entries (a couple of other people have run this test on 12c over the years and reported 41 ITL entries), so perhaps there is a fairly hard limit that can be reached quite easily (using about 1KB of the block). Somewhere, I'm sure, I had a note (or found a note on the Internet) about how the fix probably works, but I can't find it at present. There is a parameter "_index_split_chk_cancel" (introduced in 12.2, default value 5) which looks as if it might be relevant, but it doesn't seem to have any effect on the problem.
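If you want to check the itc value for yourself, here is a hedged sketch of one approach (the substitution variables are placeholders; the treedump reports data block addresses that you would convert to file and block numbers, for example with dbms_utility.data_block_address_file and data_block_address_block):
alter session set tracefile_identifier = 'itl_check';
rem  list the index's branch and leaf blocks (the level is the object_id of the index):
alter session set events 'immediate trace name treedump level &index_object_id';
rem  dump one of the leaf blocks reported by the treedump:
alter system dump datafile &file_no block &block_no;
rem  in the resulting trace file the block header reports "itc:", the number of ITL slots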
The original post can be found at:
https://jonathanlewis.wordpress.com/2009/07/28/index-explosion/




