从Oracle9i开始,Oracle开始通过VSQL_PLAN等视图进行SQL执行计划的记录,通过这个视图,可以获取正在执行中或者仍然缓存着的SQL执行计划,从而可以帮助我们进行实时准确的数据库诊断。在Oracle9i中,可以通过自定义编写的一些脚本来获取SQL的执行计划,如通过HASH_VALUE(可以通过VSESSION或者VSQL、VSQL_PLAN视图获得SQL的HASH_VALUE)输入来获取SQL及其执行计划:
oracle@/opt/oracle/tools$./getplan_by_hashvalue.sh 3870760741
SQL_TEXT
--------------------------------------------------------------------------------
select count(uspl.numusplguid) from hy_usersubplan_log uspl,hy_serviceplan sp,hy_serviceinfo s,hy_spinfo pvd,hy_platform pf where uspl.numsplanguid + 0 = sp.numplanguid and sp.numsvrguid = s.numsvrguid and s.numspguid + 0 = pvd.numspguid and pvd.numptguid+ 0 = pf.numptguid and pf.vc2platformid = :1 and pvd.vc2spcode = :2 and s.vc2service_id = :3 and s.vc2ispack = :4 and uspl.vc2enabledflag = 'Y' and uspl.datstart <= sysdate and nvl(uspl.datend, sysdate + 1) >= sysdate and uspl.vc2userid = :5
HASH_VALUE EXECUTIONS PER_GETS MODULE
---------- ---------- ------------ ------------------------------------
3870760741 30458 78.4 JDBC Thin Client
--------------------------------------------------------------------------------
| Operation | PHV/Object Name | Rows | Bytes| Cost |
--------------------------------------------------------------------------------
|SELECT STATEMENT |----- 3870760741 ----| | | 28 |
|SORT AGGREGATE | | 1 | 269 | |
| NESTED LOOPS | | 1 | 269 | 28 |
| NESTED LOOPS | | 1 | 204 | 9 |
| NESTED LOOPS | | 1 | 178 | 7 |
| NESTED LOOPS | | 1 | 100 | 6 |
| TABLE ACCESS BY INDEX ROWID|HY_PLATFORM | 1 | 20 | 2 |
| INDEX UNIQUE SCAN |HYUIDX_PLATFORMID | 97 | | 1 |
| TABLE ACCESS FULL |HY_SERVICEINFO | 1 | 80 | 4 |
| TABLE ACCESS BY INDEX ROWID |HY_SPINFO | 1 | 78 | 1 |
| INDEX UNIQUE SCAN |HYPK_SPINFO | 97 | | |
| TABLE ACCESS BY INDEX ROWID |HY_SERVICEPLAN | 2 | 52 | 2 |
| INDEX RANGE SCAN |HYUIDX_SERVICEPLAN | 2 | | 1 |
| PARTITION LIST ALL | | | | |
| TABLE ACCESS BY LOCAL INDEX R|HY_USERSUBPLAN_LOG | 1 | 65 | 19 |
| INDEX RANGE SCAN |HYIDX_USPL_USERID | 72 | | 18 |
--------------------------------------------------------------------------------
该脚本的编码内容如下:
oracle@/opt/oracle/tools$cat getplan_by_hashvalue.sh
#!/bin/ksh
$ORACLE_HOME/bin/sqlplus -s "/ as sysdba"<<EOF
set lines 121
set pages 999
col sql_text format a80
col module format a36
col per_gets format 999999999.9
select sql_text from v\$sqltext_with_newlines where hash_value=$1 order by piece;
select hash_value,executions,buffer_gets/decode(executions,0,1,executions) as per_gets,a.module
from v\$sqlarea a where a.hash_value=$1;
set heading off
select '--------------------------------------------------------------------------------' from dual
union all
select '| Operation | PHV/Object Name | Rows | Bytes| Cost |' as "Optimizer Plan:" from dual
union all
select '--------------------------------------------------------------------------------' from dual
union all
select *
from (select
rpad('|'||substr(lpad(' ',1*(depth-1))||operation||
decode(options, null,'',' '||options), 1, 32), 33, ' ')||'|'||
rpad(decode(id, 0, '----- '||to_char(hash_value)||' -----'
, substr(decode(substr(object_name, 1, 7), 'SYS_LE_', null, object_name)
||' ',1, 20)), 21, ' ')||'|'||
lpad(decode(cardinality,null,' ',
decode(sign(cardinality-1000), -1, cardinality||' ',
decode(sign(cardinality-1000000), -1, trunc(cardinality/1000)||'K',
decode(sign(cardinality-1000000000), -1, trunc(cardinality/1000000)||'M',
trunc(cardinality/1000000000)||'G')))), 7, ' ') || '|' ||
lpad(decode(bytes,null,' ',
decode(sign(bytes-1024), -1, bytes||' ',
decode(sign(bytes-1048576), -1, trunc(bytes/1024)||'K',
decode(sign(bytes-1073741824), -1, trunc(bytes/1048576)||'M',
trunc(bytes/1073741824)||'G')))), 6, ' ') || '|' ||
lpad(decode(cost,null,' ',
decode(sign(cost-10000000), -1, cost||' ',
decode(sign(cost-1000000000), -1, trunc(cost/1000000)||'M',
trunc(cost/1000000000)||'G'))), 8, ' ') || '|' as "Explain plan"
from v\$sql_plan where hash_value = $1
and child_number = (select max(child_number) from v\$sql_plan where hash_value = $1))
union all
select '--------------------------------------------------------------------------------' from dual;
exit
EOF
以下介绍一个实际的诊断案例供参考。
曾经遇到过这样一次性能问题,开发人员编写的一个存储过程,其中包含了一系列的事务处理,大约有10个左右的DML事务执行,每一个SQL在SQL*Plus中执行都很迅速,但是一旦放在过程中执行,通过参数传入一个sysdate,整个过程的执行就变得非常缓慢,无法成功完成。数据库环境为Oralce9iR2:
SQL> select * from v$version where rownum <2;
BANNER
----------------------------------------------------------------
Oracle9i Enterprise Edition Release 9.2.0.4.0 - 64bit Production
收到这个问题首先需要确定哪个SQL是真正引起性能问题的罪魁祸首。首先对Procedure进行一点修改,在每个DML事务执行之前在一张临时创建的测试表中插入一个时间,前后两个时间相减可以得到每个SQL独立执行的时间。通过这个办法,发现在执行到第8个SQL时,响应失去。也就是说,这个SQL是导致性能缓慢的根本原因。
找到这个SQL接下来的事情就变得简单一些,修改这个过程,在Procedure之前增加一个跟踪:
create or replace procedure cmop_servdetail_d_eygle (m_datstat date) Authid Current_User is
begin
execute immediate 'alter session set sql_trace=true';
………………
那么当再次执行这个过程时,SQL的执过程被记录到一个Trace文件中,但是注意,由于SQL可能暂时无法完成,所以执行计划等信息并不会输出,但是由于SQL的HASH过程首先完成,在Trace文件的输出中,首先打印出了SQL的HASH VALUE值,这里是:hv=3740055767
APPNAME mod='SQL*Plus' mh=3669949024 act='' ah=4029777240
=====================
PARSING IN CURSOR #2 len=32 dep=1 uid=28 oct=42 lid=28 tim=12567557125437 hv=3943786303 ad='20e289d8'
alter session set sql_trace=true
END OF STMT
EXEC #2:c=0,e=7735,p=0,cr=0,cu=0,mis=0,r=0,dep=1,og=4,tim=12567557124267
=====================
PARSING IN CURSOR #2 len=4721 dep=1 uid=28 oct=2 lid=28 tim=12567557130962 hv=3740055767 ad='1270b8c0'
INSERT into cmo_servdetail_d
(vc2dayguid,
。。。。。。。。
and a.vc2bt = c.vc2cmbt
and a.vc2cid = c.vc2cmcid
and c.numsvrguid = b.numsvrguid
group by b.numsvrguid,
b.vc2service_id,
substr(a.vc2mid, 0, 4),
a.vc2ua,
c.vc2urltype
END OF STMT
得到这个HASH VALUE值之后就可以通过vsql_plan来获得这个SQL的执行计划,另外一个需要说明的是,我在诊断某个SQL问题时通常习惯将这个SQL创建到一张临时表中,以避免可能对vsql_plan的反复查询带来的消耗。
对于这个案例我只需要发出如下语句:
create table t as select * from v$sql_plan where hash_value=3740055767
通过查询可以获得这条问题SQL的执行计划:
--------------------------------------------------------------------------------
| Operation | PHV/Object Name | Rows | Bytes| Cost |
--------------------------------------------------------------------------------
|INSERT STATEMENT |----- 3740055767 ----| | | 123 |
|SORT GROUP BY | | 2 | 532 | 123 |
| FILTER | | | | |
| HASH JOIN | | 2 | 532 | 115 |
| TABLE ACCESS BY LOCAL INDEX R|CM_URLLOG_JUMP | 1 | 127 | 100 |
| NESTED LOOPS | | 1 | 201 | 100 |
| TABLE ACCESS FULL |HY_SVR_REF_URL_CMCID | 11 | 814 | 3 |
| PARTITION RANGE ITERATOR | | | | |
| BITMAP CONVERSION TO ROWID| | | | |
| BITMAP AND | | | | |
| BITMAP CONVERSION FROM R| | | | |
| SORT ORDER BY | | | | |
| INDEX RANGE SCAN |CMIDX_URLLOG_JUMP_DA | 6 | | 1 |
| BITMAP CONVERSION FROM R| | | | |
| INDEX RANGE SCAN |CMIDX_URLLOG_JUMP_CI | 6 | | 19 |
| TABLE ACCESS FULL |HYO_SERVICEPLAN | 10K| 694K| 14 |
--------------------------------------------------------------------------------
这就是这个SQL的执行计划,来对比一下SQL*Plus中执行这条SQL的执行计划:
Execution Plan
----------------------------------------------------------
0 INSERT STATEMENT Optimizer=CHOOSE (Cost=35 Card=12 Bytes=3192)
1 0 SORT (GROUP BY) (Cost=35 Card=12 Bytes=3192)
2 1 FILTER
3 2 HASH JOIN (Cost=26 Card=12 Bytes=3192)
4 3 HASH JOIN (Cost=11 Card=1 Bytes=201)
5 4 TABLE ACCESS (FULL) OF 'HY_SVR_REF_URL_CMCID' (Cost=3 Card=11 Bytes=814)
6 4 PARTITION RANGE (ITERATOR)
7 6 TABLE ACCESS (BY LOCAL INDEX ROWID) OF 'CM_URLLOG_JUMP' (Cost=6 Card=127720 Bytes=16220440)
8 7 INDEX (RANGE SCAN) OF 'CMIDX_URLLOG_JUMP_DAT'(NON-UNIQUE) (Cost=2 Card=229897)
9 3 TABLE ACCESS (FULL) OF 'HYO_SERVICEPLAN' (Cost=14 Card=10946 Bytes=711490)
注意到这两个执行计划完全不同,速度快的执行计划对于PARTITION RANGE访问是通过一个索引来完成的;而对于速度慢的执行计划,这里却使用了2个索引进行位图转换:
PARTITION RANGE ITERATOR
BITMAP CONVERSION TO ROWIDS
BITMAP AND
BITMAP CONVERSION FROM ROWIDS
SORT ORDER BY
INDEX RANGE SCAN CMIDX_URLLOG_JUMP_DAT
BITMAP CONVERSION FROM ROWIDS
INDEX RANGE SCAN CMIDX_URLLOG_JUMP_CID
正是这个转换使得性能大为缩减。
显然这个错误的选择是由于CMIDX_URLLOG_JUMP_CID索引的存在,再加上绑定变量的影响,CBO最终选择了错误的执行计划,为了快速的解决问题,在确认之后,我们直接Drop掉了这个索引。
此时再次运行过程,发现很快完成,此时的执行计划通过同样的方法可以获得:
--------------------------------------------------------------------------------
| Operation | PHV/Object Name | Rows | Bytes| Cost |
--------------------------------------------------------------------------------
|INSERT STATEMENT |----- 3740055767 ----| | | 35 |
|SORT GROUP BY | | 12 | 3K| 35 |
| FILTER | | | | |
| HASH JOIN | | 12 | 3K| 26 |
| HASH JOIN | | 1 | 201 | 11 |
| TABLE ACCESS FULL |HY_SVR_REF_URL_CMCID | 11 | 814 | 3 |
| PARTITION RANGE ITERATOR | | | | |
| TABLE ACCESS BY LOCAL INDEX|CM_URLLOG_JUMP | 128K| 15M| 6 |
| INDEX RANGE SCAN |CMIDX_URLLOG_JUMP_DA | 230K| | 2 |
| TABLE ACCESS FULL |HYO_SERVICEPLAN | 10K| 694K| 14 |
--------------------------------------------------------------------------------
现在的执行计划恢复了正常。注意到错误的执行计划选择了bitmap convert的执行计划,而两个索引都是B*Tree索引。这种转换是Oracle9i引入的,同时一个隐含参数被用来控制这种转换:
SQL> SELECT x.ksppinm NAME, y.ksppstvl VALUE, x.ksppdesc describ
2 FROM SYS.x$ksppi x, SYS.x$ksppcv y
3 WHERE x.inst_id = USERENV ('Instance')
4 AND y.inst_id = USERENV ('Instance')
5 AND x.indx = y.indx
6 AND x.ksppinm LIKE '%&par%'
7 /
Enter value for par: _b_tree_bitmap_plans
old 6: AND x.ksppinm LIKE '%&par%'
new 6: AND x.ksppinm LIKE '%_b_tree_bitmap_plans%'
NAME VALUE DESCRIB
------------------------ ------ ------------------------------------
_b_tree_bitmap_plans TRUE enable the use of bitmap plans for tables w. only B-tree indexes
如果这种性能衰减的转换经常发生,可以将这个隐含参数设置为FALSE.




