问题描述
在一个客户的数据库告警日志中,频繁出现ORA-600(kdsgrp1)错误。
详细错误信息为:
/oracle/admin/tjsb/udump/tjsb1_ora_1701138.trc
Oracle DATABASE 10g Enterprise Edition Release 10.2.0.4.0 - 64bit Production
WITH the Partitioning, REAL Application Clusters, DATA Mining AND REAL Application Testing options
ORACLE_HOME = /oracle/product/10.2.0/db_1
System name: AIX
Node name: p570c
Release: 3
Version: 5
Machine: 00C971744C00
Instance name: tjsb1
Redo thread mounted BY this instance: 1
Oracle process NUMBER: 883
Unix process pid: 1701138, image: oracle@p570c
*** 2011-07-26 09:59:41.111
*** ACTION NAME:(abc) 2011-07-26 09:59:41.105
*** MODULE NAME:(archive.exe) 2011-07-26 09:59:41.105
*** SERVICE NAME:(tjsb) 2011-07-26 09:59:41.105
*** SESSION ID:(2466.64103) 2011-07-26 09:59:41.105
ROW 0766ce34.a continuation at
file# 29 block# 2543156 slot 11 NOT found
**************************************************
KDSTABN_GET: 0 ..... ntab: 1
curSlot: 11 ..... nrows: 110
**************************************************
*** 2011-07-26 09:59:41.111
ksedmp: internal OR fatal error
ORA-00600: internal error code, arguments: [kdsgrp1], [], [], [], [], [], [], []
CURRENT SQL statement FOR this SESSION:
SELECT DISTINCT KC01.AAC001 AAC001 FROM AC04, AC02, KC01 WHERE KC01.AAB001 = :B2 AND KC01.AKC021 IN ('11', '12') AND AC02.AAE140 = '
3' AND AC02.AAC031 = '1' AND AC02.AAC001 = KC01.AAC001 AND AC04.AAC001 = KC01.AAC001 AND AC04.AAE030 <= TO_DATE(:B1 , 'yyyymm') AND
(AC04.AAE031 >= TO_DATE(:B1 , 'yyyymm') OR AC04.AAE031 IS NULL) AND NVL(AC04.AKC010, 0) > 0 AND NOT EXISTS (SELECT 1 FROM AC01 WHERE
AAB001 = :B2 AND NVL(AAC033, '0') = '3' AND AAC001 = KC01.AAC001) AND NOT EXISTS (SELECT 1 FROM DUAL WHERE :B3 <= (SELECT NVL(SUM(C
KC010), 0) FROM KC33 WHERE AAC001 = KC01.AAC001 AND AAE001 = TO_NUMBER(SUBSTR(:B1 , 1, 4)))) UNION SELECT DISTINCT KC01.AAC001 AAC00
1 FROM AC02, KC01 WHERE KC01.AAB001 = :B2 AND KC01.AKC021 IN ('21', '22', '23') AND AC02.AAE140 = '3' AND AC02.AAC031 = '1' AND AC02
.AAC001 = KC01.AAC001 AND NOT EXISTS (SELECT 1 FROM AC01 WHERE AAB001 = :B2 AND NVL(AAC033, '0') > '0' AND AAC001 = KC01.AAC001) AND
NOT EXISTS (SELECT 1 FROM KC33 WHERE AAC001 = KC01.AAC001 AND AAE002 = :B1 ) AND (NOT EXISTS (SELECT 1 FROM DUAL WHERE :B4 <= (SELE
CT NVL(SUM(CKC010), 0) FROM KC33 WHERE AAC001 = KC01.AAC001 AND AAE001 = TO_NUMBER(SUBSTR(:B1 , 1, 4))) AND NVL(TO_CHAR(KC01.AIC162,
'yyyymm'),'199001') < SUBSTR(:B1 , 1, 4) || '01')) AND (NOT EXISTS (SELECT 1 FROM DUAL WHERE :B3 <= (SELECT NVL(SUM(CKC010), 0) FRO
M KC33 WHERE AAC001 = KC01.AAC001 AND AAE001 = TO_NUMBER(SUBSTR(:B1 , 1, 4))) AND NVL(TO_CHAR(KC01.AIC162, 'yyyymm'),'199001') >= SU
BSTR(:B1 , 1, 4) || '01'))
----- PL/SQL Call Stack -----
object line object
handle NUMBER name
70000078a8a6960 8456 package body TJSI.PKG_A_JJZJ
70000078bd54dc0 1 anonymous block
----- Call Stack Trace -----
calling CALL entry argument VALUES IN hex
location TYPE point (? means dubious VALUE)
-------------------- -------- -------------------- ----------------------------
ksedst+001c bl ksedst1 000000000 ? 000000000 ?
ksedmp+0290 bl ksedst 104522370 ?
ksfdmp+0018 bl 03F2C1FC
kgerinv+00dc bl _ptrgl
kgeasnmierr+004c bl kgerinv 000000000 ? 000000001 ?
10D00000000 ? 000000000 ?
110648E58 ?
kdsgrp+0460 bl 01F92FB8
kdsfbr+0268 bl kdsgrp 000000001 ? 1105E2988 ?
000000000 ?
qertbFetchByRowID+0 bl kdsfbr FFFFFFFFFFF1FA0 ? 000000000 ?
9d0 1040DC78C ? 000000000 ?
1105FC720 ? 1101FB888 ?
1105FCF40 ? 000000000 ?
rwsfcd+0054 bl _ptrgl
qerflFetchOutside+0 bl 03F2BB90
09c
rwsfcd+0054 bl _ptrgl
qeruaFetch+013c bl 03F2BB90
qersoFetch+0110 bl 01F92FBC
opifch2+141c bl 01F92FBC
opifch+003c bl opifch2 000000001 ? 11062BDC0 ?
FFFFFFFFFFF36D0 ?
opipls+21a4 bl opifch FFFFFFFFFFF3858 ? 110173648 ?
FFFFFFFFFFF3870 ?
opiodr+0ae0 bl _ptrgl
rpidrus+01bc bl opiodr 6686B0E7F0 ? 684B93430 ?
FFFFFFFFFFF67E0 ? D86B0E750 ?
skgmstack+00c8 bl _ptrgl
rpidru+0088 bl skgmstack 110173648 ? FFFFFFFFFFF5830 ?
000000002 ? 000000000 ?
FFFFFFFFFFF5FA8 ?
rpiswu2+034c bl _ptrgl
rpidrv+095c bl rpiswu2 7000007837E2BD8 ?
3A00000000 ? 10451BB80 ?
000000000 ? 110196C78 ?
3A10196C60 ?
FFFFFFFFFFF6008 ? 000000000 ?
psddr0+02bc bl 01F96D24
psdnal+01d0 bl psddr0 D013BC57C ? 6610573D78 ?
FFFFFFFFFFF67E0 ?
3000000000 ?
pevm_BFTCHC+03b4 bl _ptrgl
pfrinstr_FTCHC+0138 bl pevm_BFTCHC 101479974 ? 000000000 ?
000000010 ?
pfrrun_no_tool+005c bl _ptrgl
pfrrun+1014 bl pfrrun_no_tool FFFFFFFFFFF6B20 ?
70000078BD54DC0 ? 3000009BB ?
plsql_run+06b4 bl pfrrun 110469588 ?
peicnt+0224 bl plsql_run 110469588 ? 10001102381F8 ?
000000000 ?
kkxexe+0250 bl peicnt FFFFFFFFFFF7E38 ? 110469588 ?
opiexe+2ef8 bl kkxexe 11064CEA0 ?
kpoal8+0edc bl opiexe FFFFFFFFFFFB454 ?
FFFFFFFFFFFB180 ?
FFFFFFFFFFF9628 ?
opiodr+0ae0 bl _ptrgl
ttcpip+1020 bl _ptrgl
opitsk+1124 bl 01F95258
opiino+0990 bl opitsk 0FFFFD490 ? 000000000 ?
opiodr+0ae0 bl _ptrgl
opidrv+0484 bl 01F947C0
sou2o+0090 bl opidrv 3C02D9855C ? 44065F000 ?
FFFFFFFFFFFF390 ?
opimai_real+01bc bl 01F92214
main+0098 bl opimai_real 000000000 ? 000000000 ?
__start+0098 bl main 000000000 ? 000000000 ?专家解答
查询了一下metalink,发现错误号为kdsgrp1的已知600错误有20多个之多。当前的版本是10204,而这些已知bug又有80%都可能在这个版本上发生。不过仔细看一下错误的说明,这些错误有几分相似之处,多半都和表或索引的逻辑损坏有关。
在这个trace文件中,有明确的标识损坏之处:
ROW 0766ce34.a continuation at file# 29 block# 2543156 slot 11 NOT found
显然这里出现了一个逻辑坏块,那么其实不用再去关心当前的问题属于哪个已知的bug,只要解决了这个坏块,就解决了这个ORA-600的错误。
有了FILE和BLOCK的序号,可以通过DUMP的手段,也可以查询DBA_EXTENTS视图,来确认坏块的内容。
如果坏块属于索引,那么可以通过删除索引然后重建的方式,很容易的解决问题。
如果坏块属于表,那么坏块中的部分行信息可能丢失。如果找不到出现错误之前的备份,可以通过DBMS_REPAIR包进行修复,或者设置alter session set events ‘10231 trace name context forever, level 10’,然后导出或CREATE TABLE AS SELECT将坏块意外的数据进行重建。
可以参考metalink文档ID 468883.1,获取更详细的信息。
「喜欢这篇文章,您的关注和赞赏是给作者最好的鼓励」
关注作者
【版权声明】本文为墨天轮用户原创内容,转载时必须标注文章的来源(墨天轮),文章链接,文章作者等基本信息,否则作者和墨天轮有权追究责任。如果您发现墨天轮中有涉嫌抄袭或者侵权的内容,欢迎发送邮件至:contact@modb.pro进行举报,并提供相关证据,一经查实,墨天轮将立刻删除相关内容。




