问题描述
今天有个朋友在Oracle 10.2.0.5.0 环境遇到了这个报错,记录下分析和处理过程
ORA-00600: internal error code, arguments: [kcblasm_1], [103], [], [], [], [], [], []
# 类似以下语句:
select owner,sum(object_id) from t3
where a.createdate > date'2025-11-01' group by owner;
问题分析
检查执行计划
Plan hash value: 3640192438
---------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 26 | 260 | 396 (1)| 00:00:01 |
| 1 | HASH GROUP BY | | 26 | 260 | 396 (1)| 00:00:01 |
| 2 | TABLE ACCESS FULL| T3 | 72655 | 709K| 394 (1)| 00:00:01 |
---------------------------------------------------------------------------
9 rows selected.
SQL>
可以看到这里执行了 HASH GROUP BY 操作。
call stack
----- Call Stack Trace ----- calling call entry argument values in hex location type point (? means dubious value) -------------------- -------- -------------------- ---------------------------- ssd_unwind_bp: unhandled instruction at 0x3d06236 instr=f ksedst()+31 call ksedst1() 000000000 ? 000000001 ? 7FFFD8541880 ? 7FFFD85418E0 ? 7FFFD8541820 ? 000000000 ? ksedmp()+610 call ksedst() 000000000 ? 000000001 ? 7FFFD8541880 ? 7FFFD85418E0 ? 7FFFD8541820 ? 000000000 ? ksfdmp()+63 call ksedmp() 000000003 ? 000000001 ? 7FFFD8541880 ? 7FFFD85418E0 ? 7FFFD8541820 ? 000000000 ? kgerinv()+161 call ksfdmp() 006AD9340 ? 000000003 ? 7FFFD8541880 ? 7FFFD85418E0 ? 7FFFD8541820 ? 000000000 ? kgeasnmierr()+163 call kgerinv() 006AD9340 ? 007548EF8 ? 7FFFD85418E0 ? 7FFFD8541820 ? 000000000 ? 000000000 ? kcblasm1()+113 call kgeasnmierr() 006AD9340 ? 007548EF8 ? 7FFFD85418E0 ? 7FFFD8541820 ? 000000000 ? 000000067 ? kcblasm()+52 call kcblasm1() 7F545F539E38 ? 7FFFD8541820 ? 7FFFD85418E0 ? 7FFFD8541820 ? 000000000 ? 000000067 ? qeshsFindFreeSlot() call kcblasm() 7F545F539E38 ? 000001000 ? +282 7FFFD85418E0 ? 7FFFD8541820 ? 000000000 ? 000000067 ? qeshfFindFreeSlot() call qeshsFindFreeSlot() 7F545F5C97D0 ? 7FFFD8542924 ? +35 7FFFD85418E0 ? 7FFFD8541820 ? 000000000 ? 000000067 ? qeshBufferAlloc()+3 call qeshfFindFreeSlot() 7F545F5C97D0 ? 7F545F5C8E88 ? 04 7FFFD8542924 ? 7FFFD8541820 ? 000000000 ? 000000067 ? qeshBufferGetNew()+ call qeshBufferAlloc() 7F545F5C8E88 ? 7F5442E14600 ? 170 7F545F5C8FB8 ? 7FFFD8541820 ? 000000000 ? 000000067 ? ksxb1bqb()+178 call qeshBufferGetNew() 7F545F5C8FB0 ? 7F5442E14600 ? 7F545F5C8FB8 ? 7FFFD8541820 ? 000000000 ? 000000067 ? qeshrPackRowQB()+23 call ksxb1bqb() 7F545F5C8FB0 ? 2FECBBB5D4 ? 64 000000002 ? 7FFFD8541820 ? 000000000 ? 000000067 ? qeshrPackRow_Fast() call qeshrPackRowQB() 7F545F5C9620 ? 7F545F5C95A8 ? +1079 000000001 ? 7FFFD8542C04 ? 000000002 ? 000000008 ? qeshLoadRowForGBY() call qeshrPackRow_Fast() 7F545F5C9620 ? 7F545F5C95A8 ? +814 7FFFD8542C04 ? 7F545F485E48 ? 000000008 ? 7F545F5C8FB0 ? qerghRowP()+562 call qeshLoadRowForGBY() 7F545F5C8E88 ? 7F545F485E48 ? 000000008 ? 7F545F485E48 ? 000000008 ? 7F545F5C8FB0 ? kdstf1100101km()+74 call qerghRowP() 49C2DCA2A8 ? 000007FFF ? 6 000000008 ? 7F545F485D58 ? 7F545F485E48 ? 49C2DCA278 ? kdsttgr()+1170 call kdstf1100101km() 000000021 ? 000000000 ? 002FA20CA ? 49C2DCA2A8 ? 2FECBBB5C8 ? 000000002 ? qertbFetch()+650 call kdsttgr() 7F545F485678 ? 000000000 ? 2FECBBA0D0 ? 000000000 ? 2FECBBB5C8 ? 002FA20CA ? qerghFetch()+271 call qertbFetch() 49C2DCA730 ? 002FA20CA ? 49C2DCA2A8 ? 000007FFF ? 49C2DCA7A0 ? 002FA20CA ? rwsfcd()+88 call qerghFetch() 49C2DCA2A8 ? 002EDCDF0 ? 49C2DC9FF0 ? 000007FFF ? 49C2DCA7A0 ? 7F545F485D58 ? qerltFetch()+1008 call rwsfcd() 49C2DCA2A8 ? 002EDCDF0 ? 49C2DC9FF0 ? 000007FFF ? 49C2DCA7A0 ? 7F545F485D58 ? ctcdrv()+8029 call qerltFetch() 49C2DC9FF0 ? 00134EE78 ? 7FFFD85453F0 ? 000000001 ? 49C2DCA7A0 ? 49C2AE2FC0 ? opiexe()+13023 call ctcdrv() 49C24B5E78 ? 000060655 ? 7FFFD85462D8 ? 000000001 ? 66381AF90000001A ? 49C2AE2FC0 ? opiosq0()+3398 call opiexe() 000000004 ? 000000000 ? 7FFFD854686C ? 000000001 ? 66381AF90000001A ? 49C2AE2FC0 ? kpooprx()+318 call opiosq0() 000000003 ? 00000000E ? 7FFFD8546B98 ? 0000000A4 ? 000000000 ? 60000011E ? kpoal8()+783 call kpooprx() 7FFFD8549D7C ? 7FFFD8547CC8 ? 00000011E ? 000000001 ? 000000000 ? 60000011E ? opiodr()+1184 call kpoal8() 00000005E ? 000000017 ? 7FFFD8549D78 ? 000000001 ? 000000001 ? 60000011E ? ttcpip()+1226 call opiodr() 00000005E ? 000000017 ? 7FFFD8549D78 ? 000000000 ? 005BDF250 ? 60000011E ? opitsk()+1310 call ttcpip() 006AE18D0 ? 005499C20 ? 7FFFD8549D78 ? 000000000 ? 7FFFD8549878 ? 7FFFD8549EE0 ? opiino()+1024 call opitsk() 000000003 ? 000000000 ? 7FFFD8549D78 ? 000000001 ? 000000000 ? 6AE001100000001 ? opiodr()+1184 call opiino() 00000003C ? 000000004 ? 7FFFD854AF78 ? 000000001 ? 000000000 ? 6AE001100000001 ? opidrv()+548 call opiodr() 00000003C ? 000000004 ? 7FFFD854AF78 ? 000000000 ? 005BDED00 ? 6AE001100000001 ? sou2o()+114 call opidrv() 00000003C ? 000000004 ? 7FFFD854AF78 ? 000000000 ? 005BDED00 ? 6AE001100000001 ? opimai_real()+163 call sou2o() 7FFFD854AF50 ? 00000003C ? 000000004 ? 7FFFD854AF78 ? 005BDED00 ? 6AE001100000001 ? main()+116 call opimai_real() 000000002 ? 7FFFD854AFE0 ? 000000004 ? 7FFFD854AF78 ? 005BDED00 ? 6AE001100000001 ? __libc_start_main() call main() 000000002 ? 7FFFD854AFE0 ? +245 000000004 ? 7FFFD854AF78 ? 005BDED00 ? 6AE001100000001 ? _start()+41 call __libc_start_main() 000720630 ? 000000002 ? 7FFFD854B138 ? 000000000 ? 005BDED00 ? 6AE001100000001 ?

异常的call stack如下:怀疑在 Kernel Cache Buffer 中查找 FreeSlot时异常报错。
kcblasm1()+113<<kgeasnmierr()<<kcblasm()+52<<kcblasm1()<<qeshsFindFreeSlot()+282 << kcblasm()<<qeshfFindFreeSlot()+35 << qeshsFindFreeSlot() <<qeshBufferAlloc()+304 << qeshfFindFreeSlot()<<qeshBufferGetNew()+170 << qeshBufferAlloc()
mos解释
Applies to: Oracle Server - Enterprise Edition - Version: 10.2.0.5.0 and later [Release: 10.2 and later ] Information in this document applies to any platform. Symptoms When executing a statement using a SELECT getting: ORA-00600: internal error code, arguments: [kcblasm_1], [103], [], [], [], [], [], [] The execution plan from the ORA-600 trace file is showing hash join is used. Call Stack is including: kcblasm1 kcblasm kxhfFndFreeSlot kxhfNewBuffer qerhjGetNewBuffer ksxb1bqb kxhrPack Cause The issue was investigated in: Bug 9800302 - 10.2.0.5 GETTING ORA-00600 [KCBLASM_1] closed as duplicate of:Bug 7612454 - Abstract: DSS:PERF REGRESSIONS IN SERIAL DIRECT READS fixed in 11.2. As per development team the number of slots available for direct I/Os (limited to 4096) forced the hash-join algorithm to operate on fewer number of slots and resulted in more spills to disk. This caused: direct path IO to perform. worse in 10.2.0.5 than earlier releases with more "direct path read" operations or ORA-600 [kcblasm_1] errors. Solution 1. Upgrade the database to 11.2. OR 2. Apply &incFamilyProds=false&flag=search))" target=_blank>Patch 7612454 available on MOS. If a patch is not currently available on top of your database version and/or platform. please raise a Service Request to request for it. Please be sure that your database version qualifies for getting a new patch as per Note 209768.1 and Note 742060.1. OR 3. Use the workaround of setting: "_hash_join_enabled"= false Applies to: Oracle Server - Enterprise Edition - Version: 10.2.0.5.0 and later [Release: 10.2 and later ] Information in this document applies to any platform. Symptoms When executing a statement using a SELECT getting: ORA-00600: internal error code, arguments: [kcblasm_1], [103], [], [], [], [], [], [] The execution plan from the ORA-600 trace file is showing hash join is used. Call Stack is including: kcblasm1 kcblasm kxhfFndFreeSlot kxhfNewBuffer qerhjGetNewBuffer ksxb1bqb kxhrPack Cause The issue was investigated in: Bug 9800302 - 10.2.0.5 GETTING ORA-00600 [KCBLASM_1] closed as duplicate of:Bug 7612454 - Abstract: DSS:PERF REGRESSIONS IN SERIAL DIRECT READS fixed in 11.2. As per development team the number of slots available for direct I/Os (limited to 4096) forced the hash-join algorithm to operate on fewer number of slots and resulted in more spills to disk. This caused: direct path IO to perform. worse in 10.2.0.5 than earlier releases with more "direct path read" operations or ORA-600 [kcblasm_1] errors. Solution 1. Upgrade the database to 11.2. OR 2. Apply &incFamilyProds=false&flag=search))" target=_blank>Patch 7612454 available on MOS. If a patch is not currently available on top of your database version and/or platform. please raise a Service Request to request for it. Please be sure that your database version qualifies for getting a new patch as per Note 209768.1 and Note 742060.1. OR 3. Use the workaround of setting: "_hash_join_enabled"= false
从mos中判断,这个问题是10.2.0.5 bug,需要升级到11.2或打patch解决,这个bug的根因:
1. SERIAL DIRECT READS 2. the number of slots available for direct I/Os (limited to 4096) 3. forced the hash-join algorithm to operate on fewer number of slots and resulted in more spills to disk.
由于暂时不能升级版本和打补丁,临时解决方案如下:
# 方案一、禁用hash group by
alter session set "_gby_hash_aggregation_enabled" = false;
# 方案二、禁用hash_join # 我的语句中没有hash所以和mos还有区别
alter session set "_hash_join_enabled"=false;
# 方案三、改sql减少group by的分组数,如修改为多个sql最后union
# 方案四、禁用direct path read
# 方案五、启动并行 parallel(32) <--尝试方案
# 方案六、with改写 <--把需要的列和数据用with选择出来再group
# 方案七、创建索引或索引覆盖,减少访问的blocks
问题根因
As per development team the number of slots available for direct I/Os (limited to 4096) forced the hash-join algorithm to operate on fewer number of slots and resulted in more spills to disk. This caused: direct path IO to perform. worse in 10.2.0.5 than earlier releases with more "direct path read" operations or ORA-600 [kcblasm_1] errors.
解决方案
# 单表操作
方案一、禁用hash group by
alter session set "_gby_hash_aggregation_enabled" = false;
# hash join的操作
方案二、禁用hash_join
alter session set "_hash_join_enabled"=false;
朋友回复方案一操作后,问题得到解决。
最后修改时间:2026-01-18 13:52:46
「喜欢这篇文章,您的关注和赞赏是给作者最好的鼓励」
关注作者
【版权声明】本文为墨天轮用户原创内容,转载时必须标注文章的来源(墨天轮),文章链接,文章作者等基本信息,否则作者和墨天轮有权追究责任。如果您发现墨天轮中有涉嫌抄袭或者侵权的内容,欢迎发送邮件至:contact@modb.pro进行举报,并提供相关证据,一经查实,墨天轮将立刻删除相关内容。




