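Problem Description
Here is another error on a 9.2.0.8 RAC database. It is closely related to the error described in the previous article: at the moment the node in that article shut down and reported its error, this node raised the ORA-600 error shown here.
ORA-600 (kjccgmb:1) error: http://yangtingkun.net/?p=245
The detailed error messages on the current node are: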
Thu Oct 13 18:13:10 2011
IPC Send timeout detected. Sender ospid 1228900
Thu Oct 13 18:13:12 2011
Communications reconfiguration: instance 1
Thu Oct 13 18:13:12 2011
Trace dumping is performing id=[cdmp_20111013181312]
Thu Oct 13 18:13:17 2011
IPC Send timeout detected. Sender ospid 770198
Thu Oct 13 18:13:19 2011
Evicting instance 2 from cluster
Thu Oct 13 18:13:22 2011
IPC Send timeout detected. Sender ospid 1032208
Thu Oct 13 18:13:31 2011
IPC Send timeout detected. Sender ospid 1302720
Thu Oct 13 18:13:37 2011
IPC Send timeout detected. Sender ospid 438420
Thu Oct 13 18:13:39 2011
Waiting for instances to leave: 2
Thu Oct 13 18:13:47 2011
IPC Send timeout detected. Sender ospid 1474810
Thu Oct 13 18:13:59 2011
Waiting for instances to leave: 2
. . .
Thu Oct 13 18:17:22 2011
IPC Send timeout detected. Sender ospid 876652
Thu Oct 13 18:17:24 2011
IPC Send timeout detected. Sender ospid 1654878
Thu Oct 13 18:17:27 2011
IPC Send timeout detected. Sender ospid 1425476
Thu Oct 13 18:17:27 2011
IPC Send timeout detected. Sender ospid 970920
Thu Oct 13 18:17:39 2011
Waiting for instances to leave: 2
Thu Oct 13 18:17:59 2011
Waiting for instances to leave: 2
Thu Oct 13 18:18:19 2011
Waiting for instances to leave: 2
Thu Oct 13 18:18:29 2011
Errors in file /u01/product/admin/RAC/udump/rac1_ora_1032208.trc:
ORA-00600: internal error code, arguments: [kjcsombd:2], [], [], [], [], [], [], []
ORA-03113: end-of-file on communication channel
Thu Oct 13 18:18:37 2011
Errors in file /u01/product/admin/RAC/udump/rac1_ora_1032208.trc:
ORA-00603: ORACLE server session terminated by fatal error
ORA-00600: internal error code, arguments: [kjcsombd:2], [], [], [], [], [], [], []
ORA-03113: end-of-file on communication channel
Thu Oct 13 18:18:38 2011
Trace dumping is performing id=[cdmp_20111013181838]
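When reading an alert log like this, it can help to list the IPC send timeouts by sender process and to count the ORA codes. In the log above, sender ospid 1032208 times out at 18:13:22 and is the same process that later writes rac1_ora_1032208.trc. The following is a minimal Python sketch of that kind of summary, not an Oracle tool. The function names `split_entries` and `summarize` and the shortened `SAMPLE` text are assumptions made for illustration.

```python
import re
from collections import Counter

# Timestamp format seen in this alert log, e.g. "Thu Oct 13 18:13:10 2011".
STAMP = re.compile(r"[A-Z][a-z]{2} [A-Z][a-z]{2} +\d{1,2} \d{2}:\d{2}:\d{2} \d{4}")

def split_entries(text):
    """Return (timestamp, message) pairs; works on wrapped or flattened log text."""
    marks = list(STAMP.finditer(text))
    entries = []
    for i, m in enumerate(marks):
        end = marks[i + 1].start() if i + 1 < len(marks) else len(text)
        # Collapse whitespace so multi-line messages become one string.
        entries.append((m.group(0), " ".join(text[m.end():end].split())))
    return entries

def summarize(text):
    """Collect IPC send timeout senders and count ORA-nnnnn codes."""
    senders = []
    ora_codes = Counter()
    for stamp, msg in split_entries(text):
        m = re.search(r"IPC Send timeout detected\. Sender ospid (\d+)", msg)
        if m:
            senders.append((stamp, int(m.group(1))))
        ora_codes.update(re.findall(r"ORA-\d{5}", msg))
    return senders, ora_codes

# Hypothetical excerpt: a few entries copied from the alert log above.
SAMPLE = """\
Thu Oct 13 18:13:10 2011
IPC Send timeout detected. Sender ospid 1228900
Thu Oct 13 18:13:22 2011
IPC Send timeout detected. Sender ospid 1032208
Thu Oct 13 18:18:29 2011
Errors in file /u01/product/admin/RAC/udump/rac1_ora_1032208.trc:
ORA-00600: internal error code, arguments: [kjcsombd:2], [], [], [], [], [], [], []
ORA-03113: end-of-file on communication channel
"""

senders, ora_codes = summarize(SAMPLE)
for stamp, ospid in senders:
    print(stamp, ospid)
print(dict(ora_codes))
```

Matching timestamps anywhere in the text, rather than line by line, lets the same sketch handle a log whose line breaks were lost when it was pasted.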
Expert Answer
This ORA-600 error kept recurring until the other instance started up. The corresponding detailed trace information is:
/u01/product/admin/RAC/udump/rac1_ora_1032208.trc
Oracle9i Enterprise Edition Release 9.2.0.8.0 - 64bit Production
With the Partitioning, Real Application Clusters, OLAP and Oracle Data Mining options
JServer Release 9.2.0.8.0 - Production
ORACLE_HOME = /u01/product/oracle/9.2.0
System name: AIX
Node name: p55a1
Release: 3
Version: 5
Machine: 0001D007D600
Instance name: RAC1
Redo thread mounted by this instance: 1
Oracle process number: 237
Unix process pid: 1032208, image: oracle@p55a1 (TNS V1-V3)

*** SESSION ID:(275.64909) 2011-10-13 18:13:22.043
SKGXPCTX: 0x102c4988 ctx
admono 0x3d9a665b admport:
SSKGXPT 0x102c4c44 flags active network 0
info for network 0
socket no 7 IP 172.16.12.254 UDP 57496
HACMP network_id 0 sflags SSKGXPT_WRITESSKGXPT_UP
context timestamp 0xe5b469
no ports
sconno accono ertt state seq# sent async sync rtrans acks
0x5c9264d4 0x07c1249a 32 3 33535 772 772 0 296 771
slot 6 rqh=11035df18 seq=33534 len=424 accno=0x7c1249a START TS=0xe102f0 rt TS=0xe5b7c7 X CNT=297
0x5c9264d5 0x60c4351d 32 3 34041 1278 1278 0 0 1278
0x5c9264d6 0x4201cb1f 32 3 32770 7 7 0 0 7
ach accono sconno admno state seq# rcv rtrans acks
Submitting synchronized dump request [268435460]
KCL: caught error 3113 during cr lock op
*** 2011-10-13 18:18:29.055
ksedmp: internal or fatal error
ORA-00600: internal error code, arguments: [kjcsombd:2], [], [], [], [], [], [], []
ORA-03113: end-of-file on communication channel
Current SQL information unavailable - no session.
----- Call Stack Trace -----
calling              call     entry                argument values in hex
location             type     point                (? means dubious value)
-------------------- -------- -------------------- ----------------------------
ksedmp+0148          bl       ksedst               102974684 ?
ksfdmp+0018          bl       01FD3FC8
kgerinv+00e8         bl       _ptrgl
kgeanmfe+0048        bcl      kglsim_unpin_simhp+001c 000000200 ? 000000000 ? 700000000017CD8 ? 000000000 ?
kjcsombdi+0974       bl       kgeanmfe             110006288 ? 110357A28 ? 102A73228 ? 000000000 ? 10DC9D3665FC00 ? 8A2477269B180 ? 12E0BE826D694B2F ? 000000077 ?
kjcsombd+00a4        bl       kjcsombdi            BADC0FFEE0DDF00D ? BADC0FFEE0DDF00D ?
kjpsod+0fbc          bl       kjcsombd             70000034CDEECD8 ? 2000004F8 ?
kssdch_stage+02b8    bl       _ptrgl
kssdch+0014          bl       kssdch_stage         BADC0FFEE0DDF00D ? BADC0FFEE0DDF00D ? BADC0FFEE0DDF00D ?
ksudlp+0380          bl       kssdch               7000003796230D0 ? 200000002 ?
opidcl+020c          bl       01FD4824
opidrv+045c          bl       opidcl               11000D060 ? 0101FAED0 ?
sou2o+0028           bl       opidrv               3C0C000000 ? 4A0142C60 ? FFFFFFFFFFFF990 ?
main+0138            bl       01FD39E0
__start+0098         bl       main                 000000000 ? 000000000 ?
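From the trace file it is not hard to tell what happened: the process needed to fetch a consistent-read (CR) block from the remote cache, and during the fetch it hit an ORA-3113 communication failure. The trace records this as "KCL: caught error 3113 during cr lock op", immediately before the ORA-600 [kjcsombd:2] is raised.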
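The two clues used in this reading of the trace, the KCL line and the first ORA-600 argument, can be pulled out of a trace file with a couple of regular expressions. The following is a small Python sketch under the assumption that the trace text has already been read into a string. The helper names `first_ora600_argument` and `kcl_errors` are hypothetical, and `SAMPLE` is a shortened copy of the trace lines quoted above.

```python
import re

def first_ora600_argument(text):
    """Return the first bracketed argument of the first ORA-00600 line, or None."""
    m = re.search(r"ORA-00600: internal error code, arguments: \[([^\]]*)\]", text)
    return m.group(1) if m else None

def kcl_errors(text):
    """Return (error_number, operation) for each 'KCL: caught error' message."""
    found = re.findall(r"KCL: caught error (\d+) during ([^\n*]+)", text)
    # Normalize spacing and case so wrapped or re-cased copies compare equal.
    return [(int(num), " ".join(op.split()).lower()) for num, op in found]

# Hypothetical excerpt copied from the trace file shown above.
SAMPLE = """\
Submitting synchronized dump request [268435460]
KCL: caught error 3113 during cr lock op
*** 2011-10-13 18:18:29.055
ksedmp: internal or fatal error
ORA-00600: internal error code, arguments: [kjcsombd:2], [], [], [], [], [], [], []
ORA-03113: end-of-file on communication channel
"""

print(first_ora600_argument(SAMPLE))
print(kcl_errors(SAMPLE))
```

The first ORA-600 argument is usually what one searches for when matching an incident against known issues, so extracting it separately from the generic ORA-00600 code is the main point of the sketch.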
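Clearly this problem is directly related to the shutdown of the other node. Taken together with the ORA-600 error on that node, the likely explanation is that communication between the two nodes failed at the moment of shutdown, which in turn triggered a different ORA-600 error on each node.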