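Problem description
Over the past six months, this Oracle environment (RAC on hpux 11) has run into quite a few bugs and ORA-600 errors. Here is a recent case: it is not fatal, but it is a source of instability. I am simply recording it here.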
# DB alert log
Fri Oct 23 14:13:11 2015
Thread 2 advanced to log sequence 67897 (LGWR switch)
  Current log# 7 seq# 67897 mem# 0: /dev/rzwa_redo07
Fri Oct 23 14:13:29 2015
Archived Log entry 131456 added for thread 2 sequence 67896 ID 0xffffffffd1cbbdae dest 1:
Fri Oct 23 14:17:50 2015
Error 3113 trapped in 2PC on transaction 1543.25.297337. Cleaning up.
Error stack returned to user:
ORA-02050: transaction 1543.25.297337 rolled back, some remote DBs may be in-doubt
ORA-03113: end-of-file on communication channel
ORA-02063: preceding line from BILL
Fri Oct 23 14:17:50 2015
DISTRIB TRAN ORARPT.7fefb4e8.1039.28.18466
  is local tran 1543.25.297337 (hex=607.19.48979)
  insert pending collecting tran, scn=14584086497354 (hex=d43.9f4b884a)
Fri Oct 23 14:17:51 2015
Errors in file /oracle/app/oracle/diag/rdbms/anbob/anbob2/trace/anbob2_reco_15598844.trc:
ORA-12541: TNS:no listener
Errors in file /oracle/app/oracle/diag/rdbms/anbob/anbob2/trace/anbob2_reco_15598844.trc:
ORA-12541: TNS:no listener
Fri Oct 23 14:17:51 2015
Errors in file /oracle/app/oracle/diag/rdbms/anbob/anbob2/trace/anbob2_ora_10093494.trc (incident=14361):
ORA-00600: internal error code, arguments: [k2srec: should be another instance], [2], [], [], [], [], [], [], [], [], [], []
Errors in file /oracle/app/oracle/diag/rdbms/anbob/anbob2/trace/anbob2_reco_15598844.trc:
ORA-12541: TNS:no listener
Errors in file /oracle/app/oracle/diag/rdbms/anbob/anbob2/trace/anbob2_reco_15598844.trc:
ORA-12541: TNS:no listener
Errors in file /oracle/app/oracle/diag/rdbms/anbob/anbob2/trace/anbob2_reco_15598844.trc:
ORA-12541: TNS:no listener
Errors in file /oracle/app/oracle/diag/rdbms/anbob/anbob2/trace/anbob2_reco_15598844.trc:
ORA-12541: TNS:no listener
Incident details in: /oracle/app/oracle/diag/rdbms/anbob/anbob2/incident/incdir_14361/anbob2_ora_10093494_i14361.trc
Errors in file /oracle/app/oracle/diag/rdbms/anbob/anbob2/trace/anbob2_reco_15598844.trc:
ORA-12541: TNS:no listener
Errors in file /oracle/app/oracle/diag/rdbms/anbob/anbob2/trace/anbob2_reco_15598844.trc:
ORA-12541: TNS:no listener
Errors in file /oracle/app/oracle/diag/rdbms/anbob/anbob2/trace/anbob2_reco_15598844.trc:
ORA-12541: TNS:no listener
Errors in file /oracle/app/oracle/diag/rdbms/anbob/anbob2/trace/anbob2_reco_15598844.trc:
ORA-12541: TNS:no listener
Errors in file /oracle/app/oracle/diag/rdbms/anbob/anbob2/trace/anbob2_reco_15598844.trc:
...
Fri Oct 23 14:18:26 2015
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
Errors in file /oracle/app/oracle/diag/rdbms/anbob/anbob2/trace/anbob2_ora_10093494.trc (incident=14362):
ORA-00600: internal error code, arguments: [18301], [0x000000000], [], [], [], [], [], [], [], [], [], []
ORA-00600: internal error code, arguments: [k2srec: should be another instance], [2], [], [], [], [], [], [], [], [], [], []
Incident details in: /oracle/app/oracle/diag/rdbms/anbob/anbob2/incident/incdir_14362/anbob2_ora_10093494_i14362.trc
Fri Oct 23 14:18:26 2015
Sweep [inc][14361]: completed
Sweep [inc2][14361]: completed
Fri Oct 23 14:18:27 2015
Sweep [inc][14362]: completed
Dumping diagnostic data in directory=[cdmp_20151023141828], requested by (instance=2, osid=10093494), summary=[incident=14362].
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
Fri Oct 23 14:18:28 2015
Sweep [inc2][14362]: completed

adrci> show incident
ADR Home = /oracle/app/oracle/diag/rdbms/anbob/anbob2:
*************************************************************************
INCIDENT_ID          PROBLEM_KEY                                                 CREATE_TIME
-------------------- ----------------------------------------------------------- ----------------------------------------
257                  ORA 445                                                     2015-09-11 09:54:11.404000 +08:00
14361                ORA 600 [k2srec: should be another instance]                2015-10-23 14:17:51.117000 +08:00
14362                ORA 600 [18301]                                             2015-10-23 14:18:26.251000 +08:00
66116                ORA 227                                                     2015-11-17 11:48:11.012000 +08:00
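When an alert log excerpt is this noisy, a quick tally of the error codes helps separate the repeated RECO listener errors from the actual ORA-600 incidents. The following is only a minimal sketch in Python: the function name `summarize_alert_log` and the shortened `sample` text are illustrative assumptions, not part of any Oracle tooling, and the sample keeps only a few lines from the log above.

```python
import re
from collections import Counter

# Any five-digit ORA error code, e.g. ORA-12541.
ORA_CODE = re.compile(r"ORA-\d{5}")
# First bracketed argument of an ORA-00600 line, e.g. [18301].
ORA600_ARG = re.compile(r"ORA-00600: internal error code, arguments: \[([^\]]*)\]")


def summarize_alert_log(text):
    """Return (count per ORA code, count per ORA-00600 first argument)."""
    codes = Counter(ORA_CODE.findall(text))
    ora600_args = Counter(ORA600_ARG.findall(text))
    return codes, ora600_args


# Shortened, illustrative excerpt of the alert log shown above.
sample = """\
ORA-02050: transaction 1543.25.297337 rolled back, some remote DBs may be in-doubt
ORA-03113: end-of-file on communication channel
ORA-02063: preceding line from BILL
ORA-12541: TNS:no listener
ORA-12541: TNS:no listener
ORA-00600: internal error code, arguments: [k2srec: should be another instance], [2], [], [], [], [], [], [], [], [], [], []
ORA-00600: internal error code, arguments: [18301], [0x000000000], [], [], [], [], [], [], [], [], [], []
"""

codes, ora600_args = summarize_alert_log(sample)
for code, count in sorted(codes.items()):
    print(code, count)
for arg, count in sorted(ora600_args.items()):
    print("ORA-00600 first argument:", arg, count)
```

Because the patterns do not rely on line boundaries, the same function can be pointed at a whole alert log read from disk.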
# trace file
kdzwa2:/oracle/app/oracle/diag/rdbms/anbob/anbob2/trace> vi /oracle/app/oracle/diag/rdbms/anbob/anbob2/incident/incdir_14361/anbob2_ora_10093494_i14361.trc
"/oracle/app/oracle/diag/rdbms/anbob/anbob2/incident/incdir_14361/anbob2_ora_10093494_i14361.trc"
Dump file /oracle/app/oracle/diag/rdbms/anbob/anbob2/incident/incdir_14361/anbob2_ora_10093494_i14361.trc
Oracle Database 11g Enterprise Edition Release - 64bit Production
With the Partitioning, Real Application Clusters, OLAP, Data Mining
and Real Application Testing options
ORACLE_HOME = /oracle/app/oracle/product/
System name: AIX
Node name: kdzwa2
Release: 1
Version: 6
Machine: 00F80C614C00
Instance name: anbob2
Redo thread mounted by this instance: 2
Oracle process number: 1795
Unix process pid: 10093494, image: oracle@kdzwa2

*** 2015-10-23 14:17:51.163
*** SESSION ID:(450.20079) 2015-10-23 14:17:51.163
*** CLIENT ID:() 2015-10-23 14:17:51.163
*** SERVICE NAME:(SYS$USERS) 2015-10-23 14:17:51.163
*** MODULE NAME:(oracle@kdexa1db02.anbob.com (TNS V1-V3)) 2015-10-23 14:17:51.163
*** ACTION NAME:() 2015-10-23 14:17:51.163

Dump continued from file: /oracle/app/oracle/diag/rdbms/anbob/anbob2/trace/anbob2_ora_10093494.trc
ORA-00600: internal error code, arguments: [k2srec: should be another instance], [2], [], [], [], [], [], [], [], [], [], []

========= Dump for incident 14361 (ORA 600 [k2srec: should be another instance]) =======

*** 2015-10-23 14:17:51.175
dbkedDefDump(): Starting incident default dumps (flags=0x2, level=3, mask=0x0)
----- SQL Statement (None) -----
Current SQL information unavailable - no cursor.
----- Call Stack Trace -----
----- Incident Context Dump -----
Address: 0xfffffffffff45e8
Incident ID: 14361
Problem Key: ORA 600 [k2srec: should be another instance]
Error: ORA-600 [k2srec: should be another instance] [2] [] [] [] [] [] [] [] [] [] []
[00]: dbgexProcessError [diag_dde]
[01]: dbgeExecuteForError [diag_dde]
[02]: dbgePostErrorKGE [diag_dde]
[03]: dbkePostKGE_kgsf [rdbms_dde]
[04]: kgeadse []
[05]: kgerinv_internal []
[06]: kgerinv []
[07]: kgeasnmierr []
[08]: k2srec []<-- Signaling
[09]: k2serv []
[10]: opiodr []
[11]: ttcpip []
[12]: opitsk []
[13]: opiino []
[14]: opiodr []
[15]: opidrv []
[16]: sou2o []
[17]: opimai_real []
[18]: ssthrdmain []
[19]: main []
[20]: __start []
A search on My Oracle Support (MOS) quickly confirmed the bug behind this problem:
with the following stack: k2srec <- k2serv <- opiodr <- ttcpip
The cause of the issue has been identified on unpublished Bug 14669432. It is caused by a race condition between the foreground process and the RECO process.
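To check whether an incident trace carries the stack signature quoted from MOS (k2srec <- k2serv <- opiodr <- ttcpip), the frame names can be pulled out of the Incident Context Dump and compared against it. This is a minimal sketch in Python, not an Oracle utility: the helper names `stack_functions` and `matches_signature` are hypothetical, and the frame pattern assumes the `[NN]: name [component]` layout seen in the trace above.

```python
import re

# A frame looks like "[08]: k2srec []"; capture the function name only.
FRAME = re.compile(r"\[\d+\]:\s+(\w+)")

# Signature from the MOS note, innermost (signaling) function first.
SIGNATURE = ["k2srec", "k2serv", "opiodr", "ttcpip"]


def stack_functions(trace_text):
    """Return frame function names in the order they appear in the dump."""
    return FRAME.findall(trace_text)


def matches_signature(functions, signature):
    """True if `signature` occurs as a contiguous run inside `functions`."""
    n = len(signature)
    return any(functions[i:i + n] == signature
               for i in range(len(functions) - n + 1))


# Frames copied from the incident trace above (works flattened or line by line).
sample = (
    "[00]: dbgexProcessError [diag_dde][01]: dbgeExecuteForError [diag_dde]"
    "[02]: dbgePostErrorKGE [diag_dde][03]: dbkePostKGE_kgsf [rdbms_dde]"
    "[04]: kgeadse [][05]: kgerinv_internal [][06]: kgerinv []"
    "[07]: kgeasnmierr [][08]: k2srec []<-- Signaling[09]: k2serv []"
    "[10]: opiodr [][11]: ttcpip [][12]: opitsk [][13]: opiino []"
    "[14]: opiodr [][15]: opidrv [][16]: sou2o [][17]: opimai_real []"
    "[18]: ssthrdmain [][19]: main [][20]: __start []"
)

functions = stack_functions(sample)
is_match = matches_signature(functions, SIGNATURE)
print("signature found:", is_match)
```

A match on the call stack only suggests the same code path; whether the bug applies still has to be confirmed against the MOS note for the installed version.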
1. Download and apply Patch 14669432.
Unpublished Bug 14669432 is fixed in onwards.