问题描述
刚接手一套数据库,数据库执行SQL时提示 ORA-00600: internal error code, arguments: [6704], [2], [1405099], [953234], [], [], [], [], [], [], [], []],下面简单记录一下。
#CHECK DB Alert log
adrci> show incident ADR Home = /oracle/app/oracle/diag/rdbms/anbob/anbob1: ************************************************************************* INCIDENT_ID PROBLEM_KEY CREATE_TIME -------------------- ----------------------------------------------------------- ---------------------------------------- 604431 ORA 603 2014-01-01 03:47:35.535000 +08:00 607239 ORA 603 2014-01-01 03:48:06.268000 +08:00 654545 ORA 600 [kgantc_1] 2014-01-27 15:29:22.834000 +08:00 675729 ORA 603 2014-01-27 15:32:50.200000 +08:00 677322 ORA 600 [FILE:psdmsc.c LINE:795 ID:OCIKCallPush] 2014-04-01 17:19:01.494000 +08:00 681378 ORA 603 2014-06-01 05:05:39.366000 +08:00 680218 ORA 603 2014-06-01 05:05:39.417000 +08:00 676178 ORA 445 2014-07-01 08:21:36.993000 +08:00 676010 ORA 445 2014-08-01 00:29:27.336000 +08:00 680154 ORA 603 2014-09-01 08:14:35.356000 +08:00 679754 ORA 445 2014-09-01 09:21:00.207000 +08:00 701471 ORA 603 2014-09-02 06:43:20.432000 +08:00 700543 ORA 603 2014-09-02 06:44:01.439000 +08:00 702671 ORA 603 2014-09-02 06:49:17.915000 +08:00 700927 ORA 600 [kdddgb2] 2014-10-01 00:01:18.243000 +08:00 701935 ORA 600 [6704] <<<< 2014-10-01 00:02:45.452000 +08:00 701927 ORA 600 [kdddgb2] 2014-10-01 00:02:45.464000 +08:00 723749 ORA 603 2014-10-10 14:00:45.568000 +08:00 726558 ORA 600 [6704] <<<< 2014-11-01 00:01:32.396000 +08:00 724174 ORA 600 [6704] 2014-11-01 00:01:32.396000 +08:00 ... 747790 ORA 603 2014-11-25 17:15:22.582000 +08:00 747800 ORA 603 2014-11-25 17:16:17.296000 +08:00 725310 ORA 600 [6704] <<<<< 2014-12-01 00:01:42.253000 +08:00 724030 ORA 445 2014-12-01 05:43:53.934000 +08:00 751458 ORA 603 2014-12-01 09:20:27.569000 +08:00 751402 ORA 603 2014-12-02 11:12:37.346000 +08:00 753066 ORA 603 2014-12-03 11:45:00.517000 +08:00 747890 ORA 32701 2014-12-22 01:19:10.370000 +08:00 747891 ORA 32701 2014-12-22 01:20:46.883000 +08:00 42 rows fetched adrci> show incident -mode detail -p "incident_id=725310" ADR Home = /oracle/app/oracle/diag/rdbms/anbob/anbob1: ************************************************************************* ********************************************************** INCIDENT INFO RECORD 1 ********************************************************** INCIDENT_ID 725310 STATUS ready CREATE_TIME 2014-12-01 00:01:42.253000 +08:00 PROBLEM_ID 8 CLOSE_TIME FLOOD_CONTROLLED none ERROR_FACILITY ORA ERROR_NUMBER 600 ERROR_ARG1 6704 ERROR_ARG2 2 ERROR_ARG3 1405099 ERROR_ARG4 953234 ERROR_ARG5 ERROR_ARG6 ERROR_ARG7 ERROR_ARG8 ERROR_ARG9 ERROR_ARG10 ERROR_ARG11 ERROR_ARG12 SIGNALLING_COMPONENT SIGNALLING_SUBCOMPONENT SUSPECT_COMPONENT SUSPECT_SUBCOMPONENT ECID IMPACTS 0 PROBLEM_KEY ORA 600 [6704] FIRST_INCIDENT 701935 FIRSTINC_TIME 2014-10-01 00:02:45.452000 +08:00 LAST_INCIDENT 725310 LASTINC_TIME 2014-12-01 00:01:42.253000 +08:00 IMPACT1 0 IMPACT2 0 IMPACT3 0 IMPACT4 0 KEY_NAME ProcId KEY_VALUE 194.83 KEY_NAME SID KEY_VALUE 147.52491 KEY_NAME PQ KEY_VALUE (25999740, 1417363285) KEY_NAME Client ProcId KEY_VALUE oracle@anbob (TNS V1-V3).852726_1 OWNER_ID 1 INCIDENT_FILE /oracle/app/oracle/diag/rdbms/anbob/anbob1/incident/incdir_725310/anbob1_ora_852726_i725310.trc OWNER_ID 1 INCIDENT_FILE /oracle/app/oracle/diag/rdbms/anbob/anbob1/trace/anbob1_ora_852726.trc 1 rows fetched adrci> show trace /oracle/app/oracle/diag/rdbms/anbob/anbob1/incident/incdir_725310/anbob1_ora_852726_i725310.trc Output the results to file: /tmp/utsout_4719434_1_2.ado "/tmp/utsout_4719434_1_2.ado" 109538 lines, 6682863 characters /oracle/app/oracle/diag/rdbms/anbob/anbob1/incident/incdir_725310/anbob1_ora_852726_i725310.trc ---------------------------------------------------------- LEVEL PAYLOAD ----- ------------------------------------------------------------------------------------------------------------------------------------ Dump file /oracle/app/oracle/diag/rdbms/anbob/anbob1/incident/incdir_725310/anbob1_ora_852726_i725310.trc Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production With the Partitioning, Real Application Clusters, Automatic Storage Management, Oracle Label Security, OLAP, Data Mining, Oracle Database Vault and Real Application Testing option ORACLE_HOME = /oracle/app/oracle/product/11.2.0/db_1 System name: AIX Node name: anbob Release: 1 Version: 6 Machine: 00F87EE44C00 Instance name: anbob1 Redo thread mounted by this instance: 1 Oracle process number: 194 Unix process pid: 852726, image: oracle@anbob *** 2014-12-01 00:01:42.302 *** SESSION ID:(147.52491) 2014-12-01 00:01:42.302 *** CLIENT ID:() 2014-12-01 00:01:42.302 *** SERVICE NAME:(anbob_1_2) 2014-12-01 00:01:42.302 *** MODULE NAME:(accountindb@weejar281 (TNS V1-V3)) 2014-12-01 00:01:42.302 *** ACTION NAME:() 2014-12-01 00:01:42.302 Dump continued from file: /oracle/app/oracle/diag/rdbms/anbob/anbob1/trace/anbob1_ora_852726.trc 1> ***** Error Stack ***** ORA-00600: internal error code, arguments: [6704], [2], [1405099], [953234], [], [], [], [], [], [], [], [] 1< ***** Error Stack ***** 1> ***** Dump for incident 725310 (ORA 600 [6704]) ***** *** 2014-12-01 00:01:42.314 dbkedDefDump(): Starting incident default dumps (flags=0x2, level=3, mask=0x0) 2> ***** Current SQL Statement for this session (sql_id=2f7xhhh2svw9d) ***** update tabxxx set bundle_id = :bundle_id, customer_id = :cust_id, region_code=:region_code, pay_unit = :pay_unit, real_u nit = :real_unit, toll_unit = :toll_unit, call_times = :call_times, discharge = :discharge, fee1 = :fee1, fee2 = :fee2, fee3 = :fee3, fee4 = :fee4, fee11= :fee11, fee12 = :fee12, fee13 = :fee13, fee14 = :fee14 ,item_source_id=:item_source_id,strategy_id=:strategy_id where user_id = :user_id and account_id = :account_id and fee_kind = :fee_kind and city_code = :city_code and service_kind = :service_kind and special_weejar = 0 and fee_date = '201411' returning 1 into :update_rowid 2< ***** current_sql_statement ***** 2> ***** Call Stack Trace ***** calling call entry argument values in hex location type point (? means dubious value) -------------------- -------- -------------------- ---------------------------- skdstdst()+40 bl 107c60464 000000000 ? 000000001 ? ksedst1()+112 call skdstdst() 13AB7C69F823AC58 ? ksedst()+40 call ksedst1() 30300000000 ? 002050033 ? dbkedDefDump()+1516 call ksedst() 000000000 ? 000000000 ? FFFFFFFFFFEFF20 ? 000000000 ? 000000000 ? 000000000 ? 000000000 ? 300000003 ? ksedmp()+72 call dbkedDefDump() 300000000 ? 110000AC8 ? FFFFFFFFFFF0510 ? 4842424409A0F964 ? 10011CA24 ? 109A0F958 ? FFFFFFFFFFF0560 ? 11065B3A0 ? ksfdmp()+100 call ksedmp() 000000000 ? 000000000 ? 009A0F973 ? 10AB28B98 ? 200000000000000 ? 000000000 ? 11089C290 ? 1106D93F8 ? dbgexPhaseII()+1940 call ksfdmp() 000002004 ? 1106D93F8 ? 000000000 ? FFFFFFFFFFF06D8 ? FFFFFFFFFFF0600 ? 000000000 ? 10024B778 ? 110000B00 ? dbgexProcessError() call dbgexPhaseII() 1106D93F8 ? 1108A1978 ? +932 0000B113E ? 200000000 ? FFFFFFFFFFF12E8 ? 00000006C ? 700000C13FE97C8 ? 1106E3298 ? dbgeExecuteForError call dbgexProcessError() 1106D93F8 ? 11089C290 ? ()+72 1FFFF46D0 ? 000000000 ? 000000000 ? 00000000F ? 022C18D82 ? 11089DFD8 ? dbgePostErrorKGE()+ call dbgeExecuteForError 000000000 ? 000000004 ? 2048 () 0FFFFFFFF ? 110000AC8 ? dbkePostKGE_kgsf()+ call dbgePostErrorKGE() 110000CB0 ? 1099D5F20 ? 68 25800000000 ? 000000000 ? kgeade()+364 call dbkePostKGE_kgsf() 000000000 ? 000000000 ? kgeriv_int()+116 call kgeade() 000000038 ? 700000B314914D0 ? kgeriv()+36 call kgeriv_int() 700000B31491450 ? 0000000BC ? 700000B3C009028 ? 110BDB560 ? FFFFFFFFFFF55C0 ? 4422488139346014 ? 10580E62C ? 000000002 ? kgeasi()+232 call kgeriv() 110BF4490 ? 110A63E38 ? 110B76700 ? FFFFFFFFFFF6208 ? 700000B3C971BF0 ? 110A63E38 ? 000000000 ? 0000000A9 ? kdugetpart()+272 call kgeasi() 109046C04 ? E8B92000E8B92 ? updrow()+15796 call kdugetpart() 10089DEFC ? 000000007 ? qerupFetch()+856 call updrow() 000000000 ? 7FFF00000000 ? 000000000 ? 000000000 ? FFFFFFFFFFF5F90 ? 110000CB0 ? FFFFFFFFFFF64F0 ? 000000001 ? updaul()+1100 call qerupFetch() 700000B3F7AC068 ? 110BF45A8 ? 000000000 ? 000000000 ? 7FFF00000000 ? 000000000 ? 000000000 ? 000000000 ? updThreePhaseExe()+ call updaul() 700000B3C971BF0 ? 336 FFFFFFFFFFF7078 ? 000000000 ? updexe()+344 call updThreePhaseExe() 1FFFF78A8 ? 100000001 ? opiexe()+14476 call updexe() 700000B3C971BF0 ? 300000000 ? 000000000 ? FFFFFFFFFFF78A8 ? 000000000 ? FFFFFFFFFFF7AA8 ? FFFFFFFFFFF79E0 ? 000000000 ? kpoal8()+4616 call opiexe() FFFFFFFFFFF8C90 ? 11065B3A0 ? opiodr()+720 call kpoal8() 1FFFF9D30 ? 200000000 ? ttcpip()+1028 call opiodr() 5E1010C870 ? 1C000003C8 ? FFFFFFFFFFFA890 ? 000000000 ? opitsk()+1508 call ttcpip() 11010C870 ? 11065B3A0 ? opiino()+940 call opitsk() 110043CC0 ? 000000000 ? opiodr()+720 call opiino() 3C009A7C7C ? opidrv()+1132 call opiodr() 3C0AA3BE28 ? 41010B878 ? 11065B3A0 ? sou2o()+136 call opidrv() 3C07E7DA50 ? 400000000 ? 11065B3A0 ? opimai_real()+608 call sou2o() FFFFFFFFFFFF440 ? 000000001 ? ssthrdmain()+268 call opimai_real() 9001000A070DE30 ? 9001000A077BD18 ? sou2o()+136 call opidrv() 3C07E7DA50 ? 400000000 ? 11065B3A0 ? opimai_real()+608 call sou2o() FFFFFFFFFFFF440 ? 000000001 ? A0000000A000000 ? 10B146F30 ? ssthrdmain()+268 call opimai_real() 9001000A070DE30 ? 9001000A077BD18 ? main()+204 call ssthrdmain() 2F0003640 ? FFFFFFFFFFFF888 ? __start()+112 call main() 000000000 ? 000000000 ? 000000000 ? 000000000 ? 000000000 ? 000000000 ? 000000000 ? 000000000 ? 2< ***** call_stack_dump *****
专家解答
TIP:
从今年的错误来看,该错误具有周期性,只在1号出现,应该是帐期特殊的事务触发的问题,从错误的Trace文件分析,应该是从10月份开始该SQL触发, 引起的ora-600 【6704】
ORA-600 [6704] "bad internal rowid when updating partition table" ORA-00600: [6704] ARGUMENTS: Arg [a] 1 for a partitioned table, 2 for a non-partitioned table Arg [b] Data object number from the table Arg [c] Data object number from the input ROWID
SQL> @ddl WEEJAR.tabxxx PL/SQL procedure successfully completed. DBMS_METADATA.GET_DDL(OBJECT_TYPE,OBJECT_NAME,OWNER) ----------------------------------------------------------------------------------------------------------------- CREATE TABLE "WEEJAR"."tabxxx" ( "SERVICE_KIND" NUMBER(4,0) NOT NULL ENABLE, "USER_ID" NUMBER(12,0) NOT NULL ENABLE, "ACCOUNT_ID" NUMBER(12,0) NOT NULL ENABLE, "CUSTOMER_ID" NUMBER(12,0) NOT NULL ENABLE, "BUNDLE_ID" NUMBER(10,0) DEFAULT 0 NOT NULL ENABLE, ... "FEE14" NUMBER(12,2) DEFAULT 0 NOT NULL ENABLE, "ITEM_SOURCE_ID" NUMBER(9,0) DEFAULT 0 NOT NULL ENABLE, "STRATEGY_ID" NUMBER(9,0) DEFAULT 0 NOT NULL ENABLE ) PCTFREE 10 PCTUSED 0 INITRANS 1 MAXTRANS 255 NOCOMPRESS NOLOGGING STORAGE( BUFFER_POOL DEFAULT FLASH_CACHE DEFAULT CELL_FLASH_CACHE DEFAULT) TABLESPACE "TS_TAB_GATHER" PARTITION BY RANGE ("CITY_CODE","FEE_DATE") (PARTITION "PART_000_MAX" VALUES LESS THAN ('000', MAXVALUE) SEGMENT CREATION DEFERRED PCTFREE 10 PCTUSED 0 INITRANS 1 MAXTRANS 255 NOCOMPRESS NOLOGGING STORAGE( INITIAL 16384 BUFFER_POOL DEFAULT FLASH_CACHE DEFAULT CELL_FLASH_CACHE DEFAULT) TABLESPACE "TS_TAB_GATHER" , PARTITION "PART_180_201308" VALUES LESS THAN ('180', '201309') SEGMENT CREATION IMMEDIATE PCTFREE 10 PCTUSED 0 INITRANS 1 MAXTRANS 255 NOCOMPRESS NOLOGGING STORAGE(INITIAL 8388608 NEXT 1048576 MINEXTENTS 1 MAXEXTENTS 2147483645 PCTINCREASE 0 FREELISTS 1 FREELIST GROUPS 1 BUFFER_POOL DEFAULT FLASH_CACHE DEFAULT CELL_FLASH_CACHE DEFAULT) TABLESPACE "TS_TAB_GATHER" , ...
奇怪的是2是 non-partitioned ,但该表分区。
SQL> select owner,data_object_id,subobject_name,object_name,object_type,created from dba_objects where object_name='tabxxx' and owner='WEEJAR' and data_object_id in(1405099,953234) OWNER DATA_OBJECT_ID SUBOBJECT_NAME OBJECT_NAME OBJECT_TYPE CREATED ------------------------------ -------------- -------------------- -------------------- ------------------- ----------------- WEEJAR 953234 PART_189_201411 tabxxx TABLE PARTITION 20130923 12:05:19 WEEJAR 1405099 PART_720_201312 tabxxx TABLE PARTITION 20130923 12:05:19
从上面的TRACE 文件中确认是下面的SQL 引起:
update tabxxx set bundle_id = :bundle_id, customer_id = :cust_id, region_code=:region_code, pay_unit = :pay_unit, real_u nit = :real_unit, toll_unit = :toll_unit, call_times = :call_times, discharge = :discharge, fee1 = :fee1, fee2 = :fee2, fee3 = :fee3, fee4 = :fee4, fee11= :fee11, fee12 = :fee12, fee13 = :fee13, fee14 = :fee14 ,item_source_id=:item_source_id,strategy_id=:strategy_id where user_id = :user_id and account_id = :account_id and fee_kind = :fee_kind and city_code = :city_code and service_kind = :service_kind and special_weejar = 0 and fee_date = '201411' returning 1 into :update_rowid;
根据MOS 上查询,目前只有这一个未公开BUG和目前发现的3个现象,现象3# 很相似。
Bug 8768508 - ORA-600: [6704] on merge against Global Temporary Table (文档 ID 8768508.8) Versions confirmed as being affected 11.2.0.3 10.2.0.4 The fix for 8768508 is first included in 12.1.0.1 Release Rediscovery Notes 1) You are executing a MERGE into a GLOBAL TEMP table that contains no rows. 2) The MERGE is executed within a FORALL loop. 3) An ORA-00600: [6704] is raised when updating a row.
分析:
To do the following:
analyze table partition () validate structure cascade into invalid_rows;
most likely you will need to execute $ORACLE_HOME/rdbms/admin/utlvalid.sql if the table is partitioned
analyze table WEEJAR.tabxxx partition (PART_189_201411) validate structure cascade into invalid_rows ;
Note:
analyze table will requested TM lock. you can use “online” option like the following:
analyze table WEEJAR.tabxxx partition (PART_189_201411) validate structure cascade online into invalid_rows ;
”
ONLINE option for ANALYZE VALIDATE STRUCTURE statement
You can specify that you want to perform structure validation online while DML is occurring against the object being validated.
There can be a slight performance impact when validating with ongoing DML affecting the object, but this is offset by the flexibility
of being able to perform ANALYZE online.
”
SQL> analyze table WEEJAR.tabxxx partition (PART_189_201411) validate structure cascade online into invalid_rows ; Table analyzed. SQL> select * from invalid_rows; no rows selected
Note:
另一个分区表对象被删除了,所以无法判断是不是在另一个分区表上的索引有逻辑corrupt block. 但是又观察了一个月发现没有再出现。
另外ora-600 kgantc_1 一般是因为ctrl+c 或kill -9后引起的。