问题描述
公司的测试服务环境被整崩溃了,简单记录一下错误。
同事发现一个测试数据库无法登录,对数据库执行SHUTDOWN ABORT后,重新启动bootstrap报错:
[ora11g@hpserver2 ~]$ sqlplus / as sysdba SQL*Plus: Release 11.2.0.3.0 Production on Sun Mar 4 00:37:58 2012 Copyright (c) 1982, 2011, Oracle. All rights reserved. Connected to an idle instance. SQL> startup ORACLE instance started. Total System Global Area 2555744256 bytes Fixed Size 2230912 bytes Variable Size 637535616 bytes Database Buffers 1862270976 bytes Redo Buffers 53706752 bytes Database mounted. ORA-01092: ORACLE instance terminated. Disconnection forced ORA-00704: bootstrap process failure ORA-00604: error occurred at recursive SQL level 1 ORA-08103: object no longer exists Process ID: 20964 Session ID: 125 Serial number: 5
专家解答
数据库怎么会突然出现这么严重的问题,检查告警日志发现了如下的错误:
Errors in file /u02/app/oracle/diag/rdbms/orcl11g/orcl11g/trace/orcl11g_ora_31804.trc (incident=127465): ORA-00600: internal error code, arguments: [ktecgsc:kcbz_objdchk], [0], [0], [1], [], [], [], [], [], [], [], [] Incident details in: /u02/app/oracle/diag/rdbms/orcl11g/orcl11g/incident/incdir_127465/orcl11g_ora_31804_i127465.trc Use ADRCI or Support Workbench to package the incident. See Note 411.1 at My Oracle Support for error and packaging details. Dumping diagnostic data in directory=[cdmp_20120302133841], requested by (instance=1, osid=31804), summary=[incident=127465]. Sweep [inc][127465]: completed Sweep [inc2][127465]: completed Errors in file /u02/app/oracle/diag/rdbms/orcl11g/orcl11g/trace/orcl11g_ora_31900.trc (incident=127449): ORA-00600: internal error code, arguments: [kkdlGetBaseUser2:authIdType], [0], [27], [GLOBAL_AQ_USER_ROLE], [], [], [], [], [], [], [], [] Incident details in: /u02/app/oracle/diag/rdbms/orcl11g/orcl11g/incident/incdir_127449/orcl11g_ora_31900_i127449.trc Use ADRCI or Support Workbench to package the incident. See Note 411.1 at My Oracle Support for error and packaging details. Dumping diagnostic data in directory=[cdmp_20120302134009], requested by (instance=1, osid=31900), summary=[incident=127449]. Sweep [inc][127449]: completed Sweep [inc2][127449]: completed
检查第一个ORA-600对应的TRACE,发现出现错误的会话在执行导入:
[ora11g@hpserver2 ~]$ more /u02/app/oracle/diag/rdbms/orcl11g/orcl11g/trace/orcl11g_ora_31804.trc Trace file /u02/app/oracle/diag/rdbms/orcl11g/orcl11g/trace/orcl11g_ora_31804.trc Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 – 64bit Production With the Partitioning, Oracle Label Security and Real Application Testing options ORACLE_HOME = /u02/app/oracle/product/11.2.0/db_1 System name: Linux Node name: hpserver2.enmotech.com Release: 2.6.32-100.28.5.el6.x86_64 Version: #1 SMP Wed Feb 2 18:40:23 EST 2011 Machine: x86_64 Instance name: orcl11g Redo thread mounted by this instance: 1 Oracle process number: 29 Unix process pid: 31804, image: oracle@hpserver2.enmotech.com (TNS V1-V3) *** 2012-03-02 13:38:34.271 *** SESSION ID:(141.3959) 2012-03-02 13:38:34.271 *** CLIENT ID:() 2012-03-02 13:38:34.271 *** SERVICE NAME:(SYS$USERS) 2012-03-02 13:38:34.271 *** MODULE NAME:(imp@hpserver2.enmotech.com (TNS V1-V3)) 2012-03-02 13:38:34.271 *** ACTION NAME:() 2012-03-02 13:38:34.271 *** SESSION ID:(141.3959) 2012-03-02 13:38:34.270 DATA seg.obj=-2, on-disk obj=27, dsflg=0, dsobj=74, cls=4 Incident 127465 created, dump file: /u02/app/oracle/diag/rdbms/orcl11g/orcl11g/incident/incdir_127465/orcl11g_ora_31804_i127465.trc ORA-00600: internal error code, arguments: [ktecgsc:kcbz_objdchk], [0], [0], [1], [], [], [], [], [], [], [], [] *** SESSION ID:(141.3959) 2012-03-02 13:38:40.815 DATA seg.obj=0, on-disk obj=27, dsflg=0, dsobj=74, cls=4
和同事确认了一下,确实有个同事在导入数据字典表,而且本来应该导入到单独的用户下,但是导入时忘了添加TOUSER,导致数据被导入到SYS用户下,造成了数据字典的不一致,从而引发了这个问题。
这里并不是要解决这个错误,事实上如果没有完整的备份,数据库已经很难恢复了。这里主要是记录这个问题,希望给其他人引以为戒。
DBA总要做一些高风险的操作,而DBA又有足够的知识和能力导致问题一般情况下难以产生的前提要求得以实现。
比如当前这次事故,如果不是DBA,没有一定专业的知识,无法使用IMP方式来导入数据字典表。而对于缺少了TOUSER参数的情况下,表会导入到SYS用户下,但是这会出现表已经存在的错误,而使得导入失败。而对于DBA来说,这种常见错误并不会引发DBA的足够重视,一个IGNORE=Y参数的使用,最终酿成了这次事故。
「喜欢这篇文章,您的关注和赞赏是给作者最好的鼓励」
关注作者
【版权声明】本文为墨天轮用户原创内容,转载时必须标注文章的来源(墨天轮),文章链接,文章作者等基本信息,否则作者和墨天轮有权追究责任。如果您发现墨天轮中有涉嫌抄袭或者侵权的内容,欢迎发送邮件至:contact@modb.pro进行举报,并提供相关证据,一经查实,墨天轮将立刻删除相关内容。