问题描述
环境介绍:
11.2.0.3 rac-single adg,linux环境
1 alert日志报错信息
Errors in file /oracle/app/oracle/diag/rdbms/htz/htz/trace/htz_lgwr_21296.trc: ORA-04020: deadlock detected while trying to lock object SYS.crmadb Mon Oct 20 02:05:33 2014 MRP0: Background Media Recovery terminated with error 449 Errors in file /oracle/app/oracle/diag/rdbms/htz/htz/trace/htz_mrp0_56025.trc: ORA-00449: background process ‘LGWR’ unexpectedly terminated with error 4020 Managed Standby Recovery not using Real Time Apply LGWR (ospid: 21296): terminating the instance due to error 4020
专家解答
通过上面信息,可以得到LGWR因为ORA-04020错误被终止,导致数据库实例被异常关闭
查看htz_lgwr_21296.trc日志文件信息
A deadlock among DDL and parse locks is detected. This deadlock is usually due to user errors in the design of an application or from issuing a set of concurrent statements which can cause a deadlock. This should not be reported to Oracle Support. The following information may aid in finding the errors which cause the deadlock: ORA-04020: deadlock detected while trying to lock object SYS.htzadb ——————————————————– object waiting waiting blocking blocking handle session lock mode session lock mode ——– ——– ——– —- ——– ——– —- 0xf57d98d10 0xf72e00ed0 0xdd47a1ff8 S 0xf8a820b50 0xf57dd26f8 0 0xdb6f9ffa0 0xf82226890 0xdbd7714f0 S 0xf72e00ed0 0xe6a20c450 X 0xf57d98d10 0xf8a820b50 0xf57dd26f8 X 0xf82226890 0xde8de91e8 S 0xf8a820b50被0xf82226890阻塞 0xf82226890被0xf72e00ed0阻塞 0xf72e00ed0被0xf8a820b50阻塞 这里出现了死锁 下面看看每一个对象的名字 ——————————————————– ———- DUMP OF WAITING AND BLOCKING LOCKS ———- ——————————————————– ————- WAITING LOCK ————- —————————————- SO: 0xdd47a1ff8, type: 78, owner: 0xf086a2548, flag: INIT/-/-/0x00 if: 0x3 c: 0x3 proc=0xf699e18f8, name=LIBRARY OBJECT LOCK, file=kgl.h LINE:8547, pg=0 LibraryObjectLock: Address=0xdd47a1ff8 Handle=0xf57d98d10 RequestMode=S CanBeBrokenCount=1 Incarnation=1 ExecutionCount=0 User=0xf72e00ed0 Session=0xf6adebea8 ReferenceCount=0 Flags=[0000] SavepointNum=316 LibraryHandle: Address=0xf57d98d10 Hash=9dbb48b6 LockMode=S PinMode=0 LoadLockMode=0 Status=0 ObjectName: Name=SYS.htzdb FullHashValue=38f345fe8aac99a711f61f849dbb48b6 Namespace=DBINSTANCE(74) Type=CURSOR(00) Identifier=280064 OwnerIdn=0 Statistics: InvalidationCount=0 ExecutionCount=0 LoadCount=0 ActiveLocks=1 TotalLockCount=215156859 TotalPinCount=0 Counters: BrokenCount=1 RevocablePointer=1 KeepDependency=0 BucketInUse=17242 HandleInUse=17242 HandleReferenceCount=0 Concurrency: DependencyMutex=0xf57d98dc0(0, 0, 0, 0) Mutex=0xf57d98e40(0, 431384674, 75138, 0) Flags=RON/PIN/KEP/BSO/[00810003] WaitersLists: Lock=0xf57d98da0[0xf57dd2768,0xeb7186ca8] Pin=0xf57d98d80[0xf57d98d80,0xf57d98d80] LoadLock=0xf57d98df8[0xf57d98df8,0xf57d98df8] Timestamp: HandleReference: Address=0xf57d98eb8 Handle=(nil) Flags=[00] ————- BLOCKING LOCK ———— —————————————- SO: 0xf57dd26f8, type: 78, owner: 0xf72f836e8, flag: INIT/-/-/0x00 if: 0x3 c: 0x3 proc=0xf79a35860, name=LIBRARY OBJECT LOCK, file=kgl.h LINE:8547, pg=0 LibraryObjectLock: Address=0xf57dd26f8 Handle=0xf57d98d10 RequestMode=X CanBeBrokenCount=1 Incarnation=1 ExecutionCount=0 User=0xf8a820b50 Session=0xf8a820b50 ReferenceCount=0 Flags=[0100] SavepointNum=2 LibraryHandle: Address=0xf57d98d10 Hash=9dbb48b6 LockMode=S PinMode=0 LoadLockMode=0 Status=0 ObjectName: Name=SYS.htzdb FullHashValue=38f345fe8aac99a711f61f849dbb48b6 Namespace=DBINSTANCE(74) Type=CURSOR(00) Identifier=280064 OwnerIdn=0 Statistics: InvalidationCount=0 ExecutionCount=0 LoadCount=0 ActiveLocks=1 TotalLockCount=215156859 TotalPinCount=0 Counters: BrokenCount=1 RevocablePointer=1 KeepDependency=0 BucketInUse=17242 HandleInUse=17242 HandleReferenceCount=0 Concurrency: DependencyMutex=0xf57d98dc0(0, 0, 0, 0) Mutex=0xf57d98e40(0, 431384674, 75138, 0) Flags=RON/PIN/KEP/BSO/[00810003] WaitersLists: Lock=0xf57d98da0[0xf57dd2768,0xeb7186ca8] Pin=0xf57d98d80[0xf57d98d80,0xf57d98d80] LoadLock=0xf57d98df8[0xf57d98df8,0xf57d98df8] Timestamp: HandleReference: Address=0xf57d98eb8 Handle=(nil) Flags=[00] ————- WAITING LOCK ————- —————————————- SO: 0xdbd7714f0, type: 78, owner: 0xf086b0190, flag: INIT/-/-/0x00 if: 0x3 c: 0x3 proc=0xf899dd678, name=LIBRARY OBJECT LOCK, file=kgl.h LINE:8547, pg=0 LibraryObjectLock: Address=0xdbd7714f0 Handle=0xdb6f9ffa0 RequestMode=S CanBeBrokenCount=1 Incarnation=1 ExecutionCount=0 User=0xf82226890 Session=0xf7a2af120 ReferenceCount=0 Flags=[0000] SavepointNum=8c4af LibraryHandle: Address=0xdb6f9ffa0 Hash=32cc0539 LockMode=X PinMode=0 LoadLockMode=0 Status=0 ObjectName: Name=$BUILD$. 969f67910439dee FullHashValue=8bef6dd99c7e1680af8c2d9432cc0539 Namespace=SQL AREA BUILD(82) Type=CURSOR(00) Identifier=0 OwnerIdn=0 Statistics: InvalidationCount=0 ExecutionCount=0 LoadCount=0 ActiveLocks=1 TotalLockCount=57 TotalPinCount=0 Counters: BrokenCount=1 RevocablePointer=1 KeepDependency=0 BucketInUse=59 HandleInUse=59 HandleReferenceCount=0 Concurrency: DependencyMutex=0xdb6fa0050(0, 0, 0, 0) Mutex=0xdb6fa00d0(0, 229, 1, 0) Flags=RON/PIN/[00010000] WaitersLists: Lock=0xdb6fa0030[0xdbd771560,0xdbd771560] Pin=0xdb6fa0010[0xdb6fa0010,0xdb6fa0010] LoadLock=0xdb6fa0088[0xdb6fa0088,0xdb6fa0088] Timestamp: HandleReference: Address=0xdb6fa0150 Handle=(nil) Flags=[00] ————- BLOCKING LOCK ———— —————————————- SO: 0xe6a20c450, type: 78, owner: 0xf086a2548, flag: INIT/-/-/0x00 if: 0x3 c: 0x3 proc=0xf699e18f8, name=LIBRARY OBJECT LOCK, file=kgl.h LINE:8547, pg=0 LibraryObjectLock: Address=0xe6a20c450 Handle=0xdb6f9ffa0 Mode=X CanBeBrokenCount=1 Incarnation=1 ExecutionCount=0 User=0xf72e00ed0 Session=0xf6adebea8 ReferenceCount=1 Flags=CNB/[0001] SavepointNum=316 LibraryHandle: Address=0xdb6f9ffa0 Hash=32cc0539 LockMode=X PinMode=0 LoadLockMode=0 Status=0 ObjectName: Name=$BUILD$. 969f67910439dee FullHashValue=8bef6dd99c7e1680af8c2d9432cc0539 Namespace=SQL AREA BUILD(82) Type=CURSOR(00) Identifier=0 OwnerIdn=0 Statistics: InvalidationCount=0 ExecutionCount=0 LoadCount=0 ActiveLocks=1 TotalLockCount=57 TotalPinCount=0 Counters: BrokenCount=1 RevocablePointer=1 KeepDependency=0 BucketInUse=59 HandleInUse=59 HandleReferenceCount=0 Concurrency: DependencyMutex=0xdb6fa0050(0, 0, 0, 0) Mutex=0xdb6fa00d0(0, 229, 1, 0) Flags=RON/PIN/[00010000] WaitersLists: Lock=0xdb6fa0030[0xdbd771560,0xdbd771560] Pin=0xdb6fa0010[0xdb6fa0010,0xdb6fa0010] LoadLock=0xdb6fa0088[0xdb6fa0088,0xdb6fa0088] Timestamp: HandleReference: Address=0xdb6fa0150 Handle=(nil) Flags=[00] ————- WAITING LOCK ————- —————————————- SO: 0xf57dd26f8, type: 78, owner: 0xf72f836e8, flag: INIT/-/-/0x00 if: 0x3 c: 0x3 proc=0xf79a35860, name=LIBRARY OBJECT LOCK, file=kgl.h LINE:8547, pg=0 LibraryObjectLock: Address=0xf57dd26f8 Handle=0xf57d98d10 RequestMode=X CanBeBrokenCount=1 Incarnation=1 ExecutionCount=0 User=0xf8a820b50 Session=0xf8a820b50 ReferenceCount=0 Flags=[0100] SavepointNum=2 LibraryHandle: Address=0xf57d98d10 Hash=9dbb48b6 LockMode=S PinMode=0 LoadLockMode=0 Status=0 ObjectName: Name=SYS.htzdb FullHashValue=38f345fe8aac99a711f61f849dbb48b6 Namespace=DBINSTANCE(74) Type=CURSOR(00) Identifier=280064 OwnerIdn=0 Statistics: InvalidationCount=0 ExecutionCount=0 LoadCount=0 ActiveLocks=1 TotalLockCount=215156859 TotalPinCount=0 Counters: BrokenCount=1 RevocablePointer=1 KeepDependency=0 BucketInUse=17242 HandleInUse=17242 HandleReferenceCount=0 Concurrency: DependencyMutex=0xf57d98dc0(0, 0, 0, 0) Mutex=0xf57d98e40(0, 431384674, 75138, 0) Flags=RON/PIN/KEP/BSO/[00810003] WaitersLists: Lock=0xf57d98da0[0xf57dd2768,0xeb7186ca8] Pin=0xf57d98d80[0xf57d98d80,0xf57d98d80] LoadLock=0xf57d98df8[0xf57d98df8,0xf57d98df8] Timestamp: HandleReference: Address=0xf57d98eb8 Handle=(nil) Flags=[00] ————- BLOCKING LOCK ———— —————————————- SO: 0xde8de91e8, type: 78, owner: 0xf086e3620, flag: INIT/-/-/0x00 if: 0x3 c: 0x3 proc=0xf899dd678, name=LIBRARY OBJECT LOCK, file=kgl.h LINE:8547, pg=0 LibraryObjectLock: Address=0xde8de91e8 Handle=0xf57d98d10 Mode=S CanBeBrokenCount=1 Incarnation=1 ExecutionCount=0 User=0xf82226890 Session=0xf82226890 ReferenceCount=1 Flags=CNB/[0001] SavepointNum=8b0a9 LibraryHandle: Address=0xf57d98d10 Hash=9dbb48b6 LockMode=S PinMode=0 LoadLockMode=0 Status=0 ObjectName: Name=SYS.htzdb FullHashValue=38f345fe8aac99a711f61f849dbb48b6 Namespace=DBINSTANCE(74) Type=CURSOR(00) Identifier=280064 OwnerIdn=0 Statistics: InvalidationCount=0 ExecutionCount=0 LoadCount=0 ActiveLocks=1 TotalLockCount=215156859 TotalPinCount=0 Counters: BrokenCount=1 RevocablePointer=1 KeepDependency=0 BucketInUse=17242 HandleInUse=17242 HandleReferenceCount=0 Concurrency: DependencyMutex=0xf57d98dc0(0, 0, 0, 0) Mutex=0xf57d98e40(0, 431384674, 75138, 0) Flags=RON/PIN/KEP/BSO/[00810003] WaitersLists: Lock=0xf57d98da0[0xf57dd2768,0xeb7186ca8] Pin=0xf57d98d80[0xf57d98d80,0xf57d98d80] LoadLock=0xf57d98df8[0xf57d98df8,0xf57d98df8] Timestamp: HandleReference: Address=0xf57d98eb8 Handle=(nil) Flags=[00] ——————————————————– This lock request was aborted. error 4020 detected in background process ORA-04020: deadlock detected while trying to lock object SYS.htzdb kjzduptcctx: Notifying DIAG for crash event —– Abridged Call Stack Trace —– ksedsts()+461<-kjzdssdmp()+267<-kjzduptcctx()+232<-kjzdicrshnfy()+53<-ksuitm()+1332<-ksbrdp()+3344<-opirip()+623<-opidrv()+603<-sou2o()+103<-opimai_real()+266<-ssthrdmain()+252<-main()+201<-__libc_start_main()+253<-_start()+36 —– End of Abridged Call Stack Trace —– *** 2014-10-20 02:05:33.948 LGWR (ospid: 21296): terminating the instance due to error 4020 ksuitm: waiting up to [5] seconds before killing DIAG(21282)
可以参数MOS:ORA-04020 Deadlocks: Most Common Causes (文档 ID 166924.1)里面有详细的介绍怎么阅读此trace文件
2 查询MOS信息
通过MOS可以发现下面的BUG信息
Bug 17292239 – Active Dataguard Hangs waiting for library cache lock on DBINSTANCE. Tracking bug to deliver backports to remove fix for 11664426 (Doc ID 17292239.8) Bug 17018214 – ORA-600 [krdrsb_end_qscn_2] ORA-4021 in Active Dataguard Standby Database with fix for bug 16717701 present – Instance may crash (Doc ID 17018214.8) Bug 16717701 – Active Dataguard Hangs waiting for library cache lock on DBINSTANCE namespace with possible deadlock – Superseded (Doc ID 16717701.8) Bug 11664426 – Query hangs with “kksfbc child completion” wait – superseded (Doc ID 11664426.8)
在17292239中,提到有如下的描述,在主库发现大量的DDL语句
A Standby Active Dataguard database may encounter severe library cache lock contention with possible deadlock causing the database to hang when there are a lot of DDLs on the Primary Database.
在17018214中,提到有如下的描述
Note: For interim patches please use this fix instead of 16717701 in order to address both issues. This fix also contains the fix of bug 17292239; it is recommended then to apply this fix alone instead of a Merged fix between 17018214 and 17292239. *** IMPORTANT *** This fix is DISABLED by default in ALL releases. In 11.2.0.4 the fix is controlled by hidden parameter “_adg_parselock_timeout=<value in centiseconds>. For backports/one-off patches, the fix must be enabled by setting event 16717701 at level 104887600. The value of the event encodes two things: 1. Timeout – this is the amount of time that LGWR will wait for an X lock before signaling(发出信号) a timeout error and retrying. 2. Sleep duration – this is the amount of time that LGWR will sleep for after having timed out. Both of the above values are denoted(表明) in MILLISECONDS. The event value encodes the sleep duration in its 12 high-order bits and the timeout in the 20 lower order bits. The value can be calculated using the following formula: value = (S * 1048576) + T where S = sleep duration in milliseconds T = timeout in milliseconds Although these values may need some fine tuning for each case, we recommend that all customers initially start with a 30 second timeout and 100 ms sleep duration. In other words, LGWR will wait for up to 30 seconds to acquire the X-lock. If it fails, then it will retry after sleeping for 100 milliseconds. Using the above formula, the event value would then have to be set to: value = (100 * 1048576) + 30000 = 104887600 ** In a RAC ADG configuration, the event must have the same value on all instances. It may not be set or altered in a rolling manner ***
可以通过打补丁并配置event来解决
3 查询DDL语句
通过查询audit_ddl_obj可以统计ddl语句
TO_CHAR(OPR_TIME COUNT(*) —————- ———- 2014-10-20 01:38 4 2014-10-20 01:39 3 2014-10-20 01:43 2 2014-10-20 01:44 3 2014-10-20 01:45 6 2014-10-20 01:48 4 2014-10-20 01:49 3 2014-10-20 01:53 2 2014-10-20 01:54 3 2014-10-20 01:58 4 2014-10-20 01:59 3 2014-10-20 02:00 17 2014-10-20 02:01 5 2014-10-20 02:03 7 2014-10-20 02:04 53 2014-10-20 02:05 10 2014-10-20 02:06 15 2014-10-20 02:07 20 2014-10-20 02:08 4 2014-10-20 02:09 8 2014-10-20 02:13 2 2014-10-20 02:14 3 SQL> select sid, count(*) 2 from system.audit_ddl_obj 3 where to_char(opr_time, ‘yyyy-mm-dd hh24:mi’) = ‘2014-10-20 02:04’ 4 group by sid 5 order by 2 6 ; SID COUNT(*) ———- ———- 4959 3 8028 50 SQL> select USER_NAME, ddl_type, ddl_sql, object_type, object_name 2 from system.audit_ddl_obj 3 where to_char(opr_time, ‘yyyy-mm-dd hh24:mi’) = ‘2014-10-20 02:04’ 4 and sid = 8028 5 ; USER_NAME DDL_TYPE DDL_SQL OBJECT_TYPE OBJECT_NAME —————————— ——————– ———- —————— —————————— SYS CREATE TABLE ORA_TEMP_2_DS_1531250 SYS GRANT OBJECT PRIVILEGE ORA_TEMP_2_DS_1531250 SYS ALTER TABLE ORA_TEMP_2_DS_1531250 SYS TRUNCATE TABLE ORA_TEMP_2_DS_1531250 SYS DROP TABLE ORA_TEMP_2_DS_1531250 SYS CREATE TABLE ORA_TEMP_2_DS_1531251 SYS GRANT OBJECT PRIVILEGE ORA_TEMP_2_DS_1531251 SYS ALTER TABLE ORA_TEMP_2_DS_1531251 SYS TRUNCATE TABLE ORA_TEMP_2_DS_1531251
通过查询,发现是由于自动统计信息导致的大量DDL语句
4,故障分析总结
通过上面分析,可以确认是由于主库大量的DDL语句使备库触发BUG17292239,通过打补丁并配置event可以解决。
「喜欢这篇文章,您的关注和赞赏是给作者最好的鼓励」
关注作者
【版权声明】本文为墨天轮用户原创内容,转载时必须标注文章的来源(墨天轮),文章链接,文章作者等基本信息,否则作者和墨天轮有权追究责任。如果您发现墨天轮中有涉嫌抄袭或者侵权的内容,欢迎发送邮件至:contact@modb.pro进行举报,并提供相关证据,一经查实,墨天轮将立刻删除相关内容。