以前一个客户今天联系我,有个死锁的问题,如下的客户提供的信息,看上去有点意思,如下:
可以看到这是同一个表的不同的2行数据,在进行DML操作的时候出现了死锁。下面我们来看下Mode是多少:
这里的event是enq: TX - row lock contention,而54580006其实是分为2部分,前面是Name,后面的0006是mode。
从执行计划来看,出现了INLIST ITERATOR操作,这是一个迭代操作。说明sql是根据or或in操作有关系,这一点从前面的SQL可以看出来。
下面我在自己的10.2.0.4环境进行模拟一下,再现这个问题。
我们可以看到,成功模拟出了这个ora-00600死锁的情况,下我们来看下trace 文件的内容:
接着我们来看下这个process的dump信息:
搜索Plan关键字,可以定位到该SQL的执行计划:
我们可以看到执行计划中出现了INLIST ITERATOR。至于Or的情况,大家可以自行模拟。不过我这里模拟的情况和客户的实际情况
还是有一点细微差异,他这里的执行计划中还有一个回表的操作:TABLE ACCESS BY INDEX ROWID
不过,这个模拟已经可以说明问题了。对于enq: TX - row lock contention的死锁,如果Mode=6,那么其实只有这样一样情况。
这里需要我们注意的是,有些人以为这里是row lock contention,那么肯定是因为Index的关系,其实这是错误的理解。
对于row的操作,Oracle首先会申请一个表级别的TM锁,然后再申请Tx. 这里是否存在Index其实一点关系都没有.
为了说明这一点,我drop掉Index,一样会出现这个死锁,如下:
其中process dump如下:
此时的SQL执行计划肯定是全表扫描了,如下:
说了这么多,对于这种enq: TX - row lock contention 死锁,如果mode=6,那么唯一的解决方法就是kill 会话或者调整业务逻辑。
如果你遇到的enq: TX - row lock contention 死锁的mode为4即使shared mode,那么请检查如下几个方面是否存在问题:
1) ITL争用,建议调整INITRANS
2)涉及unique Index,那么说明index可能涉及到重复键值
3)检查是否用了位图Index。
如果是TM死锁,那么通常是外键缺乏Index导致。
---------Blocker(s)-------- ---------Waiter(s)---------
Resource Name process session holds waits process session holds waits
TX-000a001e-0001720c 35 105 X 41 142 X
TX-00030028-0000c5bd 41 142 X 35 105 X
session 105: DID 0001-0023-0003A7FB session 142: DID 0001-0029-0018FFCA
session 142: DID 0001-0029-0018FFCA session 105: DID 0001-0023-0003A7FB
Rows waited on:
Session 142: obj - rowid = 0000CE48 - AAAM5IAAHAAC0fyAAC
(dictionary objn - 52808, file - 7, block - 739314, slot - 2)
Session 105: obj - rowid = 0000CE48 - AAAM5IAAHAAC0fzAAo
(dictionary objn - 52808, file - 7, block - 739315, slot - 40)
Information on the OTHER waiting sessions:
Session 142:
pid=41 serial=37613 audsid=4133092 user: 73/B1xxxxMANAGER
O/S info: user: Administrator, term: xxxxAPP1, ospid: 3736:920, machine: WORKGROUP\xxxxAPP1
program: ISETDA.xxxx.Server.APPServer.exe
application name: ISETDA.xxxx.Server.APPServer.exe, hash value=1815860971
Current SQL Statement:
UPDATE MMSPLT SET LOC = :0 , CHANGED_TIME = :1 , WORKER = :2 WHERE (PLT = :3 )
End of information on OTHER waiting sessions.
Current SQL statement for this session:
UPDATE MMSPLT SET interface_yn = 'R' WHERE erpsnd in ('1','2') AND interface_yn = 'N'
===================================================
可以看到这是同一个表的不同的2行数据,在进行DML操作的时候出现了死锁。下面我们来看下Mode是多少:
last wait for 'enq: TX - row lock contention' blocking sess=0x0x139250a60 seq=59 wait_time=2930837 seconds since wait started=2
name|mode=54580006, usn<<16 | slot=30028, sequence=c5bd
Dumping Session Wait History
for 'enq: TX - row lock contention' count=1 wait_time=2930837
name|mode=54580006, usn<<16 | slot=30028, sequence=c5bd
这里的event是enq: TX - row lock contention,而54580006其实是分为2部分,前面是Name,后面的0006是mode。
============
Plan Table
============
-----------------------------------------------------+-----------------------------------+
| Id | Operation | Name | Rows | Bytes | Cost | Time |
-----------------------------------------------------+-----------------------------------+
| 0 | UPDATE STATEMENT | | | | 16 | |
| 1 | UPDATE | MMSPLT | | | | |
| 2 | INLIST ITERATOR | | | | | |
| 3 | TABLE ACCESS BY INDEX ROWID | MMSPLT | 86 | 430 | 16 | 00:00:01 |
| 4 | INDEX RANGE SCAN | MMSPLT_IDX02| 124 | | 4 | 00:00:01 |
-----------------------------------------------------+-----------------------------------+
从执行计划来看,出现了INLIST ITERATOR操作,这是一个迭代操作。说明sql是根据or或in操作有关系,这一点从前面的SQL可以看出来。
下面我在自己的10.2.0.4环境进行模拟一下,再现这个问题。
SQL> create table t_deadlock as select owner,object_id,object_name
2 from dba_objects where object_id < 5;
Table created.
SQL> select count(1) from t_deadlock;
COUNT(1)
----------
3
SQL> alter table t_deadlock minimize records_per_block;
Table altered.
SQL> insert into t_deadlock select owner,object_id,object_name
2 from dba_objects where object_id > 5 and object_id < 15;
9 rows created.
SQL> commit;
Commit complete.
SQL> select object_id,dbms_rowid.rowid_object(rowid) obj#,
2 dbms_rowid.rowid_relative_fno(rowid) rfile#,
3 dbms_rowid.rowid_block_number(rowid) block#,
4 dbms_rowid.rowid_row_number(rowid) row#
5 from t_deadlock order by 4,5;
OBJECT_ID OBJ# RFILE# BLOCK# ROW#
---------- ---------- ---------- ---------- ----------
4 55870 5 2084 0
3 55870 5 2084 1
2 55870 5 2084 2
11 55870 5 2085 0
10 55870 5 2085 1
9 55870 5 2085 2
8 55870 5 2086 0
7 55870 5 2086 1
6 55870 5 2086 2
14 55870 5 2088 0
13 55870 5 2088 1
12 55870 5 2088 2
12 rows selected.
下面分别开2个会话窗口来进行模拟:
Session 1:
SQL> select sid from v$Mystat where rownum < 2;
SID
----------
159
SQL> update t_deadlock set owner='killdb' where object_id in(5,6,7);
2 rows updated.
Session 2:
SQL> select sid from v$mystat where rownum < 2;
SID
----------
149
SQL> rollback;
Rollback complete.
SQL> update t_deadlock set owner='xxoo' where object_id in(9,10,11);
4 rows updated.
Ssession 1:
SQL> update t_deadlock set owner='xxoo' where object_id in(9,10,11);
update t_deadlock set owner='xxoo' where object_id in(8,9,10,11)
*
ERROR at line 1:
ORA-00060: deadlock detected while waiting for resource
Session 2:
SQL> update t_deadlock set owner='killdb' where object_id in(5,6,7);
.....一直等待
我们可以看到,成功模拟出了这个ora-00600死锁的情况,下我们来看下trace 文件的内容:
---------Blocker(s)-------- ---------Waiter(s)---------
Resource Name process session holds waits process session holds waits
TX-00020013-0000024b 15 159 X 18 149 X
TX-00080019-00000351 18 149 X 15 159 X
session 159: DID 0001-000F-00000012 session 149: DID 0001-0012-0000000E
session 149: DID 0001-0012-0000000E session 159: DID 0001-000F-00000012
Rows waited on:
Session 149: obj - rowid = 0000DA3E - AAANo+AAFAAAAgmAAC
(dictionary objn - 55870, file - 5, block - 2086, slot - 2)
Session 159: obj - rowid = 0000DA3E - AAANo+AAFAAAAglAAC
(dictionary objn - 55870, file - 5, block - 2085, slot - 2)
Information on the OTHER waiting sessions:
Session 149:
pid=18 serial=16 audsid=320188 user: 58/ROGER
O/S info: user: Roger-PC\Roger, term: ROGER-PC, ospid: 2188:4172, machine: WORKGROUP\ROGER-PC
program: sqlplus.exe
application name: SQL*Plus, hash value=3669949024
Current SQL Statement:
update t_deadlock set owner='killdb' where object_id in(5,6,7)
End of information on OTHER waiting sessions.
Current SQL statement for this session:
update t_deadlock set owner='xxoo' where object_id in(9,10,11)
===================================================
接着我们来看下这个process的dump信息:
----------------------------------------
SO: 000007FF1EB84E20, type: 4, owner: 000007FF1EA62988, flag: INIT/-/-/0x00
(session) sid: 159 trans: 000007FF1D17C3E8, creator: 000007FF1EA62988, flag: (100041) USR/- BSY/-/-/-/-/-
DID: 0001-000F-00000012, short-term DID: 0000-0000-00000000
txn branch: 0000000000000000
oct: 6, prv: 0, sql: 000007FF1E552258, psql: 000007FF19EAB960, user: 58/ROGER
service name: SYS$USERS
O/S info: user: Roger-PC\Roger, term: ROGER-PC, ospid: 7176:7164, machine: WORKGROUP\ROGER-PC
program: sqlplus.exe
application name: SQL*Plus, hash value=3669949024
last wait for 'enq: TX - row lock contention' blocking sess=0x000007FF1EB77810 seq=128 wait_time=2999550 seconds since wait started=12
name|mode=54580006, usn<<16 | slot=20025, sequence=24b
Dumping Session Wait History
for 'enq: TX - row lock contention' count=1 wait_time=2999550
name|mode=54580006, usn<<16 | slot=20025, sequence=24b
for 'enq: TX - row lock contention' count=1 wait_time=3000600
name|mode=54580006, usn<<16 | slot=20025, sequence=24b
for 'enq: TX - row lock contention' count=1 wait_time=3000050
name|mode=54580006, usn<<16 | slot=20025, sequence=24b
for 'enq: TX - row lock contention' count=1 wait_time=2999524
name|mode=54580006, usn<<16 | slot=20025, sequence=24b
.........
搜索Plan关键字,可以定位到该SQL的执行计划:
============
Plan Table
============
----------------------------------------+-----------------------------------+
| Id | Operation | Name | Rows | Bytes | Cost | Time |
----------------------------------------+-----------------------------------+
| 0 | UPDATE STATEMENT | | | | 4 | |
| 1 | UPDATE | T_DEADLOCK| | | | |
| 2 | INLIST ITERATOR | | | | | |
| 3 | INDEX RANGE SCAN | IDX_ID_T | 9 | 270 | 2 | 00:00:01 |
----------------------------------------+-----------------------------------+
我们可以看到执行计划中出现了INLIST ITERATOR。至于Or的情况,大家可以自行模拟。不过我这里模拟的情况和客户的实际情况
还是有一点细微差异,他这里的执行计划中还有一个回表的操作:TABLE ACCESS BY INDEX ROWID
不过,这个模拟已经可以说明问题了。对于enq: TX - row lock contention的死锁,如果Mode=6,那么其实只有这样一样情况。
这里需要我们注意的是,有些人以为这里是row lock contention,那么肯定是因为Index的关系,其实这是错误的理解。
对于row的操作,Oracle首先会申请一个表级别的TM锁,然后再申请Tx. 这里是否存在Index其实一点关系都没有.
SID TY ID1 ID2 LOCK_TYPE REQUEST CTIME BLOCK
---------- -- ---------- ---------- ------------------- ---------- ---------- ----------
149 TX 262168 606 None 6 156 0
159 TM 55870 0 Row Exclusive 0 195 0
149 TM 55870 0 Row Exclusive 0 189 0
149 TX 131109 587 Exclusive 0 189 0
159 TX 262168 606 Exclusive 0 195 1
为了说明这一点,我drop掉Index,一样会出现这个死锁,如下:
Deadlock graph:
---------Blocker(s)-------- ---------Waiter(s)---------
Resource Name process session holds waits process session holds waits
TX-00060008-00000267 18 149 X 15 159 X
TX-0008000c-00000350 15 159 X 18 149 X
session 149: DID 0001-0012-0000000A session 159: DID 0001-000F-0000000F
session 159: DID 0001-000F-0000000F session 149: DID 0001-0012-0000000A
Rows waited on:
Session 159: obj - rowid = 0000DA3E - AAANo+AAFAAAAglAAA
(dictionary objn - 55870, file - 5, block - 2085, slot - 0)
Session 149: obj - rowid = 0000DA3E - AAANo+AAFAAAAgmAAB
(dictionary objn - 55870, file - 5, block - 2086, slot - 1)
Information on the OTHER waiting sessions:
Session 159:
pid=15 serial=5 audsid=310187 user: 58/ROGER
O/S info: user: Roger-PC\Roger, term: ROGER-PC, ospid: 19232:14592, machine: WORKGROUP\ROGER-PC
program: sqlplus.exe
application name: SQL*Plus, hash value=3669949024
Current SQL Statement:
update t_deadlock set owner='google' where object_id >6 and object_id < 13
End of information on OTHER waiting sessions.
Current SQL statement for this session:
update t_deadlock set owner='google' where object_id >6 and object_id < 13
===================================================
其中process dump如下:
----------------------------------------
SO: 000007FF1EB77810, type: 4, owner: 000007FF1EA64200, flag: INIT/-/-/0x00
(session) sid: 149 trans: 000007FF1D157328, creator: 000007FF1EA64200, flag: (41) USR/- BSY/-/-/-/-/-
DID: 0001-0012-0000000A, short-term DID: 0000-0000-00000000
txn branch: 0000000000000000
oct: 6, prv: 0, sql: 000007FF19DE9768, psql: 000007FF19F9D9E0, user: 58/ROGER
service name: SYS$USERS
O/S info: user: Roger-PC\Roger, term: ROGER-PC, ospid: 15668:4172, machine: WORKGROUP\ROGER-PC
program: sqlplus.exe
application name: SQL*Plus, hash value=3669949024
last wait for 'enq: TX - row lock contention' blocking sess=0x000007FF1EB84E20 seq=25 wait_time=3000153 seconds since wait started=244
name|mode=54580006, usn<<16 | slot=8000c, sequence=350
Dumping Session Wait History
for 'enq: TX - row lock contention' count=1 wait_time=3000153
name|mode=54580006, usn<<16 | slot=8000c, sequence=350
for 'enq: TX - row lock contention' count=1 wait_time=3000073
name|mode=54580006, usn<<16 | slot=8000c, sequence=350
此时的SQL执行计划肯定是全表扫描了,如下:
============
Plan Table
============
----------------------------------------+-----------------------------------+
| Id | Operation | Name | Rows | Bytes | Cost | Time |
----------------------------------------+-----------------------------------+
| 0 | UPDATE STATEMENT | | | | 4 | |
| 1 | UPDATE | T_DEADLOCK| | | | |
| 2 | TABLE ACCESS FULL | T_DEADLOCK| 6 | 180 | 4 | 00:00:01 |
----------------------------------------+-----------------------------------+
说了这么多,对于这种enq: TX - row lock contention 死锁,如果mode=6,那么唯一的解决方法就是kill 会话或者调整业务逻辑。
如果你遇到的enq: TX - row lock contention 死锁的mode为4即使shared mode,那么请检查如下几个方面是否存在问题:
1) ITL争用,建议调整INITRANS
2)涉及unique Index,那么说明index可能涉及到重复键值
3)检查是否用了位图Index。
如果是TM死锁,那么通常是外键缺乏Index导致。
「喜欢这篇文章,您的关注和赞赏是给作者最好的鼓励」
关注作者
【版权声明】本文为墨天轮用户原创内容,转载时必须标注文章的来源(墨天轮),文章链接,文章作者等基本信息,否则作者和墨天轮有权追究责任。如果您发现墨天轮中有涉嫌抄袭或者侵权的内容,欢迎发送邮件至:contact@modb.pro进行举报,并提供相关证据,一经查实,墨天轮将立刻删除相关内容。




