数据库中有多种多样的锁,它可以帮我们实现需要的功能,但在正常业务中发生因为锁导致业务上的堵塞也是时有发生的。如果碰到这种情况没有预先准备好的SQL,查起来总是有些手忙脚乱。为了避免抓瞎,这里整理了一些SQL作为储备。

以下SQL都是在MySQL 5.7版本以后中使用
因DML之间发生的innodb lock
这里我们使用的是 sys.innodb_lock_waits
表,官方的介绍是记录事务中等待的innodb lock
(hese views summarize the InnoDB locks that transactions are waiting for.)
当被锁的SQL不多的时候这个表直接查询就行
mysql> select * from sys.innodb_lock_waits\G
*************************** 1. row ***************************
wait_started: 2023-09-11 22:06:29
wait_age: 00:03:43
wait_age_secs: 223
locked_table: `tb1`.`t2`
locked_table_schema: tb1
locked_table_name: t2
locked_table_partition: NULL
locked_table_subpartition: NULL
locked_index: GEN_CLUST_INDEX
locked_type: RECORD
waiting_trx_id: 2238707
waiting_trx_started: 2023-09-11 22:06:29
waiting_trx_age: 00:03:43
waiting_trx_rows_locked: 2
waiting_trx_rows_modified: 0
waiting_pid: 7221
waiting_query: update t2 set age=1112 where id=100
waiting_lock_id: 140446538434128:58:4:21:140446437302408
waiting_lock_mode: X,REC_NOT_GAP
blocking_trx_id: 2238668
blocking_pid: 7220
blocking_query: NULL
blocking_lock_id: 140446538433320:58:4:21:140446437296048
blocking_lock_mode: X
blocking_trx_started: 2023-09-11 22:05:29
blocking_trx_age: 00:04:43
blocking_trx_rows_locked: 3
blocking_trx_rows_modified: 2
sql_kill_blocking_query: KILL QUERY 7220
sql_kill_blocking_connection: KILL 7220
但是线上很少这种这么简单地情况,所以我这里稍微过滤了部分列,并加了注释
mysql> select waiting_pid waiting_pid被锁processlist_id,wait_started wait_started等待开始时间,wait_age wait_age等待时长,locked_table locked_table被锁的表,locked_index locked_index被锁的索引,waiting_query waiting_query被锁的SQL,waiting_pid waiting_pid持有锁processlist_id ,blocking_trx_started blocking_trx_started持有锁开始时间,blocking_trx_age blocking_trx_age持有锁时长 ,sql_kill_blocking_query sql_kill_blocking_query杀死查询SQL,sql_kill_blocking_connection sql_kill_blocking_connection杀死会话 from sys.innodb_lock_waits order by blocking_trx_started\G
*************************** 1. row ***************************
waiting_pid被锁processlist_id: 7221
wait_started等待开始时间: 2023-09-11 22:06:29
wait_age等待时长: 00:04:42
locked_table被锁的表: `tb1`.`t2`
locked_index被锁的索引: GEN_CLUST_INDEX
waiting_query被锁的SQL: update t2 set age=1112 where id=100
waiting_pid持有锁processlist_id: 7221
blocking_trx_started持有锁开始时间: 2023-09-11 22:05:29
blocking_trx_age持有锁时长: 00:05:42
sql_kill_blocking_query杀死查询SQL: KILL QUERY 7220
sql_kill_blocking_connection杀死会话: KILL 7220
*************************** 2. row ***************************
waiting_pid被锁processlist_id: 7222
wait_started等待开始时间: 2023-09-11 22:08:17
wait_age等待时长: 00:02:54
locked_table被锁的表: `tb1`.`t2`
locked_index被锁的索引: id_id
waiting_query被锁的SQL: update t2 set age=333 where id=100
waiting_pid持有锁processlist_id: 7222
blocking_trx_started持有锁开始时间: 2023-09-11 22:06:29
blocking_trx_age持有锁时长: 00:04:42
sql_kill_blocking_query杀死查询SQL: KILL QUERY 7221
sql_kill_blocking_connection杀死会话: KILL 7221
2 rows in set (0.00 sec)
这里的是因为 processlist_id=7220 的做了update t2 set age=11
的全表锁且不提交导致其他会话所等待。这里根据 blocking_trx_started持有锁开始时间
被锁的最早那个会话进行排序,找到最开始锁等待的会话判断持有锁的processlist_id。(上面这个情况比较特殊,因为其他两个回话的都是执行 主键id都是100的变更 ,所以即使kill 7220 ,也不会让7222回话执行成功;)
MDL锁的查询
我这里介绍的是两种情况下的 MDL锁异常。一种是全局锁,在进行备份时或手动执行 flush tables with read lock;另一种是有未提交的事务或者大查询,此时进行进行DDL操作,也会发生MDL锁等待;
全局锁的情况
我用到的是performance_schema.metadata_locks
表配合 performance_schema.threads
找到processlist_id
但是需要注意 metadata_locks
的记录需要用到 lock/metadata/sql/mdl
工具,默认是关的,因此需要将其开启:
my.cnf 中配置
开启
performance-schema-instrument='wait/lock/metadata/sql/mdl=ON'
关闭
performance-schema-instrument='wait/lock/metadata/sql/mdl=OFF'
动态修改
开启:
UPDATE performance_schema.setup_instruments
SET ENABLED = 'YES', TIMED = 'YES'
WHERE NAME = 'wait/lock/metadata/sql/mdl';
关闭
UPDATE performance_schema.setup_instruments
SET ENABLED = 'NO', TIMED = 'NO'
WHERE NAME = 'wait/lock/metadata/sql/mdl';
查询SQL
mysql>select t.processlist_id,concat('kill ',t.processlist_id) kill_connection, t.name,t.processlist_user,t.processlist_host,t.processlist_command from performance_schema.threads t join performance_schema.metadata_locks ml on ml.owner_thread_id = t.thread_id where object_type='global' and lock_type='shared';
+----------------+-----------------+---------------------------+------------------+------------------+---------------------+
| PROCESSLIST_ID | kill_connection | NAME | PROCESSLIST_USER | PROCESSLIST_HOST | PROCESSLIST_COMMAND |
+----------------+-----------------+---------------------------+------------------+------------------+---------------------+
| 7245 | kill 7245 | thread/sql/one_connection | root | localhost | Sleep |
+----------------+-----------------+---------------------------+------------------+------------------+---------------------+
1 row in set (0.00 sec)
object_type='global' and lock_type='shared'
表示全局锁
表级别的MDL锁
一般场景是一个session 执行的dml操作没有提交,或者有慢查询 。此时进行对该表的 DDL操作 ,如果是两个回话这种简单场景下直接查询sys.schema_table_lock_waits
就能看到明显的处理办法。
mysql> select * from sys.schema_table_lock_waits stlw where waiting_pid !=blocking_pid\G
*************************** 1. row ***************************
object_schema: tb1
object_name: t2
waiting_thread_id: 9291
waiting_pid: 7245
waiting_account: root@localhost
waiting_lock_type: EXCLUSIVE
waiting_lock_duration: TRANSACTION
waiting_query: alter table t2 drop age
waiting_query_secs: 35
waiting_query_rows_affected: 0
waiting_query_rows_examined: 0
blocking_thread_id: 9295
blocking_pid: 7249
blocking_account: root@localhost
blocking_lock_type: SHARED_WRITE
blocking_lock_duration: TRANSACTION
sql_kill_blocking_query: KILL QUERY 7249
sql_kill_blocking_connection: KILL 7249
1 row in set (0.00 sec)
这里面waiting_pid !=blocking_pid
如果不用这个,这个表的信息会提示也可以kill 那个 被锁的会话(执行 alter table t2 drop age
的会话)解决问题。也许这就是解决不了问题就解决发现问题的人的意思吧
但线上会话会很多,这个SQL输出结果就乱了,你可以根据 group by sql_kill_blocking_connection
的数量 来判断哪个最多来找那个最有可能kill的会话 ;一般出现这种情况是有其他会话在执行大查询,或者 DML操作后静止没有提交,所以可以加一些条件
mysql> select stlw.* from sys.schema_table_lock_waits stlw join performance_schema.threads t on stlw.blocking_pid=t.processlist_id where waiting_pid !=blocking_pid and (t.PROCESSLIST_INFO LIKE 'SELECT%' or t.PROCESSLIST_INFO is null )\G
*************************** 1. row ***************************
object_schema: tb1
object_name: t2
waiting_thread_id: 9303
waiting_pid: 7257
waiting_account: root@localhost
waiting_lock_type: SHARED_WRITE
waiting_lock_duration: TRANSACTION
waiting_query: update t2 set age=333 where id=100
waiting_query_secs: 204
waiting_query_rows_affected: 0
waiting_query_rows_examined: 0
blocking_thread_id: 9295
blocking_pid: 7249
blocking_account: root@localhost
blocking_lock_type: SHARED_WRITE
blocking_lock_duration: TRANSACTION
sql_kill_blocking_query: KILL QUERY 7249
sql_kill_blocking_connection: KILL 7249
*************************** 2. row ***************************
object_schema: tb1
object_name: t2
waiting_thread_id: 9291
waiting_pid: 7245
waiting_account: root@localhost
waiting_lock_type: EXCLUSIVE
waiting_lock_duration: TRANSACTION
waiting_query: alter table t2 drop age
waiting_query_secs: 464
waiting_query_rows_affected: 0
waiting_query_rows_examined: 0
blocking_thread_id: 9295
blocking_pid: 7249
blocking_account: root@localhost
blocking_lock_type: SHARED_WRITE
blocking_lock_duration: TRANSACTION
sql_kill_blocking_query: KILL QUERY 7249
sql_kill_blocking_connection: KILL 7249
2 rows in set (0.00 sec)
当原因是有大查询时也可以再精确些条件比如在PROCESSLIST_INFO
中加上表的名字,这里就不总结了。
注意!
sys.schema_table_lock_waits
表是没法看到全局的MDL锁的performance_schema.metadata_locks
是没法看到表级别的 MDL锁的performance_schema.metadata_locks
需要手动开启lock/metadata/sql/mdlsys.innodb_lock_waits
没有记录MDL锁的信息
MySQL 5.6 的SQL
在MySQL5.6 中没有这么方便的视图,这里我只写了一个关于 普通DML操作的锁的SQL,因为用的不多所以总结的较少,当然也不建议用这么旧的版本。
SELECT p2.`HOST` as '被阻塞host' ,
p2.`USER` as '被阻塞用户',
r.trx_id as '被阻塞事务id',
r.trx_mysql_thread_id '被阻塞事务ID' ,
TIMESTAMPDIFF(
SECOND,
r.trx_wait_started,
CURRENT_TIMESTAMP
) as '等待时间',
r.trx_query as '被阻塞的查询',
l.lock_table as '阻塞方锁住的表',
m.`lock_mode` as '被阻塞防的锁模式', m.`lock_type` as '被阻塞方所类型,表或行锁' ,
p.`HOST` as '阻塞方主机',
p.`USER` as '阻塞方用户',
b.trx_id as '阻塞方事务ID',
b.trx_mysql_thread_id '阻塞方线程ID',
b.trx_query as '阻塞方查询' ,
l.lock_data as '阻塞方事务锁定记录的主键值',
IF (p.COMMAND = 'Sleep', CONCAT(p.TIME,' 秒'), 0) as '阻塞方事务空闲的时间'
FROM
information_schema.INNODB_LOCK_WAITS w
INNER JOIN information_schema.INNODB_TRX b ON b.trx_id = w.blocking_trx_id
INNER JOIN information_schema.INNODB_TRX r ON r.trx_id = w.requesting_trx_id
INNER JOIN information_schema.INNODB_LOCKS l ON w.blocking_lock_id = l.lock_id AND l.`lock_trx_id`=b.`trx_id`
INNER JOIN information_schema.INNODB_LOCKS m ON m.`lock_id`=w.`requested_lock_id` AND m.`lock_trx_id`=r.`trx_id`
INNER JOIN information_schema. PROCESSLIST p ON p.ID = b.trx_mysql_thread_id
INNER JOIN information_schema. PROCESSLIST p2 ON p2.ID = r.trx_mysql_thread_id
ORDER BY '等待时间'
DESC \G;




