LCL: A Lock Chain Length-based Distributed
Algorithm for Deadlock Detection and Resolution
Zhenkun Yang, Chen Qian, Xuwang Teng, Fanyu Kong, Fusheng Han, Quanqing Xu
OceanBase
OceanBaseLabs@list.alibaba-inc.com
Abstract—The problem of deadlock detection and resolution
in database systems has been studied for decades. While it has
long been a mature feature of classical centralized database
systems for many years, its use in distributed database systems
remains in its infancy. Don P. Mitchell and Michael J. Mer-
ritt (M&M) proposed a simple and fully distributed deadlock
detection and resolution algorithm, but its assumption that
each process waits on only one resource at a time prevents it
from being generally applicable. Inspired by this algorithm, we
design and implement LCL (Lock Chain Length), an elegant and
generally applicable algorithm for resource deadlock detection
and resolution in distributed environments without a restriction
of the above kind. Our extensive emulation experiments show that
the proposed approach LCL significantly outperforms the state-
of-the-art competitor M&M. In addition, it has been applied to
the OceanBase distributed relational database system, and our
extensive experiments in OceanBase illustrate that LCL is also
more efficient than M&M in deadlock detection and resolution.
Index Terms—Deadlock Detection, Transaction, Distributed
Database, Wait for Graph.
I. INTRODUCTION
Over the past decade, we have developed OceanBase [1] [2]
[3], an open-source [4], distributed relational database system.
Owing to its distributed characteristic, it recorded over 700
million tpmC on the TPC-C benchmark test in 2020 [5] and
more than 15 million QphH@30,000 GB on the TPC-H bench-
mark test in 2021 [6]. Tens of thousands of OceanBase servers
have been deployed in production systems of Alipay.com as
well as banks and other commercial business organizations.
In addition to its distributed characteristic, OceanBase [1]
supports almost all features of mainstream, classical, central-
ized relational database management systems (RDBMS), e.g.,
secondary index, view, trigger, cursor, constraints, functions,
and stored procedures. Deadlock detection in OceanBase is an
important issue, as in a classical centralized RDBMS. Com-
pared with deadlock detection in a centralized database system,
that in a distributed environment is much more challenging as
each site has only a portion of the view of the whole system [7]
[8] [9] [10]. Furthermore, even though a centralized deadlock
detection scheme is possible in a distributed environment, a
distributed scheme is much more natural and attractive owing
to performance- and complexity-related issues [10].
Mitchell and Merritt [11] proposed a simple and fully
distributed deadlock detection (M&M) algorithm. However, it
assumes that each transaction process (vertex) waits on only
one resource at a time, i.e., one outgoing edge per vertex in
the wait for graph (WFG), whereas a vertex may depend on
multiple vertices because it may wait for resource(s) occupied
by vertices in use. For example, owing to performance-
related considerations, parallel execution is critical, especially
in a distributed database system. SQL select often accesses
multiple parts or partitions of a table, and each part or partition
executed in parallel may be blocked by other transaction locks,
causing one transaction to wait for multiple locks. This above
assumption prevents it from becoming a generally applicable
deadlock detection method in a distributed environment.
In this paper, we propose LCL, an elegant, efficient and
easy-to-implement distributed algorithm that can easily detect
and resolve resource deadlocks in a distributed environment.
We make the following contributions to distributed deadlock
detection and resolution technology here:
• We propose a lock chain length (LCL)-based distributed
deadlock detection and resolution algorithm in light of
the requirements of OceanBase. LCL is a fully distributed
algorithm that consists of three stages: LCL proliferation,
spread, and detection. It is implemented and used in
OceanBase.
• We confirm that LCL has the following benefits: First,
each vertex needs neither a WFG global view nor a
local view, but only needs to send information to its
direct downstream vertices (transaction processes that
hold resources that it is waiting for), dozens of bytes each
time. Second, every deadlock can be detected and each
cycle is detected by just one vertex in it, thus simplifying
the problem of the resolution. Third, only a real deadlock
is detected and resolved, and transactions that are not
deadlocked are not rolled back.
• We provide a strict mathematical proof to show that the
LCL algorithm can easily detect and resolve deadlocks
in distributed environments.
• The results of performance evaluation based on a dis-
tributed transaction processing emulator (TPE) on a
server and OceanBase show that LCL significantly out-
performs M&M.
The remainder of this paper is organized as follows: Sec-
tion II gives definitions needed for describing and proving the
LCL algorithm. Section III describes the proposed algorithm
and Section IV provides a strict proof for it. Section V
illustrates LCL in OceanBase. Section VI provides the results
of some emulations of the algorithm, and Section VII provides
some discussion on the implementation of the algorithm and
评论