LCL：A Lock Chain Length-based Distributed Algorithm for Deadlock Detection and Resolution-OceanBase.pdf

章芋文

562

15页

21次

2023-04-27

免费下载

LCL: A Lock Chain Length-based Distributed

Algorithm for Deadlock Detection and Resolution

Zhenkun Yang, Chen Qian, Xuwang Teng, Fanyu Kong, Fusheng Han, Quanqing Xu

OceanBase

OceanBaseLabs@list.alibaba-inc.com

Abstract—The problem of deadlock detection and resolution

in database systems has been studied for decades. While it has

long been a mature feature of classical centralized database

systems for many years, its use in distributed database systems

remains in its infancy. Don P. Mitchell and Michael J. Mer-

ritt (M&M) proposed a simple and fully distributed deadlock

detection and resolution algorithm, but its assumption that

each process waits on only one resource at a time prevents it

from being generally applicable. Inspired by this algorithm, we

design and implement LCL (Lock Chain Length), an elegant and

generally applicable algorithm for resource deadlock detection

and resolution in distributed environments without a restriction

of the above kind. Our extensive emulation experiments show that

the proposed approach LCL signiﬁcantly outperforms the state-

of-the-art competitor M&M. In addition, it has been applied to

the OceanBase distributed relational database system, and our

extensive experiments in OceanBase illustrate that LCL is also

more efﬁcient than M&M in deadlock detection and resolution.

Index Terms—Deadlock Detection, Transaction, Distributed

Database, Wait for Graph.

I. INTRODUCTION

Over the past decade, we have developed OceanBase [1] [2]

[3], an open-source [4], distributed relational database system.

Owing to its distributed characteristic, it recorded over 700

million tpmC on the TPC-C benchmark test in 2020 [5] and

more than 15 million QphH@30,000 GB on the TPC-H bench-

mark test in 2021 [6]. Tens of thousands of OceanBase servers

have been deployed in production systems of Alipay.com as

well as banks and other commercial business organizations.

In addition to its distributed characteristic, OceanBase [1]

supports almost all features of mainstream, classical, central-

ized relational database management systems (RDBMS), e.g.,

secondary index, view, trigger, cursor, constraints, functions,

and stored procedures. Deadlock detection in OceanBase is an

important issue, as in a classical centralized RDBMS. Com-

pared with deadlock detection in a centralized database system,

that in a distributed environment is much more challenging as

each site has only a portion of the view of the whole system [7]

[8] [9] [10]. Furthermore, even though a centralized deadlock

detection scheme is possible in a distributed environment, a

distributed scheme is much more natural and attractive owing

to performance- and complexity-related issues [10].

Mitchell and Merritt [11] proposed a simple and fully

distributed deadlock detection (M&M) algorithm. However, it

assumes that each transaction process (vertex) waits on only

one resource at a time, i.e., one outgoing edge per vertex in

the wait for graph (WFG), whereas a vertex may depend on

multiple vertices because it may wait for resource(s) occupied

by vertices in use. For example, owing to performance-

related considerations, parallel execution is critical, especially

in a distributed database system. SQL select often accesses

multiple parts or partitions of a table, and each part or partition

executed in parallel may be blocked by other transaction locks,

causing one transaction to wait for multiple locks. This above

assumption prevents it from becoming a generally applicable

deadlock detection method in a distributed environment.

In this paper, we propose LCL, an elegant, efﬁcient and

easy-to-implement distributed algorithm that can easily detect

and resolve resource deadlocks in a distributed environment.

We make the following contributions to distributed deadlock

detection and resolution technology here:

• We propose a lock chain length (LCL)-based distributed

deadlock detection and resolution algorithm in light of

the requirements of OceanBase. LCL is a fully distributed

algorithm that consists of three stages: LCL proliferation,

spread, and detection. It is implemented and used in

OceanBase.

• We conﬁrm that LCL has the following beneﬁts: First,

each vertex needs neither a WFG global view nor a

local view, but only needs to send information to its

direct downstream vertices (transaction processes that

hold resources that it is waiting for), dozens of bytes each

time. Second, every deadlock can be detected and each

cycle is detected by just one vertex in it, thus simplifying

the problem of the resolution. Third, only a real deadlock

is detected and resolved, and transactions that are not

deadlocked are not rolled back.

• We provide a strict mathematical proof to show that the

LCL algorithm can easily detect and resolve deadlocks

in distributed environments.

• The results of performance evaluation based on a dis-

tributed transaction processing emulator (TPE) on a

server and OceanBase show that LCL signiﬁcantly out-

performs M&M.

The remainder of this paper is organized as follows: Sec-

tion II gives deﬁnitions needed for describing and proving the

LCL algorithm. Section III describes the proposed algorithm

and Section IV provides a strict proof for it. Section V

illustrates LCL in OceanBase. Section VI provides the results

of some emulations of the algorithm, and Section VII provides

some discussion on the implementation of the algorithm and

Section VIII provides a brief review of related work on

distributed deadlock detection and resolution technologies.

Section IX summarizes the conclusions of this study.

II. DEFINITIONS

In this section, we give some deﬁnitions.

A. Basic Deﬁnitions

Deﬁnition II.1 (A → B). A → B represents an edge from

vertex A to vertex B, which means vertex A (transaction

process) is waiting for some resource being held by vertex

B (transaction process). A is then said to be immediately

upstream of B and B is said to be immediately downstream of

A. A(6=B) → B is used for A 6= B and A → B. Because it is

trivial to ﬁnd and ﬁx a resource deadlock inside a transaction

process, we assume there are no such edges (loops). For

convenience of discussion, we denote A → A.

Deﬁnition II.2 (A ⇒ B). For two vertices A and B, A ⇒ B

if there exists a walk [12] from A to B. A is said to be an

upstream vertex of B and B is said to be a downstream vertex

of A, and A to B are said to be one-way connected. If A ⇒ B

and B ⇒ C, then A ⇒ C. For convenience of discussion, we

denote A ⇒ A.

Deﬁnition II.3 (A ⇔ B). For two vertices A and B, A ⇔ B

if both A ⇒ B and B ⇒ A. A and B are then said to be

two-way connected. If A ⇔ B, then B ⇔ A. If A ⇔ B

and B ⇔ C, then A ⇔ C. It is clear that any two vertices

in a cycle [12] are two-way connected. For convenience of

discussion, we denote A ⇔ A.

Deﬁnition II.4 (hx, yi). hx, yi is an integer with x as its high-

order bits and y as its low-order bits.

Deﬁnition II.5 (Dist(A, B)). This is only meaningful for

A ⇒ B. Dist(A, B):= n. If there exists a walk of length

n from A to B, then, for any walk of length m from A to B,

n ≤ m. Dist(A, A) == 0. If A(6=B) → B, then Dist(A, B)

== 1.

B. Annulus subgraph distance

The strongly connected component [13] SCC(A) indicates

the SCC at which A is located, and has the following proper-

ties. Suppose scc

and scc

are two SCCs:

1) If scc

∩ scc

6= ∅, then scc

== scc

2) If there is X ∈ scc

, Y ∈ scc

, and X ⇔ Y , then

scc

== scc

3) If X ⇔ Y , then SCC(X) == SCC(Y ). So, for a

cycle cyc, SCC(cyc) is used to represent the strongly

connected components containing cyc.

4) If X and Y are two vertices in scc

, then any walk

from X to Y is also in scc

5) If X is not in any cycle, SCC(X) is trivial, i.e.,

SCC(X) == {X}.

Deﬁnition II.6 (USG(scc)). The upstream subgraph (USG)

of scc. USG(scc) = (V S, ES), where V S = {X ∈ W F G :

X ⇒ A, A ∈ scc}, ES = {X → Y : X ∈ V S and Y ∈ V S

and X → Y ∈ W F G}.

Suppose A ∈ scc

and B ∈ scc

. If A ⇒ B, then

USG(scc

) ⊆ USG(scc

); if A ⇔ B, then USG(scc

) ==

USG(scc

). If X is a vertex in U SG(scc

), then any walk

from some vertex to X is also in USG(scc

). Clearly,

scc

⊆ USG(scc

Deﬁnition II.7 (ASG(scc)). The annulus subgraph (ASG) of

scc (as shown in Figure 1). ASG(scc) = (V S, ES), where

V S = {X ∈ W F G : X ∈ USG(scc) and X /∈ scc}, ES =

{X → Y : X ∈ V S and Y ∈ V S and X → Y ∈ W FG}.

$6*VFF

& $

$VJ'LVW%VFF

VFF

Fig. 1. SCC, ASG and AsgDist in WFG

Suppose A ∈ scc

and B ∈ scc

. If A ⇔ B, then

ASG(scc

) == ASG(scc

). If we think that USG(scc

) is

an outer circle and scc

is an inner circle, then ASG(scc

) is

the annulus.

Deﬁnition II.8 (AsgDist(A, scc)). For A ∈ ASG(scc),

AsgDist(A, scc) := max{m : There exists m distinct vertices

(==A), C

m−1

, . . . , C

in ASG(scc) and a vertex C

∈ scc

such that C

(==A) → C

m−1

→. . . →C

→C

For A ∈ scc, AsgDist (A, scc) := 0. If A is in the annulus

ASG(scc), the length is a positive number; if A is in the

inner circle scc, the distance is zero. As shown in Figure 1,

AsgDist(B, scc) = 4.

III. LCL ALGORITHM

In this section, we give the detailed LCL algorithm.

A. Key idea of LCL

LCL is a fully distributed deadlock detection and resolution

algorithm in OceanBase [1], as shown in Algorithm 1. All

messages in LCL have a ﬁxed length (tens of bytes), and

each vertex sends messages only to its immediate downstream

vertices, i.e., transaction processes that hold the resources for

which it is waiting, and only one deadlocked process detects

and resolves the deadlock in each deadlocked cycle. In LCL,

each vertex requires neither a global view nor even a local

view in WFG. We present the following deﬁnitions:

Deﬁnition III.1 (AsgWidth(scc)). AsgWidth(scc) :=

max{AsgDist(X, scc) : X ∈ USG(scc)}. AsgWidth(scc) can

be considered to be the width of ASG(scc), the annulus.

Deﬁnition III.2 (PrivateID, PrID). The PrID of a vertex is a

ﬁxed constant integer, and is unique for each vertex.

of 15

免费下载

关注

评论