
Read Consistency in Distributed Database Based on DMVCC
Jie Shao†§, Boxue Yin§, Bujiao Chen§, Guangshu Wang§, Lin Yang§,
Jianliang Yan§, Jianying Wang§, Weidong Liu†
†Tsinghua University  §Baidu, Inc.
†shao-j14@mails.tsinghua.edu.cn, †liuwd@mail.tsinghua.edu.cn
§{yinboxue,chenbujiao,wangguangshu,yanglin05,yanjianliang,wangjianying}@baidu.com
Abstract—In a traditional distributed database system, the
partitions use two-phase locking (2PL) as the concurrency
control protocol to ensure distributed read consistency. However,
the read lock acquired by a read operation is incompatible
with a write lock, which undermines system performance.
By contrast, in a system at the snapshot isolation level, where
partitions use Multi-Version Concurrency Control (MVCC) as
the concurrency control protocol, distributed read inconsistency
may occur. To achieve read consistency without sacrificing
performance, we propose Distributed Multi-Version Concurrency
Control (DMVCC). With DMVCC, the system supports snapshot
reads, which do not block write operations, and still ensures
distributed read consistency. In this protocol, a transaction
obtains a set of consistent snapshot version numbers at startup
and then uses those numbers to read the corresponding data
stored on each partition. We give a rigorous proof of the
protocol's correctness. We conduct a series of experiments with
a scaled TPC-C benchmark to compare the performance of the
system with and without DMVCC. Our DMVCC-based system
outperforms the system using 2PL at both medium contention
(up to 1.53x speedup) and high contention (up to 2.0x speedup).
Furthermore, when the read/write ratio rises to 1:1, the
throughput of the DMVCC-based system is 290% higher than
that of the system using 2PL. We also evaluate the scalability
of the system.
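As a rough illustration of the idea summarized above, the following Python sketch shows a reader that obtains one snapshot version number per partition at startup and then reads each partition at that version. It is only a toy model under stated assumptions: the `Partition` and `Coordinator` classes, their methods, and the way the version set is published are hypothetical and are not taken from the protocol definition in this paper.

```python
class Partition:
    """Toy partition that keeps every committed version of each key."""

    def __init__(self):
        self.version = 0
        self.history = {}  # key -> list of (version, value), ascending

    def apply(self, key, value):
        """Install a new version of `key` on this partition."""
        self.version += 1
        self.history.setdefault(key, []).append((self.version, value))

    def read(self, key, snapshot_version):
        """Return the newest value of `key` with version <= snapshot_version."""
        visible = [val for ver, val in self.history.get(key, [])
                   if ver <= snapshot_version]
        return visible[-1] if visible else None


class Coordinator:
    """Hypothetical component that tracks a consistent set of versions."""

    def __init__(self, partitions):
        self.partitions = partitions
        self.consistent = {name: 0 for name in partitions}

    def commit(self, writes):
        """Apply a distributed write set, then publish the new version set.

        Assumption: a real system must make this publication atomic with
        respect to readers; this single-threaded toy gets that for free.
        """
        for name, key, value in writes:
            self.partitions[name].apply(key, value)
        self.consistent = {n: p.version for n, p in self.partitions.items()}

    def begin_read(self):
        """Give a starting transaction its snapshot version numbers."""
        return dict(self.consistent)


parts = {"p1": Partition(), "p2": Partition()}
coord = Coordinator(parts)
coord.commit([("p1", "alice", 100), ("p2", "bob", 100)])

snap = coord.begin_read()        # one version number per partition
parts["p1"].apply("alice", 70)   # a transfer that is only half applied

# Reading the latest data on each partition sees the partial transfer.
latest_total = (parts["p1"].read("alice", parts["p1"].version)
                + parts["p2"].read("bob", parts["p2"].version))
# Reading at the snapshot versions sees a consistent state.
snapshot_total = sum(parts[n].read(k, snap[n])
                     for n, k in [("p1", "alice"), ("p2", "bob")])
```

In this toy run, the reader that uses the latest data on each partition observes a total that never existed as a committed state, while the reader that uses its snapshot version numbers does not, and neither reader blocks the writer.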
I. INTRODUCTION
As data volumes grow, many large-scale services at Baidu,
such as Baidu Wallet, can no longer store their data in a single
database. Baidu Wallet, a Chinese counterpart of PayPal,
relies on a distributed online transaction processing
(OLTP) system as its storage backend. OLTP systems require
concurrency control to guarantee consistency [6], [7], so that
services running on top of them function correctly. Without
proper concurrency control, Baidu Wallet could transfer
more money than an account holds, execute a
transfer twice, transfer the wrong amount of money, or
present the wrong balance after a transaction.
While concurrency control is a well-studied field for
single databases, the performance of protocols such as
two-phase locking (2PL) [6] is limited under high-contention
workloads, especially when the database receives long
read-only transactions. Multi-Version Concurrency Control
(MVCC) [7], [11], [14] was proposed to solve this problem.
For read operations, a client is allowed to read historical data
to avoid read-write conflicts, which improves performance
considerably [16].
Figure 1: The throughput of a single database at different
isolation levels as the number of clients increases
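The MVCC read path described above, in which a reader sees historical data instead of waiting for a writer, can be sketched in Python as follows. This is a minimal single-node illustration, not the implementation used by MySQL or by the system in this paper; the `VersionedStore` class and its method names are hypothetical.

```python
from bisect import bisect_right
from collections import defaultdict


class VersionedStore:
    """Toy multi-version key-value store (illustrative only)."""

    def __init__(self):
        self.commit_version = 0             # latest committed version number
        self._versions = defaultdict(list)  # key -> [(version, value)], ascending

    def write(self, key, value):
        """Commit a new version of `key`; older versions stay readable."""
        self.commit_version += 1
        self._versions[key].append((self.commit_version, value))
        return self.commit_version

    def snapshot(self):
        """Return the version number a newly started reader should use."""
        return self.commit_version

    def read(self, key, snapshot_version):
        """Return the newest value of `key` with version <= snapshot_version."""
        history = self._versions.get(key, [])
        versions = [ver for ver, _ in history]
        idx = bisect_right(versions, snapshot_version)
        return history[idx - 1][1] if idx else None


store = VersionedStore()
store.write("balance", 100)
snap = store.snapshot()        # a long read-only transaction starts here
store.write("balance", 70)     # a later writer proceeds without waiting
old_value = store.read("balance", snap)
new_value = store.read("balance", store.snapshot())
```

The reader that started before the second write still sees the value that was current at its snapshot, so no read lock is needed and the writer is never blocked by it.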
To illustrate this, we set up a simple experiment that
compares the performance of a single database at different
isolation levels. We use MySQL [1] as the database, which
uses different concurrency control methods at different
isolation levels: 2PL at the serializable level and MVCC
at the repeatable read level. TPC-C [2] is the benchmark,
and the database contains 5 warehouses. The same
experimental setup is used in Section V. Figure 1
shows the results:
• When the number of clients is fewer than 10, the throughput
is almost the same at the repeatable read level and at
the serializable level, since there is little contention.
• As the number of clients increases, the throughput of
the database at the serializable isolation level drops
sharply, while the throughput at the repeatable read
level remains almost unchanged. In other words, the
drop is caused by read-write conflicts rather than
resource limitations.
2016 IEEE 23rd International Conference on High Performance Computing
978-1-5090-5411-4/16 $31.00 © 2016 IEEE
DOI 10.1109/HiPC.2016.11