
X-SSD: A Storage System with Native Support for
Database Logging and Replication
Sangjin Lee∗
Hanyang University
Republic of Korea
Alberto Lerner
University of Fribourg
Switzerland
André Ryser
University of Fribourg
Switzerland
Kibin Park
Hanyang University
Republic of Korea
Chanyoung Jeon
Hanyang University
Republic of Korea
Jinsub Park
Hanyang University
Republic of Korea
Yong Ho Song
Hanyang University &
Samsung Electronics
Republic of Korea
Philippe Cudré-Mauroux
University of Fribourg
Switzerland
ABSTRACT
Transaction logging and log shipping are standard techniques to
provide recoverability and high availability in data management
systems. They entail an update to a local log file and to a remote
site at every transaction. Modern databases have leveraged technologies
such as Persistent Memory (PM) and RDMA-enabled networking
to perform these updates as fast as possible. This mix of technologies,
however, presents several drawbacks: a lack of portability, a
complex data path, and interoperability issues.
To address these issues, this paper introduces the X-SSD, a new
SSD architecture that mixes NAND Flash and PM memory classes.
An X-SSD device can take transaction log writes on a fast, PM-backed
data path and be responsible for propagating the operation to remote
sites and eventually to NAND Flash storage. We design and
implement an actual reference X-SSD device called Villars to validate
this new architecture. Our experiments show that the Villars
device can offer a more straightforward and robust way to manage
PM on behalf of the database and achieve equally fast results.
CCS CONCEPTS
• Information systems → Database management system engines;
Storage architectures.
KEYWORDS
database-storage codesign, write-ahead log, database replication
ACM Reference Format:
Sangjin Lee, Alberto Lerner, André Ryser, Kibin Park, Chanyoung Jeon,
Jinsub Park, Yong Ho Song, and Philippe Cudré-Mauroux. 2022. X-SSD: A
Storage System with Native Support for Database Logging and Replication.
In Proceedings of the 2022 International Conference on Management of Data
(SIGMOD ’22), June 12–17, 2022, Philadelphia, PA, USA. ACM, New York, NY,
USA, 15 pages. https://doi.org/10.1145/3514221.3526188
∗ The author performed most of the work while visiting the University of Fribourg.
Permission to make digital or hard copies of all or part of this work for personal or
classroom use is granted without fee provided that copies are not made or distributed
for profit or commercial advantage and that copies bear this notice and the full citation
on the first page. Copyrights for components of this work owned by others than ACM
must be honored. Abstracting with credit is permitted. To copy otherwise, or republish,
to post on servers or to redistribute to lists, requires prior specific permission and/or a
fee. Request permissions from permissions@acm.org.
SIGMOD ’22, June 12–17, 2022, Philadelphia, PA, USA
© 2022 Association for Computing Machinery.
ACM ISBN 978-1-4503-9249-5/22/06... $15.00
https://doi.org/10.1145/3514221.3526188
1 INTRODUCTION
Database replication is often performed by copying transactions'
changes into a secondary site before committing these changes
into local storage [41, 62]. Such a mechanism is called (transaction)
log shipping and is present almost universally in databases that
offer replication (e.g., [3, 56, 63]). If the primary database site goes
down, the secondary one can serve as a hot backup, as it has caught
up with all the primary database changes. Achieving this level of
robustness, however, comes at a cost. Transaction logging and log
shipping require writing to storage and exchanging data over the
network, both relatively expensive operations.
Two technologies that recently reached maturity are relevant
in this scenario. The first one is Persistent Memory (PM), and
more specifically, PM in a DIMM form factor that replaces server
memory and can be accessed by an application via load and store
instructions. PM comes in many flavors, such as Intel Optane [31]
or battery-backed DRAM [16]¹. Optane-class PM has, for instance,
proved to be useful in mixed-memory indices [4, 50, 57], and can
offer alternative ways to build a database system [5]. Battery-backed
PM behaves as regular DRAM but is not volatile. The second
technology is RDMA-enabled networks [30]. These networks
transport data with negligible overhead and have been useful, for
instance, in query execution [22, 49, 64]. Just as with PM,
RDMA-enabled networks have also fostered new database designs [11].
PM and RDMA-enabled networks can also help to record and
propagate transaction log updates [75, 78]. In particular, we
consider the case of Main-Memory Databases [19]. They can reach
unprecedented performance levels because they maintain all their
data in DRAM and persist only the transaction log, which therefore
becomes their main bottleneck [51]. Figure 1 (left) depicts
how a typical system can perform log writing and shipping with
PM and RDMA. We can observe in the figure that the database
system is responsible for coordinating several different steps,
sometimes targeting local PM, sometimes remote PM or memory via an
RDMA-enabled NIC, and lastly, fast SSD devices.
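The host-coordinated data path described above can be illustrated with a small sketch. It is a toy in-memory model, not the system shown in Figure 1: the class and method names (LogTarget, HostCoordinatedLog, commit, destage) are hypothetical, and plain Python lists stand in for local PM, remote PM reached over RDMA, and the SSD. The only point it makes is that the database itself has to drive each step.

```python
from dataclasses import dataclass, field


@dataclass
class LogTarget:
    """Stand-in for one destination of a log record (hypothetical name)."""
    name: str
    records: list = field(default_factory=list)

    def append(self, record: bytes) -> None:
        self.records.append(record)


@dataclass
class HostCoordinatedLog:
    """Toy model in which the database drives every step of the data path."""
    local_pm: LogTarget
    remote_pm: LogTarget
    ssd: LogTarget

    def commit(self, record: bytes) -> None:
        # Step 1: record the update in local PM
        # (a real system would issue stores followed by flushes and fences).
        self.local_pm.append(record)
        # Step 2: ship the same record to the secondary site
        # (a real system would post an RDMA write to the remote machine).
        self.remote_pm.append(record)
        # Assumption of this sketch: the commit returns only after both copies exist.

    def destage(self) -> int:
        # Step 3: later, copy records not yet on the SSD out of PM, in log order.
        pending = self.local_pm.records[len(self.ssd.records):]
        for record in pending:
            self.ssd.append(record)
        return len(pending)
```

A caller would invoke commit once per transaction and destage in the background. Each step would use a different API in a real deployment, which is the coordination burden the text attributes to the database system.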
Each of these technologies offers a specific API and presents
some restrictions. The combination of these restrictions creates a
number of issues, including:
• The interaction of RDMA and PM is complex and poorly understood.
For example, using RDMA to update a PM-backed address
on a remote machine may make the update visible, but it does not
¹ The JEDEC standard, which supports DRAM interoperability, calls these NVDIMM-P
and NVDIMM-F types of persistent memory, respectively [23].
Session 14: Modern Hardware and In-memory DBMS