
adjacent stages. Recently, PM-aware LSM-trees [12, 30, 62] have been designed to optimize for PM idiosyncrasies and reduce write stalls.
However, both the read and write performance of such multi-stage
indexing KVSs remains far lower than that of DRAM-only single-stage
indexing KVSs due to limited write concurrency (Observation 4).
In addition, both single-stage and multi-stage indexing KVSs still
insufficiently utilize the bandwidth of PM (Observation 5).
In other words, it is very challenging for existing KVS indexing
structures, single-stage or multi-stage, to achieve high write/read
throughput and controllable DRAM footprint simultaneously. For-
tunately, Observation 5 offers a hint on how to effectively and efficiently
leverage the power of modern hardware. To this end, we propose a
fast-flowing three-stage KVS architecture, FluidKV, to seamlessly
consolidate these two indexing mechanisms. The key idea behind
FluidKV is to combine the high concurrency of single-stage indexes
and the controllable DRAM consumption of multi-stage indexes by
quickly and efficiently merging KVs from the former to the latter.
FluidKV comprises three consecutive processing stages, i.e., FastStore, BufferStore, and StableStore. FastStore employs a concurrent
and key-grained index, along with thread-exclusive logging, to
quickly absorb incoming KVs in a sequential but unsorted manner,
thus achieving high write performance. Adapting to the available hardware bandwidth and fluctuating workload, BufferStore dynamically
flushes FastStore data into a series of persisted and sorted KV-sets
and merge-sorts them into the backend StableStore, to control
the memory footprint in a timely manner. StableStore maintains a global key-range-grained index on large-scale persistent data, thus minimizing
the DRAM footprint while maintaining read performance. FluidKV
presents a PM-friendly index and data block structure in BufferStore
and StableStore to store the persisted index and KVs, efficiently
exploiting the ultra-fast storage.
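To make the data flow concrete, the sketch below (our illustration, not the actual FluidKV code; all type names and in-memory containers are stand-ins for the persistent logs, KV-sets, and PM-resident runs described above) traces how writes move from thread-exclusive logs in FastStore, through sorted KV-sets in BufferStore, into the globally indexed StableStore:

```cpp
// Illustrative sketch (not the actual FluidKV implementation).
#include <cstddef>
#include <map>
#include <string>
#include <utility>
#include <vector>

struct KV { std::string key, value; };

// Stage 1: thread-exclusive append-only logs plus a key-grained index,
// so writes are absorbed sequentially but unsorted.
struct FastStore {
    std::vector<std::vector<KV>> thread_logs;                              // one log per thread
    std::map<std::string, std::pair<std::size_t, std::size_t>> index;      // key -> (thread, slot)

    explicit FastStore(std::size_t nthreads) : thread_logs(nthreads) {}

    void put(std::size_t tid, KV kv) {
        thread_logs[tid].push_back(kv);                                    // sequential append
        index[kv.key] = {tid, thread_logs[tid].size() - 1};
    }
};

// Stage 2: flushes FastStore into sorted, persisted KV-sets, freeing DRAM.
struct BufferStore {
    std::vector<std::map<std::string, std::string>> sorted_sets;

    void flush(FastStore& fs) {
        std::map<std::string, std::string> set;
        for (auto& log : fs.thread_logs)
            for (auto& kv : log) set[kv.key] = kv.value;                   // sort while flushing
        sorted_sets.push_back(std::move(set));
        for (auto& log : fs.thread_logs) log.clear();                      // reclaim DRAM
        fs.index.clear();
    }
};

// Stage 3: a single key-range-grained index over large-scale sorted data.
struct StableStore {
    std::map<std::string, std::string> runs;                               // stand-in for PM runs

    void merge(BufferStore& bs) {
        for (auto& set : bs.sorted_sets)
            for (auto& [k, v] : set) runs[k] = v;                          // newer values win
        bs.sorted_sets.clear();
    }
};
```

In the real system the logs, KV-sets, and runs live on PM and the indexes are concurrent structures; the sketch only captures the direction of data movement and the point at which DRAM is reclaimed.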
We implement a FluidKV prototype and evaluate it under a
variety of workloads. The results show that FluidKV outperforms
ListDB, a state-of-the-art PM-aware multi-stage indexing KVS, by
up to 9x and 3.8x in write and read throughput respectively while
cutting 90% of ListDB’s DRAM footprint. Compared to state-of-the-
art single-stage indexing KVSs, FluidKV also ensures a controllable
DRAM footprint with similar or higher read/write performance.
The contributions of this paper include:
• An in-depth experimental analysis of the performance/memory-footprint gap among representative KVS indexing mechanisms;
• A three-stage fast-flowing KVS architecture that effectively utilizes the I/O capacity and parallel-processing capability of ultra-fast storage;
• A write-optimized first stage (FastStore), an adaptive second stage for fast data migration (BufferStore), and a DRAM-footprint-cutting PM-aware backend third stage (StableStore);
• Evaluation of a FluidKV prototype against representative PM-aware KVSs demonstrating high write performance, low DRAM footprint, and acceptable read performance.
2 BACKGROUND AND ANALYSIS
In this section, we provide the necessary background for and an in-
depth analysis of the characteristics of emerging high-performance
storage devices and indexing mechanisms of existing key-value
stores (KVSs), which help reveal their performance pitfalls.
[Figure 1: Classification of KVS indexing mechanisms. (a) Single-stage indexing: a volatile index in DRAM over key-value data on PM. (b) Multi-stage indexing: an index and key-value data (Stage 0) in DRAM, flushed to key-value data on PM (Stage 1) and compacted through subsequent stages (Stage 2 ... Stage n).]
2.1 Ultra-Fast Storage
The rapid development of solid-state drive (SSD) and persistent
memory (PM) technologies has unceasingly advanced both stor-
age capacity and performance, especially accelerating bandwidth
with increasing parallelism. Different from traditional block devices
such as SSDs and hard-disk drives (HDDs), PM (e.g., PCM [52], STT-MRAM [2], Memristor [59], 3DXPoint [23], and Memory-Semantic SSDs [24, 25, 42, 60]) supports memory semantics (i.e., load/store) and accesses
byte-level small-sized data at GB/s-level I/O bandwidth
and 100ns-level latency. Because of this superior byte-addressability
and fast persistence, KVSs, as a fundamental building
block of modern data-processing infrastructure, leverage PM to accelerate intensive small-sized KV workloads [16, 27, 29, 61] that are
prevalent in industrial and commercial applications [5]. Nowadays,
PM not only works as main memory, which can be accessed via
the memory bus for low latency, but can also be attached to the PCIe
bus with the emerging CXL technology [1] for high scalability.
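As a concrete illustration of the load/store persistence path mentioned above, the following minimal sketch (our example, not taken from the paper) persists a small buffer to a PM-mapped region using regular stores followed by cache-line write-backs and a fence; it assumes an x86 CPU with CLWB support and a region already mapped, e.g., via DAX.

```cpp
// Minimal store-then-persist idiom on x86 PM (illustrative sketch).
// Assumes CLWB hardware support and -mclwb (or equivalent) at compile time.
#include <immintrin.h>
#include <cstdint>
#include <cstring>

constexpr std::uintptr_t kCacheLine = 64;

// Persist `len` bytes at `dst`: regular stores, then write back every
// touched cache line, then a fence to order/complete the write-backs.
inline void pm_persist_copy(void* dst, const void* src, std::size_t len) {
    std::memcpy(dst, src, len);                        // plain load/store path
    auto p = reinterpret_cast<std::uintptr_t>(dst) & ~(kCacheLine - 1);
    for (; p < reinterpret_cast<std::uintptr_t>(dst) + len; p += kCacheLine)
        _mm_clwb(reinterpret_cast<void*>(p));          // flush cache line toward PM
    _mm_sfence();                                      // enforce durability ordering
}
```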
2.2 Key-value Store Indexing
A KVS generally consists of indexes and persistent data. In this
paper, we focus on the index structure, which is the core of KVS
for accurate and quick access to the persistent data on storage. To
understand the impact of different indexing mechanisms on the
overall performance of a KVS, we first classify the existing KVS
indexes into two groups, i.e., single-stage indexing (e.g., B+-tree)
and multi-stage indexing (e.g., LSM-tree), as shown in Figure 1, and
identify their respective performance characteristics and pitfalls
(Observations 1~5) through experiments.
2.2.1 Single-stage indexing. A single-stage indexing KVS main-
tains a monolithic KV-grained index to precisely record the location
of each KV in the persistent data, as shown in Figure 1a. Common
single-stage indexes include range indexes (e.g., B-tree variants,
trie, and skiplist) and hash indexes. Considering that KVSs require
support for range queries, this paper focuses only on range indexes.
Most KVSs optimized for ultra-fast storage keep the whole index
in DRAM to prevent the indexing from becoming a bottleneck. For
example, KVell [32] adopts a large B+-tree index and page cache
in memory to ensure read performance on fast SSDs. Flatstore [9]
builds an efficient multi-log structure on PM for persistent KVs and
employs an existing volatile index for fast searching.
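The following sketch (illustrative only, not KVell's or Flatstore's actual code) captures the single-stage pattern these systems share: every record is appended to persistent storage, while a volatile DRAM index maps each key directly to its record's location.

```cpp
// Illustrative single-stage indexing pattern: a volatile, key-grained
// DRAM index over an append-only persistent log (stand-in containers).
#include <cstddef>
#include <map>
#include <optional>
#include <string>
#include <vector>

class SingleStageKVS {
    std::vector<std::string> pm_log_;                     // stand-in for an append-only PM log
    std::map<std::string, std::size_t> dram_index_;       // key-grained volatile range index

public:
    void put(const std::string& key, const std::string& value) {
        pm_log_.push_back(value);                         // persist the record (log append)
        dram_index_[key] = pm_log_.size() - 1;            // record its exact location in DRAM
    }

    std::optional<std::string> get(const std::string& key) const {
        auto it = dram_index_.find(key);                  // one index lookup, one data read
        if (it == dram_index_.end()) return std::nullopt;
        return pm_log_[it->second];
    }
};
```

The pitfall is visible in the sketch itself: the volatile index holds one entry per key, so its DRAM footprint grows linearly with the dataset, which is the memory cost quantified in the observations below.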
To demonstrate the impact of single-stage indexing on KVS performance, we test the read and write performance of Flatstore with
different numbers of user threads under workloads of 200M writes
and 200M reads, respectively. The key and value sizes are equal, at
8 bytes each (see §6.1 for the detailed experimental setup). As shown
in Figure 2, Masstree, which denotes Flatstore with a DRAM-only
B+-tree (i.e., Masstree [38]), achieves read and write throughputs