
consumption and memory capacity. Specifically, the snapshot
dependency chain is divided into batches, and each index depends
only on earlier indexes within its batch rather than on snapshots
outside the batch. By sharing subtrees within a batch, a data
block address can be obtained directly, which reduces the index
memory overhead. Moreover, prefetching only the indexes of a
batch, instead of all dependent snapshot indexes, significantly
reduces the number of indexes that must be loaded and thus the
recovery time.
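To make the batching idea concrete, the following C++ sketch (an
illustration under our own assumptions; SnapshotIndex, Batch, and
prefetch_for_recovery are placeholder names, not BBS's actual code)
shows how batch boundaries bound recovery work: restoring a snapshot
prefetches only the indexes from the start of its batch up to the
snapshot itself, never the whole dependency chain.

  #include <cstddef>
  #include <vector>

  struct SnapshotIndex;                     // one snapshot's index (details omitted)

  struct Batch {
      std::vector<SnapshotIndex*> indexes;  // indexes[0] is the batch's base index
  };

  // Recovering a snapshot loads at most |batch| indexes, because no
  // index depends on anything outside its batch.
  std::vector<SnapshotIndex*> prefetch_for_recovery(
          const std::vector<Batch>& batches,
          std::size_t batchId, std::size_t posInBatch) {
      const Batch& b = batches[batchId];
      return std::vector<SnapshotIndex*>(b.indexes.begin(),
                                         b.indexes.begin() + posInBatch + 1);
  }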
Consequently, we present a Batch-Based Snapshot index,
called BBS. BBS relies on two key techniques: Shared-Subtrees
Indexing and Batch-Based Dividing. The former defines two types
of indexes within a batch: the first snapshot index is a complete
B+ tree, and the subsequent snapshot indexes are partial B+
trees. The complete tree does not depend on any earlier index,
while the nodes of a partial tree point to previous indexes in
the same batch, effectively reducing redundancy. The latter
divides snapshots into batches according to the snapshot update
ratio and the recent average continuous access length. We
integrate BBS into Ceph RBD, enabling snapshots and continuous
recovery with low latency, and evaluate the working system under
various workloads.
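The following C++ sketch (a conceptual illustration under our own
assumptions, not the paper's implementation) captures the intent of
Shared-Subtrees Indexing: because a partial tree's internal nodes may
point directly into subtrees owned by earlier indexes of the same
batch, a single root-to-leaf descent yields the data block address
without restarting from another snapshot's root.

  #include <cstddef>
  #include <cstdint>
  #include <vector>

  struct Node {
      bool leaf = false;
      std::vector<uint64_t> keys;    // separator keys (internal) or block numbers (leaf)
      std::vector<Node*> children;   // internal node: a child may belong to an earlier index
      std::vector<uint64_t> addrs;   // leaf node: data block addresses, parallel to keys
  };

  // Single descent; shared subtrees are traversed transparently.
  uint64_t lookup(const Node* root, uint64_t lba) {
      const Node* n = root;
      while (!n->leaf) {
          std::size_t i = 0;
          while (i < n->keys.size() && lba >= n->keys[i]) ++i;
          n = n->children[i];        // may cross into a subtree shared with an earlier index
      }
      for (std::size_t i = 0; i < n->keys.size(); ++i)
          if (n->keys[i] == lba) return n->addrs[i];
      return UINT64_MAX;             // miss (definitive only for the complete tree)
  }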
To sum up, we make the following contributions:
• We propose BBS, a high-performance snapshot index that
can locate data blocks directly and balances recovery time
against index memory consumption.
• We design a batch-based snapshot deletion method adapted
to the sparse snapshot deletion mechanism, which effectively
speeds up snapshot deletion.
• We implement BBS on top of Ceph RBD. Our evaluation
shows that BBS achieves significant performance improvements
over state-of-the-art block storage systems.
The rest of this paper is organized as follows. Section II
presents the background. Section III introduces the motivation.
Section IV presents the design and implementation of BBS.
Section V presents the performance evaluation. Section VI
discusses related work, and Section VII concludes the paper.
II. BACKGROUND
A. Snapshot Implementations
Snapshot technologies mainly include CoW, RoW, and their
variants, which have been extensively compared and analyzed
[19]–[22]. Briefly, the double write mechanism used by CoW
introduces overhead on regular I/O and a dramatic increase
in sync operations [22], [23]. The advantage of CoW is that
snapshot recovery is fast, since only the original and current
snapshots are needed. Volume snapshots in Rackspace Cloud
Block Storage [24] and Google Cloud Compute Engine Persistent
Disk [25], as well as virtual disk snapshots in Microsoft Azure
[26], are CoW snapshots. Because the double write mechanism
introduces huge overhead under intensive snapshotting and
reduces the throughput of the main database system, we do not
consider CoW further.
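For concreteness, the following C++ sketch (our simplification, not
any vendor's code) shows where the double write arises: the first
overwrite of a snapshotted block must first preserve the old
contents, so every such foreground write costs two writes plus the
extra sync traffic noted above.

  #include <array>
  #include <cstdint>
  #include <unordered_map>
  #include <vector>

  using Block = std::array<uint8_t, 4096>;

  struct CowVolume {
      std::vector<Block> data;                       // live volume blocks
      std::unordered_map<uint64_t, Block> snapshot;  // preserved old versions

      void write(uint64_t lba, const Block& newData) {
          if (!snapshot.count(lba))
              snapshot.emplace(lba, data[lba]);  // write 1: copy old block to snapshot area
          data[lba] = newData;                   // write 2: overwrite in place
      }
  };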
For RoW-based snapshots, new data blocks are written to
blocks belonging to the current snapshot. RoW sacrifices the
contiguity of the original copy for lower-overhead updates, so
as the system captures extensive snapshots, data block
fragmentation becomes increasingly severe. When locating a data
block, the system first checks whether the block exists in the
current snapshot; otherwise, it searches earlier snapshots
through their root nodes until the data block is located. This
process inevitably increases the number of hops, and the
iterative search degrades performance because data blocks are
updated infrequently and the number of snapshots is massive.
Amazon EBS [1] has an optimized RoW snapshot index that cuts
off the index dependencies: it copies the entire previous
snapshot index before updating the data blocks. Besides, the
index uses a uniform region to store continuous unmodified data
blocks to reduce the number of nodes. However, this index is not
applicable in batch access mode, since the indexes are loaded
iteratively. Moreover, as the number of snapshots grows, each
region shrinks and the number of regions grows, so the index
degenerates into a complete index, which increases memory
consumption.
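The iterative lookup can be sketched as follows (assumed structures
for illustration; this is not Amazon EBS code): each snapshot records
only the blocks written during its lifetime, so a read may hop
backward through the chain until some ancestor holds the block.

  #include <cstdint>
  #include <optional>
  #include <unordered_map>

  struct RowSnapshot {
      const RowSnapshot* parent = nullptr;             // earlier snapshot in the chain
      std::unordered_map<uint64_t, uint64_t> written;  // lba -> data block address

      std::optional<uint64_t> locate(uint64_t lba) const {
          // One hop per ancestor checked; long chains mean many hops.
          for (const RowSnapshot* s = this; s != nullptr; s = s->parent) {
              auto it = s->written.find(lba);
              if (it != s->written.end()) return it->second;
          }
          return std::nullopt;
      }
  };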
B. Snapshot Support at Various Layers
TABLE I: Summary of existing methods for database backup
and recovery (FS: file system, DB: database)

Method         Mysqldump  Xtrabackup  BTRFS  LVM    CoW&RoW
Layer          DB         DB          FS     Block  Cloud
Backup speed   slow       slow        fast   fast   fast
Restore speed  slow       medium      fast   fast   slow
Snapshot technologies can be implemented at various layers,
as shown in Table I. At the database layer, an administrator
can choose Mysqldump [27] for logical backup or Percona
Xtrabackup [28] for physical backup. A logical backup stores
the queries executed by transactions, while a physical backup
copies the raw data to storage. During recovery, the stored
queries are re-executed, or the backup data is copied back into
the database directory. However, recovery through database
queries involves heavy I/O operations. Xtrabackup does not
re-execute transactions and only copies the original data, so it
restores faster than Mysqldump, although this backup approach
incurs significant storage overhead.
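The two restore paths can be contrasted with a toy C++ sketch (our
illustration; execute_sql is a hypothetical stand-in for the query
engine): a logical restore pushes every saved statement back through
the engine, while a physical restore is a raw file copy.

  #include <filesystem>
  #include <fstream>
  #include <string>

  // Hypothetical stand-in for the database's query engine.
  void execute_sql(const std::string& stmt) { /* replay one statement */ }

  void logical_restore(const std::string& dumpFile) {
      std::ifstream in(dumpFile);
      std::string stmt;
      while (std::getline(in, stmt, ';'))
          execute_sql(stmt);  // every statement re-runs through the engine: heavy I/O
  }

  void physical_restore(const std::filesystem::path& backupDir,
                        const std::filesystem::path& dataDir) {
      // Raw copy, no transaction replay: faster restore, but the backup
      // itself occupies full-size storage.
      std::filesystem::copy(backupDir, dataDir,
                            std::filesystem::copy_options::recursive |
                            std::filesystem::copy_options::overwrite_existing);
  }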
Snapshot technologies can also be implemented at the file
system or the block layer. Each snapshot consists of a separate
tree, but a snapshot may share subtrees with other snapshots.
When a user takes a snapshot of a volume, the system simply
duplicates the root node of the original volume as the root of
the snapshot's new tree and points the new root at the same
children as the original volume. Although creation overhead is
light, the first write to a block is expensive because the
metadata nodes on the path to that block must be copied and
linked up to the root node of the snapshot. LVM [29] operates
between the file system and the storage device and provides
fast snapshot creation and restoration using CoW. However, the
CoW approach hurts run-time performance since it performs
redundant writes for data copies.
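The tree-sharing scheme can be sketched as follows (a simplified
illustration, not the actual on-disk structures of BTRFS or LVM):
taking a snapshot copies only the root node, while the first write
after a snapshot must copy every metadata node on the path down to
the modified block.

  #include <cstddef>
  #include <cstdint>
  #include <memory>
  #include <vector>

  struct Node {
      std::vector<std::shared_ptr<Node>> children;  // shared across snapshot trees
      uint64_t payload = 0;                         // stand-in for block metadata
  };

  // O(1) snapshot: duplicate only the root; all children stay shared.
  std::shared_ptr<Node> take_snapshot(const std::shared_ptr<Node>& volumeRoot) {
      return std::make_shared<Node>(*volumeRoot);
  }

  // First write after a snapshot: copy every metadata node on the path
  // (given as one child slot per level) before the leaf can diverge.
  std::shared_ptr<Node> write_path(const std::shared_ptr<Node>& root,
                                   const std::vector<std::size_t>& path,
                                   uint64_t value) {
      auto newRoot = std::make_shared<Node>(*root);
      Node* cur = newRoot.get();
      for (std::size_t slot : path) {
          auto copy = std::make_shared<Node>(*cur->children[slot]);  // CoW this node
          cur->children[slot] = copy;
          cur = copy.get();
      }
      cur->payload = value;  // the tree diverges only along the copied path
      return newRoot;
  }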