Top database conference VLDB 2021 Best Industrial Paper Award: "RAMP-TAO Layering Atomic Transactions on Facebook's Online TAO Data Store" - Audrey Cheng.pdf
RAMP-TAO: Layering Atomic Transactions on Facebook’s
Online TAO Data Store
Audrey Cheng
UC Berkeley
Berkeley, USA
accheng@berkeley.edu
Xiao Shi
Facebook, Inc.
Cambridge, USA
xshi@fb.com
Lu Pan
Facebook, Inc.
Cambridge, USA
lupan@fb.com
Anthony Simpson
Facebook, Inc.
Cambridge, USA
asimpson96@fb.com
Neil Wheaton
Facebook, Inc.
Cambridge, USA
neilwheaton@fb.com
Shilpa Lawande
Facebook, Inc.
Cambridge, USA
slawande@fb.com
Nathan Bronson
Rockset
Cambridge, USA
ngbronson@rockset.com
Peter Bailis
Sisu Data
San Francisco, USA
peter@sisudata.com
Natacha Crooks
UC Berkeley
Berkeley, USA
ncrooks@berkeley.edu
Ion Stoica
UC Berkeley
Berkeley, USA
istoica@berkeley.edu
ABSTRACT
Facebook’s graph store TAO, like many other distributed data stores, traditionally prioritizes availability, efficiency, and scalability over strong consistency or isolation guarantees to serve its large, read-dominant workloads. As product developers build diverse applications on top of this system, they increasingly seek transactional semantics. However, providing advanced features for select applications while preserving the system’s overall reliability and performance is a continual challenge. In this paper, we first characterize developer desires for transactions that have emerged over the years and describe the current failure-atomic (i.e., write) transactions offered by TAO. We then explore how to introduce an intuitive read transaction API. We highlight the need for atomic visibility guarantees in this API with a measurement study on potential anomalies that occur without stronger isolation for reads. Our analysis shows that 1 in 1,500 batched reads reflects partial transactional updates, which complicate the developer experience and lead to unexpected results. In response to our findings, we present the RAMP-TAO protocol, a variation based on the Read Atomic Multi-Partition (RAMP) protocols that can be feasibly deployed in production with minimal overhead while ensuring atomic visibility for a read-optimized workload at scale.
PVLDB Reference Format:
Audrey Cheng, Xiao Shi, Lu Pan, Anthony Simpson, Neil Wheaton, Shilpa
Lawande, Nathan Bronson, Peter Bailis, Natacha Crooks, Ion Stoica.
RAMP-TAO: Layering Atomic Transactions on Facebook’s Online TAO
Data Store. PVLDB, 14(12): 3014-3027, 2021.
doi:10.14778/3476311.3476379
Work done while at Facebook.
This work is licensed under the Creative Commons BY-NC-ND 4.0 International License. Visit https://creativecommons.org/licenses/by-nc-nd/4.0/ to view a copy of this license. For any use beyond those covered by this license, obtain permission by emailing info@vldb.org. Copyright is held by the owner/author(s). Publication rights licensed to the VLDB Endowment.
Proceedings of the VLDB Endowment, Vol. 14, No. 12 ISSN 2150-8097.
Figure 1: By separating availability and replication concerns from stronger isolation guarantees, we can maintain high performance while ensuring safety properties over data.
1 INTRODUCTION
TAO is a read-optimized, geo-distributed data store that provides online social graph access for diverse product applications and other backend systems at Facebook [21]. Each application-level request can result in hundreds of reads and writes to TAO. In aggregate, TAO serves over ten billion reads and tens of millions of writes per second on a changing data set of many petabytes. Like many other large storage systems [2, 22, 23, 26, 41], TAO prioritizes availability, read latency, and efficiency over strong data consistency and isolation guarantees [45] for its demanding, read-dominant workload.
Originally, TAO focused on simple accesses to nodes and edges in the social graph and provided no transactional APIs, reflecting its goal of supporting a small feature set with high availability at massive scale. However, as applications shifted from directly accessing TAO to a higher-level query framework, Ent, which makes it easy to express complex operations over the graph, product developers have increasingly desired transactional semantics to avoid having to handle partial failures. In response, TAO engineers implemented failure-atomic write transactions (Section 2).
Similarly, adding support for read transactions would greatly simplify the developer experience by directly enforcing application-level invariants. Currently, under TAO’s eventual consistency, a naïve batched query can observe fractured reads [17]: reads that capture only part of a write transaction’s updates before these updates are fully replicated. As we demonstrate in the first large-scale measurement study of its kind, these anomalies occur in 1 out of every 1,500 read batches (Section 3). Given the size of TAO’s workload, this relatively modest rate is significant in practice. Moreover, fractured reads are hard to detect in the application layer with an asynchronously replicated system. These anomalies are burdensome for developers to reason about and explicitly handle in order to minimize end user impact.
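To make the anomaly concrete, the following minimal Python sketch simulates a fractured read. It is an illustration under stated assumptions, not TAO code: the `Replica` class, the keys `x` and `y`, and the `pending` replication queue are all hypothetical, and asynchronous replication is modeled simply as a list of per-shard updates applied one at a time.

```python
from dataclasses import dataclass, field


@dataclass
class Replica:
    """Toy, asynchronously updated copy of one shard (hypothetical model)."""
    data: dict = field(default_factory=dict)

    def apply(self, key, value):
        self.data[key] = value

    def get(self, key):
        return self.data.get(key)


# Two shards, each holding one of the keys written by the same transaction.
shard_a = Replica({"x": "old"})
shard_b = Replica({"y": "old"})

# A write transaction updates both keys; replication delivers the updates
# to each shard's replica independently.
pending = [(shard_a, "x", "new"), (shard_b, "y", "new")]

# Only the first update has been replicated when the batched read arrives.
replica, key, value = pending.pop(0)
replica.apply(key, value)

batched_read = {"x": shard_a.get("x"), "y": shard_b.get("y")}
# The batch mixes pre- and post-transaction state: a fractured read.
fractured = len(set(batched_read.values())) > 1

# The remaining update arrives later, so the anomaly is transient.
replica, key, value = pending.pop(0)
replica.apply(key, value)
later_read = {"x": shard_a.get("x"), "y": shard_b.get("y")}
```

In this toy model the same batch returns a consistent result once the second update is applied, which is consistent with the observation above that such anomalies are hard to detect in the application layer.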
By providing atomic visibility [17], or the guarantee that reads observe either all or none of a transaction’s operations, we can introduce a simple and intuitive read transaction API on TAO. However, enabling these semantics presents significant challenges in practice. Due to TAO’s scale and storage constraints, we want to avoid coordination and minimize memory overhead. Our implementation of atomic visibility must also be cache-friendly, hot-spot tolerant, and extensible to different data stores. Moreover, we should only incur overhead for applications that opt in (rather than requiring every application to pay a performance penalty). Although we focus on TAO in this work, these challenges apply to many other large-scale, read-optimized systems [2, 41, 49].
In this paper, we introduce a new RAMP-TAO protocol (Section 4), which layers atomic visibility on top of TAO while achieving our performance goals above. While our work is inspired by the Read Atomic Multi-Partition (RAMP) protocols [17], we address several of their drawbacks. The original RAMP transactions impose atomic visibility verification overhead on all reads, require substantial metadata, and assume full support for multiversioning (which TAO lacks). RAMP-TAO leverages the key insight that we only need to guard against fractured reads for recent, transactionally-updated data to reduce the overheads of ensuring atomic visibility.
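One way to see where the verification overhead and metadata of the original protocols come from is the first-round check of the RAMP-Fast variant described in [17]: every version carries the timestamp of its writing transaction together with the set of sibling keys written by that transaction, and a reader compares the versions in its read set against this metadata. The sketch below illustrates only that detection step. It is not the RAMP-TAO protocol, and `Version`, `find_fractured`, and the field names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Version:
    value: str
    ts: int                # timestamp of the writing transaction
    siblings: frozenset    # other keys written by the same transaction


def find_fractured(result):
    """Return {key: required_ts} for every key whose returned version is
    older than a sibling write observed elsewhere in the same read set."""
    required = {}
    for version in result.values():
        for sibling in version.siblings:
            if sibling in result:
                required[sibling] = max(required.get(sibling, 0), version.ts)
    return {key: ts for key, ts in required.items() if result[key].ts < ts}


# Transaction 2 wrote both x and y, but the reader saw only its write to x.
partial = {
    "x": Version("new", 2, frozenset({"y"})),
    "y": Version("old", 1, frozenset()),
}
stale = find_fractured(partial)

# Both writes of transaction 2 are visible, so nothing is flagged.
complete = {
    "x": Version("new", 2, frozenset({"y"})),
    "y": Version("new", 2, frozenset({"x"})),
}
clean = find_fractured(complete)
```

In RAMP-Fast, a reader would then fetch exactly the versions named in the returned map in a second round, which is why that scheme relies on multiversioned storage and pays the metadata cost on every read.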
Our layering strategy takes the “bolt-on” [18] approach to stronger transactional guarantees (Figure 1). We prevent transactions from interfering with TAO’s availability, durability, and replication processes, retaining the reliability and efficiency of the system. Only applications that need stronger guarantees incur the resulting performance costs. Furthermore, our protocol exploits existing cache infrastructure and requires minimal changes to TAO internals. Thus, RAMP-TAO is effective both for providing default guarantees across data stores and as a retroactive optimization for massive, read-optimized systems, many of which have sought to strengthen their isolation models [1, 4, 5]. We also describe an optimization of our protocol for bidirectional associations (paired edges in the graph), which represent a special case of failure-atomic transactions. These data structures are ubiquitous in TAO, so any protocol providing atomic visibility needs to be especially efficient for them.
We demonstrate that RAMP-TAO is feasible for production use by benchmarking its latency and memory overheads (Section 5). Our prototype implementation provides atomic visibility in a read-optimized environment with one round trip for greater than 99.93% of reads and a modest 0.42% increase in memory overhead.
Figure 2: Subgraph for a hypothetical example.
In summary, we make the following contributions in this paper:
• We report on developer challenges and needs for transactional semantics within Facebook’s social graph serving ecosystem (Section 2).
• We present a quantitative study of atomic visibility violations derived from production data to demonstrate the importance of providing this guarantee in a read transaction API.
• We describe a novel RAMP-TAO protocol (Section 4) to efficiently provide atomic visibility for an eventually consistent, read-optimized system.
• We demonstrate the production feasibility of RAMP-TAO by showing it incurs minimal overhead and requires only one round trip for the vast majority of reads (Section 5).
2 OVERVIEW AND MOTIVATION
TAO provides online access to the social graph at Facebook [21]. It is implemented using two layers of graph-aware caches that mediate access to the statically-sharded MySQL database [35]. Updates are replicated asynchronously via the Wormhole pub-sub system [44].
TAO prioritizes low-latency, scalable operations and thus opts for weaker consistency and isolation models to serve its demanding, read-dominant workloads. TAO provides point get, range, and count queries, as well as operations to create, update, and delete objects (nodes) and associations (edges). Its simple graph API is conducive to maintaining high reliability and is sufficient for the vast majority of applications at Facebook. As new applications emerge and as Ent, our higher-layer query framework, evolves, we have been exploring, designing, and implementing additional features such as transactions while preserving TAO’s reliability and efficiency.
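The shape of such a graph API can be sketched with a small in-memory store. This is a toy model under stated assumptions, not TAO’s actual interface: the class `ToyGraphStore`, its method names, the `time` field, and the newest-first ordering of range queries are all hypothetical choices made for illustration.

```python
from collections import defaultdict


class ToyGraphStore:
    """In-memory stand-in for a store of objects (nodes) and associations (edges)."""

    def __init__(self):
        self.objects = {}                  # object id -> attribute dict
        self.assocs = defaultdict(dict)    # (src id, edge type) -> {dst id: time}

    def create_object(self, oid, **attrs):
        self.objects[oid] = dict(attrs)

    def get_object(self, oid):
        """Point get of a node; returns None if the node does not exist."""
        return self.objects.get(oid)

    def add_assoc(self, src, etype, dst, time):
        self.assocs[(src, etype)][dst] = time

    def delete_assoc(self, src, etype, dst):
        self.assocs[(src, etype)].pop(dst, None)

    def assoc_range(self, src, etype, offset=0, limit=10):
        """Range query over one edge list, ordered newest first (an assumption)."""
        edges = self.assocs.get((src, etype), {})
        ordered = sorted(edges.items(), key=lambda item: -item[1])
        return [dst for dst, _ in ordered[offset:offset + limit]]

    def assoc_count(self, src, etype):
        """Count query over one edge list."""
        return len(self.assocs.get((src, etype), {}))


store = ToyGraphStore()
store.create_object("alice", kind="user")
store.create_object("song1", kind="media")
store.create_object("song2", kind="media")
store.add_assoc("alice", "recorded", "song1", time=1)
store.add_assoc("alice", "recorded", "song2", time=2)

newest_first = store.assoc_range("alice", "recorded")
count_before = store.assoc_count("alice", "recorded")
store.delete_assoc("alice", "recorded", "song1")
count_after = store.assoc_count("alice", "recorded")
```

Each call here touches a single object or a single edge list, which is the kind of narrow operation the paragraph above describes; nothing in this sketch groups several writes into one atomic unit.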
In this section, we explain the types of transactional guarantees developers desire and highlight corner cases they need to handle before system-level options are offered. We then describe the current approaches to providing failure atomicity. Finally, we demonstrate the importance of atomic visibility for an intuitive read transaction API and considerations for providing stronger guarantees at scale.
2.1 An example
Consider a hypothetical social media product built on top of TAO, with user nodes, media nodes, edges when a user has composed a piece of sheet music, and edges when a user has recorded a song. This product enables musicians to share their sheet music and corresponding recordings together. Let us say that Alice wants to share a piece of music she has composed and recorded so that others can view the sheet music while listening to the recording. The application writes the following edges together (the resulting subgraph is shown in Figure 2):