A Drop-in Middleware for Serializable DB Clustering
across Geo-distributed Sites
Enrique Saurez¹, Bharath Balasubramanian², Richard Schlichting³, Brendan Tschaen², Shankaranarayanan Puzhavakath Narayanan², Zhe Huang², Umakishore Ramachandran¹
¹Georgia Institute of Technology, ²AT&T Labs - Research, ³United States Naval Academy
esaurez@gatech.edu, bharathb@research.att.com, schlicht@usna.edu, bt054f@att.com, {snarayanan, zhehuang}@research.att.com, rama@gatech.edu
ABSTRACT
Many geo-distributed services at web-scale companies still rely on databases (DBs) primarily optimized for single-site performance. At AT&T this is exemplified by services in the network control plane that rely on third-party software using DBs like MariaDB and PostgreSQL, which do not provide strict serializability across sites without a significant performance impact. Moreover, it is often impractical for these services to re-purpose their code to use newer DBs optimized for geo-distribution. In this paper, we present Metric, a novel drop-in solution for DB clustering across sites that can be used by services without changing a single line of code. Metric leverages the single-site performance of an existing service's DB and combines it with a cross-site clustering solution based on an entry-consistent redo log that is specifically tailored for geo-distribution. Detailed correctness arguments are presented, and extensive evaluations with various benchmarks show that Metric outperforms other solutions for the access patterns in our production use cases, where service replicas access different tables on different sites. In particular, Metric achieves up to 56% less latency and 5.2x higher throughput than MariaDB and PostgreSQL clustering, and up to 90% less latency and 26x higher throughput than CockroachDB and TiDB, systems that are designed to support geo-distribution.
PVLDB Reference Format:
Enrique Saurez, Bharath Balasubramanian, Richard Schlichting, Brendan Tschaen, Shankaranarayanan Puzhavakath Narayanan, Zhe Huang, Umakishore Ramachandran. A Drop-in Middleware for Serializable DB Clustering across Geo-distributed Sites. PVLDB, 13(12): 3340-3353, 2020.
DOI: https://doi.org/10.14778/3415478.3415555
1. INTRODUCTION
Services built by AT&T and other web-scale companies such as Google and Amazon are often deployed across geo-distributed sites to satisfy the locality, availability, and performance needs of clients.¹ However, many of these services use databases (DBs) like MariaDB [54] and PostgreSQL [50] that are primarily optimized for single-site deployments even when they have clustering solutions. For example, in MariaDB Galera [28] synchronous clustering [29], all replicas are updated on each commit, which is prohibitively expensive across sites with WAN latencies on the order of hundreds of milliseconds. Similarly, in PostgreSQL master-slave [49] clustering, requests from all sites are sent to a single master replica, compromising performance and availability.

This work is licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License. To view a copy of this license, visit http://creativecommons.org/licenses/by-nc-nd/4.0/. For any use beyond those covered by this license, obtain permission by emailing info@vldb.org. Copyright is held by the owner/author(s). Publication rights licensed to the VLDB Endowment.
Proceedings of the VLDB Endowment, Vol. 13, No. 12
ISSN 2150-8097.
DOI: https://doi.org/10.14778/3415478.3415555
Although new geo-distributed DBs have been developed that improve the performance of cross-site transactionality (e.g., Spanner [19], CockroachDB [17], TiDB [48]), it is often impractical to re-purpose the code of services tied to specific DBs to use these new solutions. This is especially true when the existing service involves third-party software. For example, AT&T's multi-site Service Orchestrator (SO [42]) that deploys complex virtual network functions (VNFs) relies on the third-party business process engine Camunda [8], which maintains state in MariaDB. Similarly, AT&T's Data Collection, Analytics and Events service (DCAE [21]) relies on a third-party tool called Cloudify [16] that uses PostgreSQL.² While middleware for DB clustering does exist, it does not provide multi-master strict serializability [15, 34] and/or requires extensive annotation of service code [30].
In this paper, we present a novel solution called Metric that serves as a replacement for existing DB clustering solutions. The primary challenge in designing such a system is to satisfy all of the following goals simultaneously:

- Require no changes or annotations of the service code or its DB aside from turning off the latter's default clustering solution across sites; in other words, it should serve as a drop-in solution.
- Provide the service or user of the middleware the abstraction of a replicated multi-master DB across sites, where all replicas can concurrently process requests.
- Guarantee that all transactions are strictly serializable [47, 33].
- Build a system that outperforms a service's existing DB clustering solution, in terms of the end-to-end latency for transaction execution and throughput.
- Support new DBs easily with minimal additions or changes to the middleware.

Figure 1: Overview of a Metric deployment, where instances of Metric and the SQL DB are deployed on each site with a geo-distributed redo log deployed across the sites. Service replicas issue requests to the Metric process closest to them.
Metric achieves these goals through a novel design that leverages the single-site guarantees of the service's existing DB, coupled with the use of an entry-consistent (EC) key-value store [5, 3] to maintain a geo-distributed redo log of DB records. The EC store provides critical functionality in the form of fault-tolerant lock-based critical sections that are used by Metric to obtain table-level locks, which guarantee exclusive access to the latest values of records accessed by a given transaction. Figure 1 illustrates this approach, where a Metric process executes the operations in a transaction locally on a DB, with just one round trip per transaction across sites to commit modified records in the EC log. This is not only much more efficient operationally than both MariaDB synchronous and PostgreSQL master-slave clustering, but is also similar to optimized geo-distributed DBs [19, 17], despite the drop-in nature of the Metric solution. Further, Metric makes effective use of the EC store's higher-level abstraction of failure-handling critical sections to build its redo log. The above-mentioned geo-distributed DBs design their redo logs from first principles, a complex and error-prone process, especially considering the wider array of failures in geo-distributed systems.
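To make the commit path concrete, the sketch below models one transaction over an EC store whose redo log is keyed by table, so that holding a table's lock grants exclusive access to that table's latest records. The `EcStore` interface, its method names, and the in-memory stand-in are all hypothetical illustrations of the mechanism described above, not Metric's actual API (the real store replicates the log across sites and recovers locks held by failed processes):

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical EC-store API: a fault-tolerant, lock-based critical
// section per key, with one key per table.
interface EcStore {
    void lock(String tableKey);                      // enter critical section
    Map<String, String> latest(String tableKey);     // latest redo-log records
    void commitAndUnlock(String tableKey, Map<String, String> modified);
}

// Single-process stand-in for demonstration only.
class InMemoryEcStore implements EcStore {
    private final Map<String, ReentrantLock> locks = new HashMap<>();
    private final Map<String, Map<String, String>> log = new HashMap<>();

    public void lock(String k) {
        ReentrantLock l;
        synchronized (this) {
            l = locks.computeIfAbsent(k, x -> new ReentrantLock());
        }
        l.lock();
    }

    public Map<String, String> latest(String k) {
        return new HashMap<>(log.getOrDefault(k, new HashMap<>()));
    }

    public void commitAndUnlock(String k, Map<String, String> modified) {
        log.computeIfAbsent(k, x -> new HashMap<>()).putAll(modified);
        locks.get(k).unlock();
    }
}

public class CommitPath {
    // One transaction: take the table-level lock, sync with the latest
    // records, run the local-DB work (elided), then publish the modified
    // records to the redo log in a single commit step.
    public static Map<String, String> run(EcStore store, String table,
                                          Map<String, String> writes) {
        store.lock(table);                              // exclusive table access
        Map<String, String> view = store.latest(table); // refresh local DB
        view.putAll(writes);                            // local execution
        store.commitAndUnlock(table, writes);           // one cross-site round trip
        return view;
    }

    public static void main(String[] args) {
        EcStore store = new InMemoryEcStore();
        Map<String, String> w = new HashMap<>();
        w.put("order:1", "shipped");
        Map<String, String> view = run(store, "orders", w);
        System.out.println(view.get("order:1"));        // prints: shipped
    }
}
```

The key point the sketch captures is that only the final `commitAndUnlock` step needs to cross sites; all reads and writes within the critical section execute against the local DB.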
A key aspect of Metric's drop-in solution is that it supports general SQL queries, parsing each query to automatically determine the potentially impacted tables over which to acquire table-level locks. For our use cases such as DCAE and SO, transactions are naturally partitioned across service replicas and have no overlap in the DB records they access, meaning that a given table is usually accessed only by processes within a specific site. For example, SO replicas typically deploy different VNFs, with each replica modifying records in distinct DB schemas. An SO replica requires access to another replica's records only when the latter fails, in order to complete VNF deployments. For this common usage pattern Metric achieves optimal performance.
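The table-extraction step can be approximated with basic SQL parsing; the sketch below uses a simple regular expression over FROM/JOIN/UPDATE/INTO clauses. The class and method names are illustrative, not Metric's actual parser, and a production parser would also handle subqueries, quoting, and aliases:

```java
import java.util.LinkedHashSet;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustrative sketch: derive the set of tables a statement may touch,
// so that table-level locks can be acquired before execution.
public class TableExtractor {
    // Matches the identifier following FROM, JOIN, UPDATE, or INTO.
    private static final Pattern TABLE_REF = Pattern.compile(
        "\\b(?:FROM|JOIN|UPDATE|INTO)\\s+([A-Za-z_][A-Za-z0-9_.]*)",
        Pattern.CASE_INSENSITIVE);

    public static Set<String> impactedTables(String sql) {
        Set<String> tables = new LinkedHashSet<>();
        Matcher m = TABLE_REF.matcher(sql);
        while (m.find()) {
            tables.add(m.group(1).toLowerCase());
        }
        return tables;
    }

    public static void main(String[] args) {
        System.out.println(impactedTables(
            "SELECT o.id FROM orders o JOIN customers c ON o.cid = c.id"));
        // prints: [orders, customers]
    }
}
```

Because locking is at table granularity, over-approximating the impacted set is safe for correctness and only costs performance, which is why lightweight parsing suffices.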
Metric is implemented in Java with support for MariaDB and PostgreSQL [25]. Services use the middleware by replacing their existing JDBC driver [62] with the Metric JDBC driver for their DB of choice. Through the use of SQL triggers and basic SQL parsing, we ensure that support for a new DB can be added with less than 1000 LOC, consisting mainly of boilerplate code for initialization, trigger management, and the mapping of DB data types to Java data types.
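Concretely, the drop-in property amounts to swapping the driver class and connection string in the service's configuration; the fragment below is illustrative only, since the paper does not give the Metric driver's actual class name or JDBC URL scheme:

```java
// Before: the service uses the stock MariaDB JDBC driver.
Connection c = DriverManager.getConnection(
    "jdbc:mariadb://localhost:3306/camunda", user, pass);

// After: same service code, but the Metric JDBC driver intercepts the
// connection. The driver class and URL scheme below are hypothetical.
Class.forName("org.metric.jdbc.Driver");   // hypothetical driver class
Connection c2 = DriverManager.getConnection(
    "jdbc:metric:mariadb://localhost:3306/camunda", user, pass);
```

No other service code changes, which is what allows third-party software like Camunda or Cloudify to be clustered without modification.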
We evaluated Metric with strict serializability in multi-site settings across different WAN latency profiles using micro-benchmarks, use-case workloads, and TPC-C workloads. For the access patterns described above, where service replicas access different tables on different sites, Metric achieves up to 56% less latency and 5.2x higher throughput than MariaDB's Galera synchronous clustering solution and PostgreSQL's master-slave clustering solution. Metric also outperforms DBs optimized for geo-distribution on the same access patterns, and achieves up to 90% less latency and 26.2x higher throughput than CockroachDB and TiDB. We also evaluated Metric for access patterns that deviate from the expected workload, where service replicas frequently access the same tables across sites. As expected, Metric's performance drops relative to the other solutions mentioned above, which demonstrate up to 90% less latency and 32x higher throughput than Metric. We present several mitigation strategies in §9 as future work.
In summary, this paper makes these contributions:

- A novel approach to providing drop-in DB clustering across sites, supported by detailed correctness arguments showing strict serializability (§3, §4).
- An implementation [25] with clustering support for MariaDB and PostgreSQL that is being deployed in production for multiple use cases and that is open-sourced through ONAP (§5, §6).
- Experimental results validating Metric's effectiveness (§7).
A previous Metric workshop paper oriented towards edge use cases [57] presents some of the initial design ideas related to the system. These include how the ownership API can be exposed to a client or service, and an approach for guaranteeing transactionality only to the owner of certain tables in the DB. While we retain the name for legacy reasons,³ this paper significantly extends these concepts to encompass strict serializability guarantees, and presents complete correctness arguments, details of an implementation, and an experimental evaluation.
2. ARCHITECTURE AND OVERVIEW
Architecture. Metric provides the abstraction of a replicated geo-distributed DB that can be accessed by applications implementing higher-level services. As shown in Figure 1, each application is generally composed of multiple service replicas that are hosted on different sites for locality, availability, and fault-tolerance. Metric itself is also geo-distributed, with one replica per site. Service replicas submit transactions to the closest Metric process, usually at the same site and often on the same machine. A Metric process is in turn associated with an instance of a SQL DB, referred to as the Metric process's local DB. Currently, a given multi-site Metric deployment supports a single type of SQL DB (e.g., MariaDB, PostgreSQL), where the choice is based on application requirements. Each local DB contains at least the records accessed by transactions submitted to the Metric process at that site. The DB must support strict serializability, at least for transactions within the same node.⁴ To avoid conflicts and optimize performance, any internal cross-site clustering facility provided by the DB is disabled (e.g., Galera clustering for MariaDB). However, the DBs can use their clustering solution within a site as long as that solution provides strict serializability for transactions.
³The name is no longer considered an acronym, however.
⁴§9 describes strategies to relax this assumption.