
OceanBase: A 707 Million tpmC Distributed Relational Database
System
Zhenkun Yang, Chuanhui Yang, Fusheng Han, Mingqiang Zhuang, Bing Yang, Zhifeng Yang,
Xiaojun Cheng, Yuzhong Zhao, Wenhui Shi, Huafeng Xi, Huang Yu, Bin Liu, Yi Pan, Boxue Yin,
Junquan Chen, Quanqing Xu
OceanBase
OceanBaseLabs@list.alibaba-inc.com
ABSTRACT
Over the past decade, we have designed and developed OceanBase, a
distributed relational database system, from the ground up. OceanBase
is a scale-out, multi-tenant system built on a shared-nothing
architecture, and it is cross-region fault tolerant. Besides
sharing many goals with other distributed DBMS, such
as horizontal scalability and fault tolerance, our design has been
driven by the demands of typical RDBMS compatibility as well
as both on-premise and off-premise deployments. OceanBase has
fulfilled its design goals. It implements the salient features of certain
mainstream classical RDBMS, and most applications on them
can run on OceanBase with at most a few minor modifications.
Tens of thousands of OceanBase servers have been deployed in
Alipay.com as well as many other commercial organizations. It has
also successfully passed the TPC-C benchmark test and seized the
first place with more than 707 million tpmC. This paper presents
the goals, design criteria, infrastructure, and key components of
OceanBase, including its engines for storage and transaction processing.
Further, it details how OceanBase achieves the above leading
TPC-C benchmark result in a distributed cluster with more than 1,500
servers from 3 zones. It also describes the lessons we have learnt
in building OceanBase for more than a decade.
PVLDB Reference Format:
Zhenkun Yang, Chuanhui Yang, Fusheng Han, Mingqiang Zhuang, Bing
Yang, Zhifeng Yang, Xiaojun Cheng, Yuzhong Zhao, Wenhui Shi, Huafeng
Xi, Huang Yu, Bin Liu, Yi Pan, Boxue Yin, Junquan Chen, Quanqing Xu.
OceanBase: A 707 Million tpmC Distributed Relational Database System.
PVLDB, 15(12): 3385 - 3397, 2022.
doi:10.14778/3554821.3554830
PVLDB Artifact Availability:
The source code, data, and/or other artifacts have been made available at
https://github.com/oceanbase/obdeploy.
1 INTRODUCTION
Strong transactional guarantees, the relational model, and the highly
expressive Structured Query Language (SQL) make the Relational Database
Management System (RDBMS) the crucial information infrastructure
of the majority of business systems. For the last three
decades, the development of Internet platforms has facilitated
flourishing global businesses, such as Alipay.com, Amazon.com,
and Taobao.com, that serve the general populace instead of a
single organization. Classical centralized RDBMS are not capable
of meeting the scalability, cross-region fault tolerance, and
cost-effectiveness requirements of these businesses.
This work is licensed under the Creative Commons BY-NC-ND 4.0 International
License. Visit https://creativecommons.org/licenses/by-nc-nd/4.0/ to view a copy of
this license. For any use beyond those covered by this license, obtain permission by
emailing info@vldb.org. Copyright is held by the owner/author(s). Publication rights
licensed to the VLDB Endowment.
Proceedings of the VLDB Endowment, Vol. 15, No. 12 ISSN 2150-8097.
doi:10.14778/3554821.3554830
We launched the design and development of OceanBase [6, 7], a
commodity hardware-based distributed relational database system
built from the ground up, in May 2010. OceanBase was first used in
2011 for the Favorite of Taobao.com [3], a service similar to the Wish
List of Amazon.com [11]. Thereafter, it was adopted by Alipay.com in
2014, by Zhejiang E-Commerce Bank in 2015, and by many other
commercial banks, insurance companies, and other organizations
for communication and energy applications.
This paper first presents the detailed design goals and criteria,
system architecture, SQL engine, and multi-tenancy of OceanBase in
§2. Second, it presents an LSM-tree-based [35] storage engine, and
discusses the asymmetric read and write design, daily incremental
major compaction, and replica types in §3. Third, in §4, it presents
the transaction processing engine, including the timestamp service,
transaction processing, isolation levels, and replicated tables
in OceanBase. Fourth, §5 describes the TPC-C benchmark
test of OceanBase that we performed in 2020. §6 presents lessons learnt in building
OceanBase. §7 provides a brief review of the related work. Finally,
we conclude our work in §8. We briefly list our contributions in the
following items.
• We have built OceanBase, a distributed relational database,
from the very basics since 2010. As a scale-out multi-tenant
system, OceanBase is cross-region fault tolerant, and it adopts
the shared-nothing architecture. If a minority of the nodes
fail, its RPO (Recovery Point Objective) is zero, and its
RTO (Recovery Time Objective) is less than 30 seconds.
• We present an LSM-tree-based storage engine, which, after
multiple optimizations, achieves performance close to that
of an in-memory database. An asymmetric read and write
data block storage system as well as a daily incremental
major compaction have been designed and implemented.
• We propose a Paxos-based 2PC, named OceanBase 2PC, to
improve the distributed transaction processing capability
and reduce the transaction latency. It introduces the
Paxos protocol into 2PC, thus giving distributed transactions
automatic fault tolerance. Compared with
traditional 2PC, the state of the coordinator is not persisted
in OceanBase 2PC, thereby reducing the number of Paxos