Manu: A Cloud Native Vector Database Management System
Rentong Guo, Xiaofan Luan, Long Xiang, Xiao Yan, Xiaomeng Yi, Jigao Luo†§, Qianya Cheng, Weizhi Xu, Jiarui Luo, Frank Liu, Zhenshan Cao, Yanliang Qiao, Ting Wang, Bo Tang, Charles Xie
Zilliz
Southern University of Science and Technology
§Technical University of Munich
{firstname.lastname}@zilliz.com
{xiangl3@mail., yanx@, 11911419@mail., tangb3@}sustech.edu.cn
§jigao.luo@tum.de
ABSTRACT
With the development of learning-based embedding models, embedding vectors are widely used for analyzing and searching unstructured data. As vector collections exceed billion-scale, fully managed and horizontally scalable vector databases are necessary. In the past three years, through interaction with our 1200+ industry users, we have sketched a vision for the features that next-generation vector databases should have, which include long-term evolvability, tunable consistency, good elasticity, and high performance.
We present Manu, a cloud native vector database that implements these features. It is difficult to integrate all these features if we follow traditional DBMS design rules. As most vector data applications do not require complex data models and strong data consistency, our design philosophy is to relax the data model and consistency constraints in exchange for the aforementioned features. Specifically, Manu firstly exposes the write-ahead log (WAL) and binlog as backbone services. Secondly, write components are designed as log publishers while all read-only analytic and search components are designed as independent subscribers to the log services. Finally, we utilize multi-version concurrency control (MVCC) and a delta consistency model to simplify the communication and cooperation among the system components. These designs achieve a low coupling among the system components, which is essential for elasticity and evolution. We also extensively optimize Manu for performance and usability with hardware-aware implementations and support for complex search semantics. Manu has been used for many applications, including, but not limited to, recommendation, multimedia, language, medicine and security. We evaluated Manu in three typical application scenarios to demonstrate its efficiency, elasticity, and scalability.
PVLDB Reference Format:
Rentong Guo, Xiaofan Luan, Long Xiang, Xiao Yan, Xiaomeng Yi, Jigao Luo,
Qianya Cheng, Weizhi Xu, Jiarui Luo, Frank Liu, Zhenshan Cao, Yanliang
Qiao, Ting Wang, Bo Tang, and Charles Xie. Manu: A Cloud Native Vector
Database Management System. PVLDB, 15(12): 3548 - 3561, 2022.
doi:10.14778/3554821.3554843
Co-rst-authors are ordered alphabetically.
Work done while working with Zilliz, correspondence to Bo Tang.
This work is licensed under the Creative Commons BY-NC-ND 4.0 International
License. Visit https://creativecommons.org/licenses/by-nc-nd/4.0/ to view a copy of
this license. For any use beyond those covered by this license, obtain permission by
emailing info@vldb.org. Copyright is held by the owner/author(s). Publication rights
licensed to the VLDB Endowment.
Proceedings of the VLDB Endowment, Vol. 15, No. 12 ISSN 2150-8097.
doi:10.14778/3554821.3554843
PVLDB Artifact Availability:
The source code, data, and/or other artifacts have been made available at
https://github.com/milvus-io/milvus/tree/2.0.
1 INTRODUCTION
According to IDC, unstructured data, such as text, images, and video, took up about 80% of the 40,000 exabytes of new data generated in 2020, and this percentage keeps rising due to the increasing amount of human-generated rich media [47]. With the rise of learning-based embedding models, especially deep neural networks, using embedding vectors to manage unstructured data has become commonplace in many applications such as e-commerce, social media, and drug discovery [48, 62, 67]. A core feature of these applications is that they encode the semantics of unstructured data into a high-dimensional vector space. Given the representation power of embedding vectors, operations like recommendation, search, and analysis can be implemented via similarity-based vector search. To support these applications, many specialized vector databases are built to manage vector data [10, 12, 17–19, 80].
In 2019, we open sourced Milvus [80], our previous vector database, under the LF AI & Data Foundation. Since then, we have collected feedback from more than 1200 industry users and found that some of the design principles adopted by Milvus are not suitable. Milvus followed the design principles of relational databases, which are optimized for either transaction [51] or analytical [80] workloads, and focused on functionality support (e.g., attribute filtering and multi-vector search) and execution efficiency (e.g., SIMD and cache optimizations). However, vector database applications have different requirements in the following three aspects, which motivates us to restructure Manu from scratch with a focus on a cloud-native architecture.
• Support for complex transactions is not necessary. Instead of decomposing entity representations into different fields or tables, learning-based models encode complex and hybrid data semantics into a single vector. As a result, multi-row or multi-table transactions are not necessary; row-level ACID is sufficient for the majority of vector database applications.
• A tunable performance-consistency trade-off is important. Different users have different consistency requirements; some users prefer high throughput and eventual consistency, while others require some level of guaranteed consistency, i.e., newly inserted data should be visible to queries either immediately or within a pre-configured time. Traditional relational databases generally support either strong consistency or eventual consistency; there is little to no room for customization between these two extremes. As such, tunable consistency is a crucial attribute for cloud-native vector databases.
• High hardware cost calls for fine-grained elasticity. Some vector database operations (e.g., vector search and index building) are computationally intensive, and hardware accelerators (e.g., GPUs or FPGAs) and/or a large working memory are required for good performance. However, depending on the application type, the workload differs among database functionalities. Thus, resources can be wasted or improperly allocated if the vector database does not have fine-grained elasticity. This necessitates a careful decoupling of functional and hardware layers. System-level decoupling, such as separating read and write logic, is insufficient; elasticity and resource isolation should be managed at the functionality level rather than the system level.
In summary, modern vector databases should have tunable consistency, functionality-level decoupling, and per-component scalability. Following the design principles of traditional relational databases makes achieving these design goals extremely difficult, if not impossible. A key opportunity for achieving these design goals lies in the potential for relaxing transaction complexity.
Manu follows the “log as data” paradigm. Specifically, Manu structures the entire system as a group of log publish/subscribe micro-services. The write-ahead log (WAL) and inter-component messages are published as “logs”, i.e., durable data streams that can be subscribed to. Read-side components, such as search and analytical engines, are all built as log subscribers. This architecture provides a simple yet effective way to decouple system functionalities; it enables the decoupling of read from write, stateless from stateful, and storage from computing. Each log entry is assigned a globally unique timestamp, and special log entries called time-ticks (similar to watermarks in Apache Flink [25]) are periodically inserted into each log channel to signal the progress of event time to log subscribers. The timestamp and time-tick form the basis of the tunable consistency mechanism and multi-version concurrency control (MVCC). To control the consistency level, a user can specify a tolerable time lag between a query’s timestamp and the latest time-tick consumed by a subscriber.
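To make the time-tick mechanism concrete, the following is a minimal sketch, not Manu's implementation. The names `LogEntry`, `Subscriber`, `can_serve`, and `tolerable_lag` are hypothetical, timestamps are plain integers, and a time-tick is assumed to behave like a watermark: once a subscriber has consumed a tick, it has seen every entry with a smaller timestamp on that channel.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class LogEntry:
    ts: int                        # globally unique timestamp (assumed to be an integer here)
    payload: Optional[str] = None  # None marks a time-tick entry (illustrative convention)

    @property
    def is_time_tick(self) -> bool:
        return self.payload is None


@dataclass
class Subscriber:
    """Hypothetical read-side component consuming a single log channel."""
    rows: List[LogEntry] = field(default_factory=list)
    latest_tick: int = 0

    def consume(self, entry: LogEntry) -> None:
        if entry.is_time_tick:
            # Assumption: a tick means all entries with smaller timestamps have been seen.
            self.latest_tick = max(self.latest_tick, entry.ts)
        else:
            self.rows.append(entry)

    def can_serve(self, query_ts: int, tolerable_lag: int) -> bool:
        # The query may run once the subscriber lags the query by at most tolerable_lag.
        return query_ts - self.latest_tick <= tolerable_lag

    def search(self, query_ts: int, tolerable_lag: int) -> Optional[List[str]]:
        if not self.can_serve(query_ts, tolerable_lag):
            return None  # a real system would wait for further ticks instead
        # MVCC-style visibility: only entries stamped at or before the query are returned.
        return [e.payload for e in self.rows if e.ts <= query_ts]


sub = Subscriber()
for entry in [LogEntry(1, "a"), LogEntry(2, "b"), LogEntry(3), LogEntry(4, "c")]:
    sub.consume(entry)

print(sub.search(query_ts=5, tolerable_lag=2))  # lag is 5 - 3 = 2, so the query is served
print(sub.search(query_ts=5, tolerable_lag=0))  # lag exceeds the bound, so None
```

In this sketch, a lag bound of zero corresponds to the strongest setting (the subscriber must have caught up to the query's timestamp), while a very large bound approximates eventual consistency.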
Additionally, we extensively optimize Manu for performance and usability. Manu supports various indexes for vector search, including vector quantization [21, 33, 36, 82], inverted index [23], and proximity graphs [32]. In particular, we tailor the implementations to better utilize the parallelization capabilities of modern CPUs and GPUs along with the improved read/write speeds of SSDs over HDDs. Manu also integrates refactored functionalities from Milvus [80], such as attribute filtering and multi-vector search. Moreover, we build a visualization tool that allows users to track the performance of Manu in real time and include an auto-configuration tool that recommends indexing algorithm parameters using machine learning.
To summarize, this paper makes the following contributions:
• We summarize lessons learned from communicating with over 1200 industry users over three years. We shed light on typical application requirements of vector databases and show how they differ from those of traditional relational databases. We then outline the key design goals that vector databases should meet.
• We introduce Manu’s key architectural designs as a cloud native vector database, building around the core design philosophy of relaxing transaction complexity in exchange for tunable consistency and fine-grained elasticity.
• We present important usability and performance-related enhancements, e.g., high-level API, a GUI tool, automatic parameter configuration, and SSD support.
The rest of the paper is organized as follows. Section 2 provides background on the requirements and design goals of vector databases. Section 3 dives deep into Manu’s design. Section 4 highlights the key features for usability and performance. Section 5 discusses representative use cases for Manu. Section 6 reviews related work. Section 7 concludes the paper and outlines future work.
2 BACKGROUND AND MOTIVATION
Consider video recommendation as a typical use case of vector databases. The goal is to help users discover new videos based on their personal preferences and previous browsing history. Using machine learning models (especially deep neural networks), features of users and videos, such as search history, watch history, age, gender, video language, and tags, are converted to embedding vectors. These models are carefully designed and trained to encode the similarity between user and video vectors into a common vector space. Recommendation is conducted by retrieving candidate videos from the collection of video vectors via similarity scores with respect to the specified user vector. The system also needs to handle updates to vectors when new videos are uploaded, some videos are deleted, or the embedding model is changed.
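The retrieval step described above can be sketched as a brute-force top-k scan. This is an illustration only: the text does not name a similarity function, so cosine similarity is an assumption, the `cosine` and `recommend` helpers and the toy vectors are hypothetical, and a vector database would use an index rather than scanning every vector.

```python
import heapq
import math
from typing import Dict, List, Sequence, Tuple


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 when either vector has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def recommend(
    user_vec: Sequence[float],
    video_vecs: Dict[str, Sequence[float]],
    k: int,
) -> List[Tuple[str, float]]:
    """Return the k videos most similar to the user vector, best first."""
    scored = ((vid, cosine(user_vec, vec)) for vid, vec in video_vecs.items())
    return heapq.nlargest(k, scored, key=lambda pair: pair[1])


# Toy two-dimensional embeddings; real embeddings have hundreds of dimensions.
videos = {"cooking": [1.0, 0.0], "travel": [0.0, 1.0], "baking": [0.9, 0.1]}
user = [1.0, 0.0]

print([vid for vid, _ in recommend(user, videos, k=2)])

# Deleting a video changes the candidate set for later queries.
del videos["cooking"]
print([vid for vid, _ in recommend(user, videos, k=2)])
```

The scan costs time linear in the collection size per query, which is why collections at the scale discussed in this section depend on approximate indexes.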
Video recommendation and other applications of vector databases can involve hundreds of billions of vectors with daily growth at hundred-million scale, and serve million-scale queries per second (QPS). Existing DBMSs (e.g., relational databases [9, 11], NoSQL [75, 85], NewSQL [39, 73]) were not built to manage vector data on that scale. Moreover, the underlying data management requirements of the applications these systems target differ greatly from those of vector database applications.
First, when compared with relational databases, both the architecture and theory of vector databases are far from mature. A key reason for this is that AI- and data-driven applications are still in a state of constant evolution, thereby necessitating continued architectural and functionality changes to vector databases as well.
Second, complex transactions are unnecessary for vector databases. In the above example, the recommendation system encodes all semantic features of users and videos into standalone vectors as opposed to multi-row or multi-column entity fields in a relational database. As a result, row-level ACID is sufficient; multi-table operations (such as joins) are inessential.
Third, vector database applications need a flexible performance-consistency trade-off. While some applications adopt a strong or eventual consistency model, there are others that fall between the two extremes. Users may wish to relax consistency constraints in exchange for better system throughput. In the video recommendation example, observing a newly uploaded video after several seconds is acceptable but keeping users waiting for recommendation harms user experience. Thus, the application can configure the