two extremes. As such, tunable consistency is a crucial attribute
for cloud-native vector databases.
• High hardware cost calls for fine-grained elasticity. Some
vector database operations (e.g., vector search and index building)
are computationally intensive, and hardware accelerators
(e.g., GPUs or FPGAs) and/or a large working memory are required
for good performance. However, depending on the application
type, the workload differs among database functionalities. Thus,
resources can be wasted or improperly allocated if the vector
database does not have fine-grained elasticity. This necessitates
a careful decoupling of functional and hardware layers; system-level
decoupling such as separation of read and write logic is
insufficient. Elasticity and resource isolation should be managed
at the functionality level rather than the system level.
In summary, modern vector databases should have tunable consistency,
functionality-level decoupling, and per-component scalability.
Following the design principles of traditional relational
databases makes achieving these design goals extremely difficult, if
not impossible. A key opportunity for achieving these design goals
lies in the potential for relaxing transaction complexity.
Manu follows the “log as data” paradigm. Specifically, Manu structures
the entire system as a group of log publish/subscribe microservices.
The write-ahead log (WAL) and inter-component messages
are published as “logs”, i.e., durable data streams that can be
subscribed to. Read-side components, such as search and analytical
engines, are all built as log subscribers. This architecture provides
a simple yet effective way to decouple system functionalities; it
enables the decoupling of read from write, stateless from stateful,
and storage from computing. Each log entry is assigned a globally
unique timestamp, and special log entries called time-ticks (similar
to watermarks in Apache Flink [25]) are periodically inserted
into each log channel to signal the progress of event time for log
subscribers. The timestamp and time-tick form the basis of the
tunable consistency mechanism and multi-version concurrency
control (MVCC). To control the consistency level, a user can specify
a tolerable time lag between a query’s timestamp and the latest
time-tick consumed by a subscriber.
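The time-tick mechanism described above can be illustrated with a minimal sketch. This is not Manu's implementation: the names `LogEntry`, `Subscriber`, `consume`, and `can_serve` are hypothetical, timestamps are plain integers, and a single log channel is assumed. It only shows how a subscriber could compare a query's timestamp with the latest time-tick it has consumed to decide whether it is fresh enough.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class LogEntry:
    timestamp: int            # globally unique; assumed to increase within a channel
    payload: Optional[str]    # None marks a time-tick entry (illustrative convention)


@dataclass
class Subscriber:
    """Hypothetical read-side component consuming one log channel in order."""
    applied: List[str] = field(default_factory=list)
    latest_tick: int = 0      # timestamp of the last time-tick consumed

    def consume(self, entry: LogEntry) -> None:
        if entry.payload is None:
            # A time-tick signals that event time has advanced to this point.
            self.latest_tick = max(self.latest_tick, entry.timestamp)
        else:
            self.applied.append(entry.payload)

    def can_serve(self, query_ts: int, tolerable_lag: int) -> bool:
        """True if the lag between the query and the latest tick is tolerable."""
        return query_ts - self.latest_tick <= tolerable_lag


sub = Subscriber()
for e in [LogEntry(1, "insert v1"), LogEntry(2, None), LogEntry(3, "insert v2")]:
    sub.consume(e)

print(sub.can_serve(query_ts=5, tolerable_lag=3))  # lag is 5 - 2 = 3, tolerated
print(sub.can_serve(query_ts=5, tolerable_lag=0))  # must wait for a later tick
```

In this sketch, a lag of zero forces the subscriber to catch up to the query's timestamp before answering, while a very large lag lets it answer immediately from whatever it has already consumed; values in between trade freshness for latency.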
Additionally, we extensively optimize Manu for performance
and usability. Manu supports various indexes for vector search,
including vector quantization [21, 33, 36, 82], inverted index [23], and
proximity graphs [32]. In particular, we tailor the implementations
to better utilize the parallelization capabilities of modern CPUs
and GPUs along with the improved read/write speeds of SSDs over
HDDs. Manu also integrates refactored functionalities from
Milvus [80], such as attribute filtering and multi-vector search.
Moreover, we build a visualization tool that allows users to track the
performance of Manu in real time and include an auto-configuration tool
that recommends indexing algorithm parameters using machine
learning.
To summarize, this paper makes the following contributions:
• We summarize lessons learned from communicating with over
1200 industry users over three years. We shed light on typical
application requirements of vector databases and show how they
differ from those of traditional relational databases. We then
outline the key design goals that vector databases should meet.
• We introduce Manu’s key architectural designs as a cloud-native
vector database, building around the core design philosophy of
relaxing transaction complexity in exchange for tunable consistency
and fine-grained elasticity.
• We present important usability and performance-related
enhancements, e.g., a high-level API, a GUI tool, automatic parameter
configuration, and SSD support.
The rest of the paper is organized as follows. Section 2 provides
background on the requirements and design goals of vector
databases. Section 3 dives deep into Manu’s design. Section 4
highlights the key features for usability and performance. Section 5
discusses representative use cases for Manu. Section 6 reviews
related work. Section 7 concludes the paper and outlines future
work.
2 BACKGROUND AND MOTIVATION
Consider video recommendation as a typical use case of vector
databases. The goal is to help users discover new videos based on
their personal preferences and previous browsing history. Using
machine learning models (especially deep neural networks), features
of users and videos, such as search history, watch history,
age, gender, video language, and tags, are converted to embedding
vectors. These models are carefully designed and trained to encode
the similarity between user and video vectors into a common vector
space. Recommendation is conducted by retrieving candidate
videos from the collection of video vectors via similarity scores
with respect to the specified user vector. The system also needs
to handle updates to vectors when new videos are uploaded, some
videos are deleted, or the embedding model is changed.
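The retrieval step in this example can be sketched as a brute-force top-k search. This is an illustration under stated assumptions, not the paper's method: the function names `inner_product` and `recommend` are hypothetical, inner product is assumed as the similarity score, and every video vector is scanned exactly, whereas a production vector database would typically answer such queries through an index.

```python
import heapq
from typing import Dict, List, Sequence, Tuple


def inner_product(a: Sequence[float], b: Sequence[float]) -> float:
    """Similarity score between two embedding vectors (assumed metric)."""
    return sum(x * y for x, y in zip(a, b))


def recommend(
    user_vec: Sequence[float],
    video_vecs: Dict[str, Sequence[float]],
    k: int,
) -> List[Tuple[str, float]]:
    """Return the k videos most similar to the user vector, best first."""
    scored = ((vid, inner_product(user_vec, vec)) for vid, vec in video_vecs.items())
    return heapq.nlargest(k, scored, key=lambda pair: pair[1])


# Toy collection; the IDs and 2-dimensional vectors are made up for illustration.
videos = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [0.7, 0.7]}
user = [1.0, 0.2]

print([vid for vid, _ in recommend(user, videos, k=2)])

# Updates: a new upload and a deletion simply change the collection.
videos["d"] = [2.0, 0.0]
del videos["b"]
print([vid for vid, _ in recommend(user, videos, k=2)])
```

A change of embedding model is the expensive case: every stored vector would have to be recomputed, since vectors from different models do not share a common space.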
Video recommendation and other applications of vector databases
can involve hundreds of billions of vectors with daily growth at
hundred-million scale, and serve million-scale queries per second
(QPS). Existing DBMSs (e.g., relational databases [9, 11], NoSQL [75,
85], NewSQL [39, 73]) were not built to manage vector data at that
scale. Moreover, the underlying data management requirements of
their applications differ greatly from those of vector database
applications.
First, when compared with relational databases, both the architecture
and theory of vector databases are far from mature. A key
reason for this is that AI- and data-driven applications are still
in a state of constant evolution, thereby necessitating continued
architectural and functionality changes to vector databases as well.
Second, complex transactions are unnecessary for vector databases.
In the above example, the recommendation system encodes all semantic
features of users and videos into standalone vectors as
opposed to multi-row or multi-column entity fields in a relational
database. As a result, row-level ACID is sufficient; multi-table
operations (such as joins) are inessential.
Third, vector database applications need a flexible
performance-consistency trade-off. While some applications adopt a strong or
eventual consistency model, there are others that fall between the
two extremes. Users may wish to relax consistency constraints in
exchange for better system throughput. In the video recommendation
example, observing a newly uploaded video after several
seconds is acceptable but keeping users waiting for recommendations
harms user experience. Thus, the application can configure the