Accelerated Algorithms for α-Happiness Query

Min Xie
Shenzhen Institute of Computing Sciences, Shenzhen University, Shenzhen, China
xiemin@sics.ac.cn
Abstract. Extracting a good representative subset of tuples that meets
the user’s needs from a large database is an important problem in multi-
criteria decision making. Many queries have been proposed for this pur-
pose, including the top-k query and the skyline query. Unfortunately,
these traditional queries either ask the user to specify their needs explic-
itly or overwhelm users with a large output size. Recently, an α-happiness
query was proposed, which overcomes the deficiencies of existing queries:
users do not need to specify any preferences, yet they obtain a small
set of tuples with which they are happy, i.e., their favorite
tuple in the returned subset is guaranteed to be not much worse than
their favorite tuple in the whole database. In this paper, we study the
α-happiness query. Inspired by the techniques of incremental convex hull
computation, we develop two accelerated algorithms, which maintain use-
ful information to avoid redundant computation, in both 2-dimensional
and d-dimensional space (d>2). We performed extensive experiments,
comparing against the best-known method under various settings on
both real and synthetic datasets. Our superiority is demonstrated: we
achieve up to two orders of magnitude and up to 7 times improvement in
execution time in 2-dimensional and d-dimensional space, respectively.
Keywords: α-happiness · Incremental convex hull · Decision making
1 Introduction
Nowadays, a database system usually contains millions of tuples and end users
may only want to find those tuples that fit their needs. This problem is known
as multi-criteria decision making [5,18,19], and various queries were proposed
to obtain a small representative subset of tuples without asking the user to scan
the whole database. An example is the traditional top-k query [18,19], where a
user has to provide her preference function, called the utility function. Based on
the user’s utility function, the utility of each tuple for this user can be computed,
where a higher utility means that the tuple is more preferred. Finally, the best k
tuples with the highest utilities are returned. Unfortunately, requiring the user
to provide the exact utility function is too restrictive in many scenarios. In this
case, the skyline query can be applied [5], which adopts the “dominance” concept.
A tuple p is said to dominate another tuple q if p is not worse than q on each
attribute and p is better than q on at least one attribute. Intuitively, p will have a
higher utility than q w.r.t. all monotonic utility functions. Tuples which are not
dominated by any other tuples are returned in the skyline query. However, since
there is no constraint on the output size of a skyline query, a skyline query can
overwhelm the user with many results (at worst the whole database). Motivated
by this, a query called the α-happiness query was studied recently in [27] to overcome
the deficiencies of both the top-k query (which requires the users to specify their
utility functions) and the skyline query (which might have a large output size).
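To make the two traditional queries concrete, below is a minimal sketch (ours, not the paper's code) of a top-k query under one assumed linear utility function and of the skyline computed with the dominance test just described; larger attribute values are assumed to be better, and the car data are hypothetical.

# Toy illustration: a top-k query under one assumed linear utility function,
# and the skyline computed via pairwise dominance checks.
# Larger attribute values are assumed to be better.

def utility(t, weights):
    # Linear utility: a weighted sum of the attribute values; the weights
    # encode one particular user's preference.
    return sum(w * x for w, x in zip(weights, t))

def top_k(data, weights, k):
    # The k tuples with the highest utilities for this user.
    return sorted(data, key=lambda t: utility(t, weights), reverse=True)[:k]

def dominates(p, q):
    # p dominates q: p is not worse on every attribute and strictly better
    # on at least one attribute.
    return all(a >= b for a, b in zip(p, q)) and any(a > b for a, b in zip(p, q))

def skyline(data):
    # Tuples that are not dominated by any other tuple.
    return [p for p in data if not any(dominates(q, p) for q in data if q != p)]

cars = [(300, 25), (250, 30), (200, 40), (180, 20)]  # hypothetical (HP, MPG) values
print(top_k(cars, weights=(0.7, 0.3), k=2))          # needs the user's exact weights
print(skyline(cars))                                 # weight-free, but size is uncontrolled

The top-k result depends on the exact weights the user supplies, whereas the skyline needs no weights but its size is not bounded; this is precisely the trade-off the α-happiness query addresses.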
Informally, an α-happiness query computes a set of tuples, with size as small
as possible, that makes the users happy, where the degree of happiness is quanti-
fied as the happiness ratio of the user. Specifically, given a set of tuples, a user is
x% happy with the set if the highest utility of tuples in the set is at least x% of
the highest utility of all tuples in the whole database. In this case, we say that the
happiness ratio of the user is x%. Clearly, the happiness ratio is a value from 0 to
1. The larger the happiness ratio, the happier the user. The α-happiness query
guarantees the happiness ratio of an end user is at least α. In practice, more
tuples have to be returned to guarantee a higher happiness level, as expected.
However, with more tuples returned, users have to spend more effort to examine
the output, which is undesirable, just as in the skyline query. Hence, we
want the solution to be as small as possible, to ensure the given happiness level.
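The informal description above can be stated more precisely as follows (the notation is ours, not quoted from the paper): for a subset S of the database D and a utility function f drawn from the class F of utility functions under consideration (e.g., monotonic functions), the happiness ratio is the ratio between the best utility inside S and the best utility over D, and the α-happiness query asks for a smallest subset whose worst-case happiness ratio is at least α:

\[
  \mathrm{hr}(S, f) \;=\; \frac{\max_{p \in S} f(p)}{\max_{p \in D} f(p)},
  \qquad
  \min_{S \subseteq D} |S|
  \quad \text{subject to} \quad
  \min_{f \in \mathcal{F}} \mathrm{hr}(S, f) \,\ge\, \alpha .
\]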
Consider a car database application. Assume that Alice wants to buy a car
from the car database where each car is described by two attributes, namely
horse power (HP) and miles per gallon (MPG). To help Alice find her
desired car, she can specify an α value, which represents the least happiness
level she expects. In practice, Alice can set α to be 0.9, indicating that she wants
a set of cars whose highest utility is at least 90% of the highest utility of all cars in
the database. Then, we execute the α-happiness query, which returns a small set
of cars from the database, hoping that Alice will be satisfied (since the happiness
ratio of Alice is at least α, as specified). However, if Alice is not satisfied with
those cars, she can increase the value of α, and execute the α-happiness query
again to obtain more cars with better quality, so that the higher happiness level is met.
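To sketch what the guarantee in Alice's example amounts to, the snippet below (with hypothetical numbers and a single assumed linear utility function, whereas the query's guarantee must hold for every utility function it considers) checks whether a returned set of cars makes a user at least 90% happy.

def happiness_ratio(subset, database, utility):
    # Highest utility achievable within the returned subset, relative to the
    # highest utility achievable over the whole database.
    return max(utility(p) for p in subset) / max(utility(p) for p in database)

cars = [(300, 25), (250, 30), (200, 40), (180, 20)]  # hypothetical (HP, MPG) values
returned = [(300, 25), (200, 40)]                    # a possible query answer
u = lambda car: 0.7 * car[0] + 0.3 * car[1]          # one user's (assumed) utility

alpha = 0.9
print(happiness_ratio(returned, cars, u) >= alpha)   # is this user at least 90% happy?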
Although it is NP-hard to solve the α-happiness query [27], various practical
algorithms were proposed in the literature. The best-known previous approach
for the α-happiness query is Cone-Greedy [27]. However, when we experimen-
tally evaluated Cone-Greedy, we found that its execution time is unnecessarily long. This
is because Cone-Greedy does not keep sufficient information and thus might
perform redundant computation, resulting in a long query time. The situation
is even worse when the user wants to perform multiple α-happiness queries with
different values of α on the same database, which is common in reality since users
might adjust the value of α to obtain more/less tuples to fit their needs. Moti-
vated by this, we propose two novel approaches which accelerate Cone-Greedy
in both 2-dimensional and d-dimensional space (d>2). Our algorithms are
inspired by the incremental convex hull computation in computational geome-
try, and, unlike Cone-Greedy, they effectively maintain the information
needed during the computation and re-use it when necessary. Our exper-
iments show that our algorithms substantially outperform Cone-Greedy in
execution time. Our major contributions are summarized as follows: