54 M. Xie
attribute and p is better than q on at least one attribute. Intuitively, p will have a
higher utility than q w.r.t. all monotonic utility functions. Tuples which are not
dominated by any other tuples are returned in the skyline query. However, since
there is no constraint on the output size of a skyline query, a skyline query can
overwhelm the user with many results (at worst the whole database). Motivated
by this, a query called α-happiness query was studied recently in [27]toovercome
the deficiencies of both the top-k query (which requires the users to specify their
utility functions) and the skyline query (which might have a large output size).
Informally, an α-happiness query computes a set of tuples, with size as small
as possible, that makes the users happy where the degree of happiness is quanti-
fied as the happiness ratio of the user. Specifically, given a set of tuples, a user is
x% happy with the set if the highest utility of tuples in the set is at least x% of
the highest utility of all tuples in the whole database. In this case, we say that the
happiness ratio of the user is x%. Clearly, the happiness ratio is a value from 0 to
1. The larger the happiness ratio, the happier the user. The α-happiness query
guarantees the happiness ratio of an end user is at least α. In practice, more
tuples have to be returned to guarantee a higher happiness level, as exp ected.
However, with more tuples returned, users have to spend more effort to examine
the output, which is not desirable as they did in the skyline query. Hence, we
want the solution to be as small as possible, to ensure the given happiness level.
Consider a car database application. Assume that Alice wants to buy a car
from the car database where each car is described by two attributes, namely
horse power (HP) and miles per gallon (MPG). To help Alice for finding her
desired car, Alice can specify an α value, which represents the least happiness
level she expects. In practice, Alice can set α to be 0.9, indicating that she wants
a set of cars whose highest utility is at least 90% of the highest utility of all cars in
the database. Then, we execute the α-happiness query, which returns a small set
of cars from the database, hoping that Alice will be satisfied (since the happiness
ratio of Alice is at least α, as specified). However, if Alice is not satisfied with
those cars, she can increase the value of α, and execute the α-happiness query
again to obtain more cars with better quality to ensure a higher α.
Although it is NP-hard to solve the α-happiness query [27], various practical
algorithms were proposed in the literature. The best-known previous approach
for the α-happiness query is Cone-Greedy [27]. However, when we experimen-
tally evaluated Cone-Greedy, its execution time is unnecessarily long. This
is because Cone-Greedy did not keep sufficient information and thus, might
perform redundant computation, resulting in a long query time. The situation
is even worse when the user wants to perform multiple α-happiness queries with
different values of
α on the same database, which is common in reality since users
might adjust the value of α to obtain more/less tuples to fit their needs. Moti-
vated by this, we propose two novel approaches which accelerate Cone-Greedy
in both 2-dimensional and d-dimensional space (d>2). Our algorithms are
inspired by the incremental convex hull computation in computational geome-
try, and different from Cone-Greedy, they effectively maintain the information
needed during the computation and re-use them when necessary. Our exper-
iments show that our algorithms substantially outperform Cone-Greedy in
execution time. Our major contributions are summarized as follows:
评论