Figure 1: Three-stage resource flow (users, non-empty machines, empty machines, offline machines, suppliers, within a cluster).
1. Frequent and variant resource requests. The database workload varies in response to changes in the user's data size, data distribution, query types, or, more generally, application-layer characteristics. Unlike traditional database services, where users choose the configuration of their database instances and revisit it on a weekly or even monthly basis, decisions on scaling up/down are made every several seconds or minutes.
2. Low tolerance for delayed resource allocation. Resource requests often reflect the resources necessary to retain SLAs (e.g., query throughput and latency) and are therefore highly time-sensitive. Resource allocation decisions, and the actual allocation, need to be made within strong latency (e.g., sub-second) guarantees.
The first challenge means that the problem of resource stranding (under the best-fit scheduling algorithm) becomes more prominent, which in turn decreases allocation availability. On the other hand, an even resource distribution (under the worst-fit scheduling algorithm) penalizes large resource requests: accommodating them requires a large portion of resources on a single machine, which is not readily available and requires time-consuming migration or adding new machines to the cluster. Meanwhile, allocation availability also depends on the cluster size (i.e., the total amount of resources we can provide), and it is impractical to add new machines from the supply chain instantaneously. This drives service providers to deliberately overestimate cluster size, resulting in wasted power consumption by underutilized machines and an increased carbon emission footprint.
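The tension between best-fit stranding and worst-fit's penalty on large requests can be seen in a toy single-dimension placement sketch (capacities and request sizes are illustrative, not production values):

```python
# Toy comparison of best-fit vs. worst-fit placement on one
# resource dimension.

def place(requests, machines, pick):
    """Assign each request to a machine chosen by `pick`; return failures."""
    failed = []
    for r in requests:
        candidates = [i for i, free in enumerate(machines) if free >= r]
        if not candidates:
            failed.append(r)
            continue
        machines[pick(candidates, machines)] -= r
    return failed

best_fit = lambda cands, m: min(cands, key=lambda i: m[i])   # tightest fit
worst_fit = lambda cands, m: max(cands, key=lambda i: m[i])  # evenest spread

# Two small requests, then one large request of size 8.
print(place([5, 5, 8], [10, 10], best_fit))   # → []: large request still fits
print(place([5, 5, 8], [10, 10], worst_fit))  # → [8]: even spread strands it
```

Here best-fit keeps one machine whole for the large request, while worst-fit spreads the small requests evenly and leaves no machine with 8 free units.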
To address these challenges, we adopt a cascading resource flow model that divides nodes into three types (shown in Figure 1): non-empty nodes, empty nodes, and offline nodes. To simultaneously maximize the resource allocation ratio and resource availability, we believe resources across the entire flow should be optimized jointly (i.e., end-to-end resource optimization):
• Resource optimization of non-empty machines. We identify two major challenges when adopting classic heuristic bin packing algorithms for resource scheduling on non-empty machines. First, cloud database instances require multi-dimensional and heterogeneous resources, which makes resource scheduling complex: suboptimal allocations can skew resource dimensions and consequently strand resources. Second, migration cost is not negligible when clusters are consolidated. We discuss optimal consolidation solutions under migration cost constraints.
• Resource optimization of empty machines. To guarantee high resource availability, especially under the strict latency constraint (see Challenge 2), it is reasonable to maintain a pool of empty machines as a safety net for requests that cannot be accommodated on the non-empty machines. However, it wastes power to keep a large pool of machines online while a significant portion of them remains empty. We therefore want to keep just enough empty machines online to avoid degrading resource availability.
• Resource optimization of offline machines. Offline machines are a shared resource pool for all database products. From a cloud vendor's perspective, the maintenance and optimization of offline machines are also important (in terms of operational cost), but rarely discussed. In practice, cloud vendors purchase machines from suppliers, and the whole process, from placing an order to having the machines delivered, typically takes weeks to months. It is challenging to evaluate the optimal number of offline machines over a long-term horizon.
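The dimension-skew problem noted for non-empty machines can be illustrated with a toy feasibility check (the numbers and the `fits` helper are hypothetical, not Eigen's API): a machine rich in total resources can still reject a modest request once a single dimension is exhausted.

```python
# A request fits only if it fits in EVERY resource dimension.
def fits(request, free):
    return all(r <= f for r, f in zip(request, free))

# (cpu_cores, memory_gb) free on one machine after skewed allocations:
free = (0.5, 40.0)          # CPU nearly exhausted, memory abundant
request = (2.0, 4.0)        # a modest instance
print(fits(request, free))  # → False: 40 GB stranded behind 0.5 free cores
```

This is why multi-dimensional allocations that skew one dimension effectively strand the remaining capacity in the others.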
Based on our experience in resource management for Alibaba's database services, we build Eigen, a large-scale, cloud-native cluster management system that features end-to-end resource optimization. We summarize the contributions of this paper as follows:
• Based on the resource flow model, we propose a hierarchical resource management system that enables three novel resource optimization algorithms: (1) Vectorized Resource Optimization, a heuristic bin packing algorithm that consolidates non-empty machines in the course of resource allocation; (2) Exponential Smoothing with Smoothed Adaptive Margins, which proactively scales up/down empty machines over a short-term horizon; (3) Temporal CNN (TCN) with Minimum-stock Policy, which optimizes the number of offline machines over a long-term horizon.
• We introduce optimizations for fast scheduling, including master-agent collaborative scheduling and cold instance eviction.
• We evaluate the proposed algorithms and the overall performance of Eigen on large-scale production clusters with real-world workloads. The evaluation results show that the proposed algorithms significantly increase the resource allocation ratios of cloud databases with a negligible rise in the failed/delayed ratio of allocation requests.
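As a rough intuition for the second algorithm, here is a minimal exponential-smoothing sketch for sizing the empty-machine pool. The margin policy shown (a fixed multiple of the smoothed demand deviation) is our assumption for illustration; the paper's Smoothed Adaptive Margins may differ.

```python
# Smooth observed empty-machine demand and add a safety margin,
# yielding a target pool size per interval.
def smooth_pool_targets(demand, alpha=0.3, beta=0.3, k=2.0):
    level, dev, targets = demand[0], 0.0, []
    for d in demand:
        err = abs(d - level)
        level = alpha * d + (1 - alpha) * level   # smoothed demand
        dev = beta * err + (1 - beta) * dev       # smoothed deviation
        targets.append(level + k * dev)           # demand + margin
    return targets

# Empty machines consumed per interval (illustrative):
demand = [4, 5, 3, 8, 6, 5]
print([round(t, 1) for t in smooth_pool_targets(demand)])
```

The margin widens after a demand spike and decays afterwards, so the pool stays just large enough to absorb bursts without keeping many idle machines online.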
2 BACKGROUND AND MOTIVATION
2.1 Resource Scheduling and Optimization
The resource scheduler is a fundamental component of cluster management systems, responsible for allocating resources (such as CPUs, memory, and disks) to jobs. Notable examples include Kubernetes' Kube-scheduler [14], Twine's allocator [28], and Borg's scheduler [30]. Take the Kube-scheduler for instance: it is responsible for assigning a newly created pod to a machine according to a predetermined scheduling policy. Scheduling consists of two steps: filtering and scoring. The filtering step searches for all feasible machines on which a pod can be scheduled, and the scoring step chooses the machine deemed most suitable for the pod (based on scores calculated using a scoring strategy). Users can customize scoring strategies to implement different resource scheduling algorithms.
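The filter-then-score loop can be sketched as follows. This is a simplified illustration, not the Kube-scheduler's actual code; the best-fit-style scoring shown is just one possible strategy, and the machine fields are hypothetical.

```python
# Two-step scheduling: filter feasible machines, then score them.
def schedule(pod, machines):
    # Filtering: keep machines with enough free CPU and memory.
    feasible = [m for m in machines
                if m["cpu_free"] >= pod["cpu"] and m["mem_free"] >= pod["mem"]]
    if not feasible:
        return None  # no machine can host the pod
    # Scoring: prefer the machine left with the least free CPU (best-fit).
    return min(feasible, key=lambda m: m["cpu_free"] - pod["cpu"])

machines = [{"name": "m1", "cpu_free": 8, "mem_free": 32},
            {"name": "m2", "cpu_free": 2, "mem_free": 16}]
print(schedule({"cpu": 2, "mem": 4}, machines)["name"])  # → "m2": tightest fit
```

Swapping the `min` for a `max` in the scoring step would turn the same loop into a worst-fit (spreading) strategy, which is exactly the kind of customization the scoring step exposes.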
Resource optimization usually aims to maximize resource allocation ratios (or utilization ratios). Among previous works, there are roughly two types of resource optimization approaches. The first reduces resource optimization to the classic bin packing problem. For example, the Kube-scheduler provides two scoring strategies (both variants of best-fit) to support bin packing of resources [15]. Another approach utilizes statistical and machine learning techniques, notably time series forecasting, to improve utilization ratios by predicting future workloads and reclaiming underutilized resources. For example, Autopilot [24] uses an