
[Figure: two bar charts, "PageRank (sec)" and "Connected Component (sec)", showing per-iteration gather and scatter times on four machines M1–M4, annotated with each machine's vertex (|V|) and edge (|E|) counts.]
Figure 1: Elapsed time of gather and scatter operations on PowerGraph using the NE partitioner.
excessively large partitions; (II) the sizes of these partitions are highly skewed, indicating the coexistence of partitions with a significantly large number of vertices and partitions with very few vertices. Focusing on only a single balance while minimizing the replication factor can lead to two troublesome problems, as follows.
(1) Terrible Workload Imbalance. In Bulk Synchronous Parallel (BSP) based systems, a global synchronization barrier is set between two consecutive iterations. Thus, the machine with the largest workload becomes the performance bottleneck. Figure 1 illustrates the elapsed time of gather and scatter operations on PowerGraph [11] for the first two iterations of the PageRank and Connected Component tasks on four machines (M1, M2, M3, and M4), along with the numbers of involved vertices and edges. While the number of involved edges is roughly consistent across the four machines in the first iteration, there is a notable discrepancy in elapsed time. Discrepancies in the number of vertices can lead to differences in cache hit rates, as well as disparities in the numbers of involved edges in the subsequent iteration. For example, in the second iteration of PageRank, the discrepancies in the involved edges become significant in the scatter phase, resulting in variations in machine elapsed time.
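The straggler effect under BSP can be made concrete with a small sketch (the machine names and timings below are illustrative, not the measurements from Figure 1):

```python
# Hypothetical per-machine gather/scatter times (seconds) for one
# iteration; values are illustrative, not Figure 1's measurements.
gather = {"M1": 1.2, "M2": 1.1, "M3": 1.3, "M4": 2.6}
scatter = {"M1": 0.8, "M2": 0.9, "M3": 0.8, "M4": 1.5}

# Under BSP, a global barrier follows each phase, so iteration time
# is determined by the slowest machine in each phase.
iteration_time = max(gather.values()) + max(scatter.values())

# Time the other machines spend idle waiting at the barriers.
idle = sum(max(gather.values()) - t for t in gather.values()) \
     + sum(max(scatter.values()) - t for t in scatter.values())

print(round(iteration_time, 1), round(idle, 1))
```

Even though the aggregate work is fixed, shifting load off the slowest machine directly shortens the iteration; balance, not just total cost, governs BSP performance.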
(2) Excessive Memory Consumption. In distributed graph computing frameworks deployed over homogeneous and memory-constrained clusters, each machine needs to maintain necessary information, such as neighbor sets and PageRank values, for its vertices. Excessively large partitions due to vertex imbalance significantly increase memory consumption. In PowerGraph, the partition with the largest number of vertices becomes the memory bottleneck. As depicted in Table 1, for the graph hollywood, the largest partition delivered by METIS reaches 8 times the average size. The machine hosting such a large partition is highly susceptible to memory bottlenecks, which can result in downstream task failures.
Therefore, it is crucial to minimize the replication factor while guaranteeing both vertex and edge balance. However, despite its importance, dual-balanced graph partitioning has not yet received sufficient attention and research.
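For concreteness, the quantities involved — the replication factor and the two balance factors — can be computed for an edge partitioning as follows (a hedged sketch in our own notation, not the paper's code):

```python
from collections import defaultdict

def partition_metrics(edge_assignment, num_parts):
    """Given an edge partitioning {(u, v): partition_id}, return the
    replication factor and the vertex/edge balance factors.
    Illustrative sketch, not the paper's implementation."""
    part_edges = defaultdict(int)   # partition -> number of edges
    part_verts = defaultdict(set)   # partition -> hosted vertices
    replicas = defaultdict(set)     # vertex -> partitions hosting it
    for (u, v), p in edge_assignment.items():
        part_edges[p] += 1
        for x in (u, v):
            part_verts[p].add(x)
            replicas[x].add(p)
    n, m = len(replicas), len(edge_assignment)
    rf = sum(len(ps) for ps in replicas.values()) / n  # avg replicas/vertex
    vb = max(len(vs) for vs in part_verts.values()) / (n / num_parts)
    eb = max(part_edges.values()) / (m / num_parts)
    return rf, vb, eb

# A 4-cycle split into two partitions: edge balance is perfect (1.0),
# but replication inflates the vertex load of each partition.
print(partition_metrics({(0, 1): 0, (1, 2): 0, (2, 3): 1, (3, 0): 1}, 2))
```

Note how vertex imbalance can arise even under perfect edge balance, which is exactly the gap a dual-balanced partitioner must close.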
1.2 Challenges and Contributions
Challenge 1: Intractable Complexity. A previous study [39] has established that the graph partitioning problem of minimizing the replication factor such that the edge imbalance is bounded by a constant is NP-hard. This inherent complexity, coupled with large graphs containing billions of vertices and edges, already poses significant challenges in terms of time and memory. The dual-balanced graph partitioning problem generalizes single-balanced graph partitioning by introducing an additional constraint, making it even more difficult to find an optimal solution that satisfies both balances simultaneously. Therefore, the partitioning algorithm itself is expected to be lightweight and efficient. These requirements force us to avoid overly complex algorithms and redundant data structures.
Challenge 2: Skewed Degrees. Most real-world graphs follow a power-law distribution, exhibiting a skewed degree distribution in which the majority of vertices have low degrees, while only a small subset of vertices, known as hub vertices, are highly connected. Hub vertices play a crucial role in influencing the density of partitions, posing a significant challenge to achieving dual balance.
Existing dual-balanced partitioners [1, 24, 40] have made efforts to address these challenges. EBV achieves dual balance by incorporating a scoring function that considers both vertex and edge loads. BPart modifies FENNEL [36] and requires simultaneous optimization of both vertex and edge balance. MDBGP [1] transforms the multi-dimensional balanced partitioning problem into a mathematical optimization problem and solves it using gradient descent, but it constrains the number of partitions to powers of two. Beyond that, the optimization process of MDBGP involves multiple rounds of O(n²) intersection-point calculations, which is not feasible for partitioning large graphs. To summarize, these partitioners (I) suffer from poor replication factors, resulting in a substantial communication cost that becomes a bottleneck again, and (II) do not support constraints on vertex-and-edge balance parameters or fail to achieve the desired constraints, leaving a notable imbalance in both the vertex and edge dimensions for many graphs.
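A greedy, dual-load-aware score in the spirit of such approaches can be sketched as follows (the weights and exact form are our illustrative assumptions, not EBV's published formula):

```python
def pick_partition(edge, v_load, e_load, placed, alpha=1.0, beta=1.0):
    """Choose a partition for an edge: reward partitions that already
    host the edge's endpoints (curbing replication) while penalizing
    relative vertex and edge load. Hedged sketch; alpha and beta are
    assumed trade-off weights, not a published formula."""
    u, v = edge
    k = len(e_load)
    avg_e = max(sum(e_load) / k, 1)
    avg_v = max(sum(v_load) / k, 1)

    def score(p):
        # How many endpoints partition p already hosts.
        locality = sum(1 for x in (u, v) if p in placed.get(x, set()))
        return -locality + alpha * e_load[p] / avg_e + beta * v_load[p] / avg_v

    return min(range(k), key=score)

# With both partitions equally loaded, locality breaks the tie:
# vertex 1 already lives on partition 0, so the edge goes there.
print(pick_partition((1, 2), [2, 2], [2, 2], {1: {0}}))
```

The tension is visible in the score itself: when loads diverge, the load terms override locality, which is how such heuristics can end up trading replication factor against balance.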
Contributions. To realize dual-balanced graph partitioning, in this paper, we propose the Fine-grained Splitting and Merging (FSM) framework. The underlying principle behind FSM is to deal with vertex balance and edge balance incrementally, ensuring them one by one. The framework consists of two phases, i.e., fine-grained splitting and subgraph merging.
Specifically, in the fine-grained splitting phase, FSM performs a primary exploration of the graph by decomposing it into small subgraphs. The aim is to group highly correlated vertices and edges together, producing a family of fundamental subgraphs. Moreover, in this phase, we can concentrate on just one balance at a time while minimizing the replication factor. As a result, the framework becomes highly versatile and powerful, making it possible to integrate various state-of-the-art single-balance techniques.
In the subgraph merging phase, we assemble the subgraphs produced above into larger partitions. This lets us consider the partitioning problem from a broader perspective, enabling us to neutralize the imbalances effectively and enhance the overall performance of the partitioning process. In particular, we formulate a subgraph allocation problem and develop two lightweight and effective merging algorithms, Fast Merging and Precise Merging.
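To illustrate the flavor of the merging phase, here is a hedged sketch of a greedy subgraph allocation that tracks both dimensions at once (our simplification for exposition, not the paper's actual Fast or Precise Merging algorithms):

```python
import heapq

def merge_subgraphs(subgraphs, k):
    """Assign subgraphs, given as (num_vertices, num_edges) pairs, to
    k partitions: largest subgraph first, always onto the partition
    with the smallest combined normalized load. Illustrative sketch,
    not the paper's Fast/Precise Merging."""
    total_v = sum(v for v, _ in subgraphs)
    total_e = sum(e for _, e in subgraphs)

    def load(v, e):
        # Normalize so vertices and edges contribute comparably.
        return v / total_v + e / total_e

    heap = [(0.0, p, 0, 0) for p in range(k)]  # (load, part, verts, edges)
    heapq.heapify(heap)
    assignment = [None] * len(subgraphs)
    # Largest-first placement (LPT heuristic) reduces final imbalance.
    order = sorted(range(len(subgraphs)),
                   key=lambda i: load(*subgraphs[i]), reverse=True)
    for i in order:
        _, p, pv, pe = heapq.heappop(heap)
        sv, se = subgraphs[i]
        assignment[i] = p
        heapq.heappush(heap, (load(pv + sv, pe + se), p, pv + sv, pe + se))
    return assignment

print(merge_subgraphs([(4, 10), (3, 8), (2, 5), (1, 2)], 2))
```

Because each subgraph is small relative to the final partitions, even a simple largest-first heuristic has room to even out both dimensions; the merging phase exploits exactly this late-stage freedom to trade the two balances against each other.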
Our contributions can be summarized as follows:
(1) We formulate the problem of dual-balanced graph partitioning, which minimizes the replication factor while guaranteeing both vertex balance and edge balance.