
NeutronStream: A Dynamic GNN Training Framework with
Sliding Window for Graph Streams
Chaoyi Chen, Northeastern University, China (chenchaoy@stumail.neu.edu.cn)
Dechao Gao, Northeastern University, China (gaodechao@stumail.neu.edu.cn)
Yanfeng Zhang, Northeastern University, China (zhangyf@mail.neu.edu.cn)
Qiange Wang, Northeastern University, China (wangqiange@stumail.neu.edu.cn)
Zhenbo Fu, Northeastern University, China (fuzhenbo@stumail.neu.edu.cn)
Xuecang Zhang, Huawei Technologies Co., Ltd. (zhangxuecang@huawei.com)
Junhua Zhu, Huawei Technologies Co., Ltd. (junhua.zhu@huawei.com)
Yu Gu, Northeastern University, China (guyu@mail.neu.edu.cn)
Ge Yu, Northeastern University, China (yuge@mail.neu.edu.cn)
ABSTRACT
Existing Graph Neural Network (GNN) training frameworks have
been designed to help developers easily create performant GNN
implementations. However, most existing GNN frameworks assume
that the input graphs are static, ignoring that most real-world
graphs are constantly evolving. Though many dynamic GNN mod-
els have emerged to learn from evolving graphs, the training process
of these dynamic GNNs is dramatically different from traditional
GNNs in that it captures both the spatial and temporal dependencies
of graph updates. This poses new challenges for designing dynamic
GNN training frameworks. First, the traditional batched training
method fails to capture real-time structural evolution information.
Second, the time-dependent nature makes parallel training hard to
design. Third, existing frameworks lack system support for users to
efficiently implement dynamic GNNs. In this paper, we present NeutronStream,
a framework for training dynamic GNN models. NeutronStream
abstracts the input dynamic graph into a chronologically updated
stream of events and processes the stream with an optimized sliding
window to incrementally capture the spatial-temporal dependen-
cies of events. Furthermore, NeutronStream provides a parallel
execution engine to tackle the sequential event processing chal-
lenge to achieve high performance. NeutronStream also integrates
a built-in graph storage structure that supports dynamic updates
and provides a set of easy-to-use APIs that allow users to express
their dynamic GNNs. Our experimental results demonstrate that,
compared to state-of-the-art dynamic GNN implementations, Neu-
tronStream achieves speedups ranging from 1.48X to 5.87X and an
average accuracy improvement of 3.97%.
PVLDB Reference Format:
Chaoyi Chen, Dechao Gao, Yanfeng Zhang, Qiange Wang, Zhenbo Fu,
Xuecang Zhang, Junhua Zhu, Yu Gu, Ge Yu. NeutronStream: A Dynamic
GNN Training Framework with Sliding Window for Graph Streams.
PVLDB, 17(3): 455 - 468, 2023.
doi:10.14778/3632093.3632108
This work is licensed under the Creative Commons BY-NC-ND 4.0 International
License. Visit https://creativecommons.org/licenses/by-nc-nd/4.0/ to view a copy of
this license. For any use beyond those covered by this license, obtain permission by
emailing info@vldb.org. Copyright is held by the owner/author(s). Publication rights
licensed to the VLDB Endowment.
Proceedings of the VLDB Endowment, Vol. 17, No. 3 ISSN 2150-8097.
doi:10.14778/3632093.3632108
PVLDB Artifact Availability:
The source code, data, and/or other artifacts have been made available at
https://github.com/iDC-NEU/NeutronStream.
1 INTRODUCTION
Graph Neural Networks (GNNs) [6, 18, 26, 30, 52, 60, 62, 64, 69] are
a class of deep learning models designed to learn from graph data.
GNNs have been widely adopted in various graph applications, in-
cluding social network analytics [8, 59], recommendation systems
[35, 67], and knowledge graphs [39, 49]. Most of the existing GNN
models assume that the input graph is static. However, real-world
graphs are inherently dynamic and evolving over time. Recently,
many dynamic GNN models [20, 27, 28, 34, 45, 51, 57, 63, 79] have
emerged as a promising approach for learning from dynamic graphs.
These models capture both the spatial and temporal information,
which makes them outperform traditional GNNs in real-time ap-
plications, such as real-time fraud detection [57], real-time recom-
mendation [20], and many other tasks.
In dynamic GNNs, the dynamic graph is modeled as a sequence
of time-stamped events, each event representing a graph update
operation. Each event is associated with a timestamp indicating
when it occurs and an update type, e.g., an addition/deletion of a
node/edge or an update of a node/edge’s feature. Dynamic GNNs
encode the information of each event into dynamic node embed-
dings chronologically. The training process of dynamic GNNs is
dramatically different from traditional GNNs in that it has to con-
sider the temporal dependency of events. Existing dynamic GNNs
[19, 28, 51] are implemented on top of general DNN training frame-
works, e.g., TensorFlow [5] and PyTorch [41]. However, the complex
spatial-temporal dependencies among events pose new challenges
for designing dynamic GNN frameworks.
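To make the event abstraction concrete, the following is a minimal sketch in Python of a dynamic graph represented as a chronologically ordered stream of time-stamped update events. The Event and EventType definitions here are illustrative assumptions for exposition only, not NeutronStream's actual API.

from dataclasses import dataclass
from enum import Enum, auto
from typing import List, Optional, Sequence

class EventType(Enum):
    # Update types taken from the taxonomy above: additions/deletions
    # of nodes and edges, and feature updates.
    ADD_NODE = auto()
    DELETE_NODE = auto()
    ADD_EDGE = auto()
    DELETE_EDGE = auto()
    UPDATE_NODE_FEATURE = auto()
    UPDATE_EDGE_FEATURE = auto()

@dataclass(frozen=True)
class Event:
    # One time-stamped graph update in the stream.
    timestamp: float                # when the update occurs
    etype: EventType                # which kind of graph update
    src: int                        # node id (edge source for edge events)
    dst: Optional[int] = None       # edge destination, for edge events
    feature: Optional[Sequence[float]] = None  # new value, for feature updates

def as_event_stream(events: Sequence[Event]) -> List[Event]:
    # Dynamic GNNs encode events into node embeddings chronologically,
    # so the stream must be ordered by timestamp before training.
    return sorted(events, key=lambda e: e.timestamp)

A dynamic GNN then consumes such a stream in timestamp order, updating the embeddings of the nodes touched by each event; it is precisely this chronological coupling that makes parallelizing the training loop difficult.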
First, the traditional batched training mode adopted by exist-
ing DNN frameworks may fail to capture the real-time structural
evolution information. The batched training mode periodically packs
newly arrived events into a training batch and trains the model using
these batches incrementally. However, this method forcibly cuts
off the stream and ignores the spatial locality between events in
two consecutive batches, which may lead to a decrease in model
accuracy. Figure 1(a) illustrates a motivating example on a dynamic
social network graph, which contains six consecutive interaction