暂无图片
暂无图片
暂无图片
暂无图片
暂无图片
VLDB2022_Query Driven-Graph Neural Networks for Community Search:From Non-Attributed, Attributed, to Interactive Attributed_腾讯云数据库.pdf
188
13页
3次
2023-08-29
免费下载
ery Driven-Graph Neural Networks for Community Search:
From Non-Aributed, Aributed, to Interactive Aributed
Yuli Jiang
1
, Yu Rong
2
, Hong Cheng
3
, Xin Huang
4
, Kangfei Zhao
5
, Junzhou Huang
6
The Chinese University of Hong Kong,
Tencent AI Lab,
Hong Kong Baptist University, China
{
1
yljiang,
3
hcheng,
5
kfzhao}@se.cuhk.edu.hk,
2
yu.rong@hotmail.com,
4
xinhuang@comp.hkbu.edu.hk,
5
zkf1105@gmail.com,
6
jzhuang@uta.edu
ABSTRACT
Given one or more query vertices, Community Search (CS) aims
to nd densely intra-connected and loosely inter-connected struc-
tures containing query vertices. Attributed Community Search
(ACS), a related problem, is more challenging since it nds com-
munities with both cohesive structures and homogeneous vertex
attributes. However, most methods for the CS task rely on inexible
pre-dened structures and studies for ACS treat each attribute inde-
pendently. Moreover, the most popular ACS strategies decompose
ACS into two separate sub-problems, i.e., the CS task and subse-
quent attribute ltering task. However, in real-world graphs, the
community structure and the vertex attributes are closely corre-
lated to each other. This correlation is vital for the ACS problem. In
this vein, we argue that the separation strategy cannot fully capture
the correlation between structure and attributes simultaneously
and it would compromise the nal performance.
In this paper, we propose Graph Neural Network (GNN) models
for both CS and ACS problems, i.e., Query Driven-GNN (
QD
-
GNN
)
and Attributed Query Driven-GNN (
AQD
-
GNN
). In
QD
-
GNN
, we
combine the local query-dependent structure and global graph
embedding. In order to extend
QD
-
GNN
to handle attributes, we
model vertex attributes as a bipartite graph and capture the relation
between attributes by constructing GNNs on this bipartite graph.
With a Feature Fusion operator,
AQD
-
GNN
processes the structure
and attribute simultaneously and predicts communities according
to each attributed query. Experiments on real-world graphs with
ground-truth communities demonstrate that the proposed mod-
els outperform existing CS and ACS algorithms in terms of both
eciency and eectiveness. More recently, an interactive setting
for CS is proposed that allows users to adjust the predicted com-
munities. We further verify our approaches under the interactive
setting and extend to the attributed context. Our method achieves
2
.
37% and 6
.
29% improvements in F1-score than the state-of-the-art
model without attributes and with attributes respectively.
PVLDB Reference Format:
Yuli Jiang, Yu Rong, Hong Cheng, Xin Huang, Kangfei Zhao, Junzhou
Huang. Query Driven-Graph Neural Networks for Community Search:
From Non-Attributed, Attributed, to Interactive Attributed. PVLDB, 15(6):
XXX-XXX, 2022.
doi:10.14778/3514061.3514070
This work is licensed under the Creative Commons BY-NC-ND 4.0 International
License. Visit https://creativecommons.org/licenses/by-nc-nd/4.0/ to view a copy of
this license. For any use beyond those covered by this license, obtain permission by
emailing info@vldb.org. Copyright is held by the owner/author(s). Publication rights
licensed to the VLDB Endowment.
Proceedings of the VLDB Endowment, Vol. 15, No. 6 ISSN 2150-8097.
doi:10.14778/3514061.3514070
1 INTRODUCTION
Graph is an essential data structure to represent entities and their
relationships, e.g., social networks, protein-protein interaction net-
works, web graphs, and knowledge graphs, to name a few. Com-
munity, a subgraph of densely intra-connected and loosely inter-
connected structure, naturally exists as a functional module in real-
world graphs. Community Search (CS) [
7
,
11
,
19
,
21
,
25
,
37
] is a vital
application in graph analytics. Concretely, given any query vertices,
CS aims to nd a vertex set with cohesive structure according to the
query, i.e., query-dependent communities. Attributed Community
Search (ACS), a related but more challenging problem, has attracted
a lot of attention recently [
10
,
11
,
22
,
23
,
25
]. Given any query vertex
and attribute set, ACS aims at nding query-dependent commu-
nities with homogeneous attributes, which means the community
members share similar attributes with the query attributes.
For the CS and ACS problems, existing studies suer from two
serious limitations, that is, structure inflexibility and aribute
irrelevance
. Structure inexibility refers to the problem that most
community search models are based on a pre-dened subgraph
pattern, such as
𝑘
-core [
8
,
10
,
37
],
𝑘
-truss [
1
,
21
,
22
],
𝑘
-clique [
7
,
44
],
and
𝑘
-edge connected component (
ECC
) [
6
,
19
]. The pre-dened
subgraph pattern imposes a very rigid requirement on topological
structure of communities, which may not perfectly hold in real-
world communities. Attribute irrelevance means existing models
treat each attribute independently [
10
,
22
]. However, in real graphs,
vertex attributes are not independent of each other. Ignoring such
implicit relations would harm the quality of queried communities.
Figure 1 depicts a toy example illustrating the limitation of exist-
ing algorithms. The faculty hierarchy is a tree-like structure from
the faculty dean, department chairman to the professors in each
department. Using existing methods based on pre-dened subgraph
patterns, we can only nd a 1-core community of vertex 6 in
𝐻
1
and a 2-truss community in
𝐻
2
, which are the entire graph. These
𝑘
-core [
37
] and
𝑘
-truss [
21
] patterns cannot discover the tree-like
department communities owing to the structure inexibility. For
attributed community search, when querying the community of
vertex 6 and attribute “ML”, current methods [
10
,
22
] nd the com-
munity
𝐻
3
since they ignore the implicit relations between “ML”,
“DL” and “CV”. Thus, existing studies suer from these two inade-
quacies on structure and attribute respectively.
Moreover, for the ACS problem, existing studies [
10
,
22
] usually
adopt a two-stage strategy which rst nds the candidate com-
munity by considering the topological structure only, and then
performs a ltering on the candidate community by considering
the attribute similarity. The two-stage strategy treats the structure
cohesiveness and attribute homogeneity separately. But there is usu-
ally a correlation between structure and attribute, for instance, in
arXiv:2104.03583v2 [cs.DB] 23 Mar 2022
Figure 1: An attributed graph depicting a faculty hierarchy
with two departments: Dept.CS and Dept.IE. Attributes rep-
resent research topics. For vertex
6
, there is a ground-truth
Dept.IE community (shown in blue) as vertices
6
8
are close
and work on similar topics. On the right, in response to
queries on each arrow, there are three result communities
found by existing algorithms, which are quite dierent from
the ground-truth community.
protein-protein interaction networks, proteins with similar func-
tions (i.e., attributes) are more likely to interact with each other
[
39
]. Independently dealing with the structure and attribute would
harm the quality of queried communities.
Inspired by the success of Graph Neural Network (GNN) [
27
]
on combining attribute and structure in many graph problems,
Gao et al. [
14
] proposed a GNN-based framework, ICS-GNN, to
solve the community search problem in an interactive fashion (i.e.,
users can adjust predicted communities during the query process).
Specically, it enhances the non-attributed queries by the GNN
model [
27
] which exploits the information from the existing vertex
attributes in graphs. However, for every query, ICS-GNN re-trains
the whole model. This re-training process is time-consuming and
hinders its applications in real-world scenarios, especially for the
online query case. On the other hand, even though ICS-GNN makes
use of the attributes to enhance the community search performance,
its model architecture cannot accept the query attributes as input.
Therefore, ICS-GNN cannot be extended to support interactive
attributed community search easily.
To address the above limitations, in this paper, we propose GNN-
based models for both CS and ACS problems. For the CS problem,
to address the structure inexibility issue, we design a two-branch
model: Query Driven-GNN (
QD
-
GNN
) to encode the information
from both the query and graph. Concretely,
QD
-
GNN
contains
two encoders, Query Encoder and Graph Encoder. Query Encoder
encodes the structural information from query vertices and focuses
on modeling the local topology around the queries. Graph Encoder
combines the global structure and attributes to learn the query-
independent node embeddings. As a learning-based model,
QD
-
GNN
can search communities without imposing any restriction on
the community structure. Furthermore, we design an additional
Attribute Encoder to extend
QD
-
GNN
to support the attributed com-
munity search. Attribute Encoder exploits a node-attribute bipartite
graph to model the attribute relations and can encode more mean-
ingful information from the attribute space. To process structure
and attribute simultaneously, we employ a Feature Fusion compo-
nent to fuse the information from dierent encoders and make the
nal output. Furthermore, we design a new query framework which
detaches the model training from the online query stage. There-
fore, our framework does not need the time-consuming re-training
phase for online query applications.
To summarize, we make the following contributions.
We propose a Query Driven-GNN model (
QD
-
GNN
) for commu-
nity search, which combines the local query-dependent structure
and global node embeddings. Given any query,
QD
-
GNN
only
needs a model inference step and avoids the time-consuming
re-training.
To the best of our knowledge, this is the rst work that proposes a
GNN model for the attributed community search problem, called
Attributed Query Driven-GNN (
AQD
-
GNN
). Our novel learning
framework extends GNN into ACS through a node-attribute bi-
partite graph, and learns the community information from both
the local structure and similar attributes of queries.
We conduct extensive experiments on real-world data sets with
ground-truth communities for performance evaluation. Exper-
iments demonstrate that our model signicantly outperforms
state-of-the-art methods in terms of community quality with
only 4.31 milliseconds average response time.
We apply
AQD
-
GNN
to the interactive community search prob-
lem and extend it into the attributed context. Experiments show
that our models can improve the performance of ICS-GNN [
14
]
in both non-attributed and attributed manner with 2
.
37% and
6.29% improvements in F1-score respectively.
Roadmap
. The rest of the paper is organized as follows. Section 2
discusses related work. Section 3 gives some preliminaries. Section
4 presents the common framework of the proposed models. Section
5 introduces the
QD
-
GNN
model for community search problem,
and Section 6 describes the
AQD
-
GNN
model for attributed com-
munity search. We present the experimental results in Section 7
and conclude the paper in Section 8.
2 RELATED WORK
Our study is closely related to community search (CS) and graph
neural network (GNN).
Community Search
. The problem of CS [
37
] is to nd densely
connected communities containing the query vertices. A compre-
hensive survey of CS models and existing approaches can be found
in [
11
,
25
]. Various community models have been proposed based on
dierent cohesive graph patterns, including
𝑘
-core [
8
,
37
],
𝑘
-truss
[
1
,
21
,
24
],
𝑘
-clique [
7
,
44
], and
𝑘
-edge connected component (
ECC
)
[
6
,
19
]. These pre-dened cohesive metrics are inexible and can
be too loose (e.g.,
𝑘
-core) or too tight (e.g.,
𝑘
-clique) to capture the
topology structure of communities. If the real-world communities
do not follow any of the above graph patterns, these models would
fail to discover the true communities. A learning-based model ICS-
GNN [
14
] has recently been proposed for interactive community
search. ICS-GNN rst nds a candidate subgraph starting from
query vertices, then learns the node embeddings through applying
GNN model on subgraph, and nally employs a BFS based algo-
rithm to select the
𝑘
-sized community with maximum GNN scores.
ICS-GNN does not support attributed community search as the
query only involves vertices but no attributes. It also needs to re-
train the entire model for each query, which is costly for this online
query problem.
For attributed community search,
ACQ
[
10
] and
ATC
[
22
] have
been proposed, which aim to discover communities that contain
query vertices and have similar attributes to the query attributes.
ACQ
is based on
𝑘
-core and nds communities with the maximum
of 13
免费下载
【版权声明】本文为墨天轮用户原创内容,转载时必须标注文档的来源(墨天轮),文档链接,文档作者等基本信息,否则作者和墨天轮有权追究责任。如果您发现墨天轮中有涉嫌抄袭或者侵权的内容,欢迎发送邮件至:contact@modb.pro进行举报,并提供相关证据,一经查实,墨天轮将立刻删除相关内容。

评论

关注
最新上传
暂无内容,敬请期待...
下载排行榜
Top250 周榜 月榜