TY - JOUR

T1 - LCS graph kernel based on Wasserstein distance in longest common subsequence metric space

AU - Huang, Jianming

AU - Fang, Zhongxi

AU - Kasai, Hiroyuki

N1 - Publisher Copyright: © 2021 Elsevier B.V.

PY - 2021/12

Y1 - 2021/12

N2 - For graph learning tasks, many existing methods use a message-passing mechanism in which vertex features are updated iteratively by aggregating neighbor information. This strategy provides an efficient means of graph feature extraction, but the features obtained after many iterations might contain too much information from other vertices and tend to become similar to each other, which makes their representations less expressive. Learning graphs using paths, on the other hand, can be less adversely affected by this problem because it does not involve all vertex neighbors. However, most path-based methods can only compare paths of the same length, which might cause information loss. To resolve this difficulty, we propose a new graph kernel based on a Longest Common Subsequence (LCS) similarity. Moreover, we found that the widely used R-convolution framework is unsuitable for path-based graph kernels because the huge number of comparisons between dissimilar paths might degrade the graph distance calculation. Therefore, we propose a novel metric space built on the proposed LCS-based similarity, and compute a new Wasserstein-based graph distance in this metric space, which places more emphasis on comparisons between similar paths. Furthermore, to reduce the computational cost, we propose an adjacent point merging operation to sparsify point clouds in the metric space.

KW - Graph classification

KW - Graph kernel

KW - Optimal transport

UR - http://www.scopus.com/inward/record.url?scp=85112809800&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85112809800&partnerID=8YFLogxK

U2 - 10.1016/j.sigpro.2021.108281

DO - 10.1016/j.sigpro.2021.108281

M3 - Article

AN - SCOPUS:85112809800

VL - 189

JO - Signal Processing

JF - Signal Processing

SN - 0165-1684

M1 - 108281

ER -