Data and code for: Topic-aware Heterogeneous Graph Neural Network for Link Prediction

posted on 28.09.2022, 07:01 authored by XU, Siyong; YANG, Cheng; SHI, Chuan; FANG, Yuan; GUO, Yuxin; YANG, Tianchi; ZHANG, Luhao; HU, Maodi

This record contains the data and code for the CIKM 2021 paper “Topic-aware Heterogeneous Graph Neural Network for Link Prediction”.

Heterogeneous graphs (HGs), consisting of multiple types of nodes and links, can characterize a variety of real-world complex systems. Recently, heterogeneous graph neural networks (HGNNs), a powerful graph embedding method that aggregates heterogeneous structure and attribute information, have attracted considerable attention. Although HGNNs can capture rich semantics that reveal different aspects of nodes, they still operate at a coarse-grained level that simply exploits structural characteristics. In fact, the rich unstructured text content of nodes also carries latent but more fine-grained semantics arising from multi-facet topic-aware factors, which fundamentally explain why nodes of different types connect and form a specific heterogeneous structure. However, little effort has been devoted to factorizing them.

In this paper, we propose a Topic-aware Heterogeneous Graph Neural Network, named THGNN, to hierarchically mine topic-aware semantics for learning multi-facet node representations for link prediction in HGs. Specifically, our model applies an alternating two-step aggregation mechanism, consisting of intra-metapath decomposition and inter-metapath mergence, which distinctively aggregates rich heterogeneous information according to the inferred topic-aware factors and preserves hierarchical semantics. Furthermore, a topic prior guidance module is designed to maintain the quality of the multi-facet topic-aware embeddings, relying on global knowledge from the unstructured text content in HGs. It helps improve both performance and interpretability. Experimental results on three real-world HGs demonstrate that our proposed model outperforms state-of-the-art methods on the link prediction task and show the potential interpretability of the learnt multi-facet topic-aware representations.
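As a rough illustration of the alternating two-step aggregation described in the abstract, the Python sketch below splits each node embedding into K facet vectors, aggregates neighbors per facet within one metapath, and then merges the per-metapath results facet by facet. It is not the authors' THGNN code. The function names, the dot-product attention, the plain averaging across metapaths, and the sigmoid link scorer are all assumptions chosen to keep the example small; the topic prior guidance module is not modeled.

```python
import math

def softmax(xs):
    # Numerically stable softmax over a list of floats.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def intra_metapath(target, neighbors):
    """Aggregate neighbors reached via ONE metapath, separately per facet.

    target:    list of K facet vectors for the target node.
    neighbors: list of nodes, each a list of K facet vectors.
    Attention weights are a softmax over facet-wise dot products
    (an assumption made for this sketch).
    """
    out = []
    for k, t_k in enumerate(target):
        weights = softmax([dot(t_k, n[k]) for n in neighbors])
        dim = len(t_k)
        out.append([sum(w * n[k][d] for w, n in zip(weights, neighbors))
                    for d in range(dim)])
    return out

def inter_metapath(per_metapath):
    """Merge per-metapath results facet by facet (simple mean, assumed)."""
    num = len(per_metapath)
    num_facets = len(per_metapath[0])
    return [[sum(mp[k][d] for mp in per_metapath) / num
             for d in range(len(per_metapath[0][k]))]
            for k in range(num_facets)]

def link_score(u, v):
    """Score a candidate link as the sigmoid of summed facet dot products."""
    s = sum(dot(a, b) for a, b in zip(u, v))
    return 1.0 / (1.0 + math.exp(-s))

# Toy example: K = 2 facets of dimension 2, two hypothetical metapaths.
target = [[1.0, 0.0], [0.0, 1.0]]
via_path_a = [[[1.0, 0.0], [0.0, 1.0]], [[0.0, 1.0], [1.0, 0.0]]]
via_path_b = [[[0.5, 0.5], [0.5, 0.5]]]
merged = inter_metapath([intra_metapath(target, via_path_a),
                         intra_metapath(target, via_path_b)])
print(link_score(merged, target))
```

Keeping the facets separate through both steps is what allows each facet to be inspected on its own afterwards; a real model would use learned projections and attention parameters in place of the fixed operations assumed here.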



Confidential or personally identifiable information

I confirm that the uploaded data has no confidential or personally identifiable information.