SMU Research Data Repository (RDR)
Files (2):
THGNN-main.zip (archive, 9.25 MB)
DBLP_preprocess.zip (archive, 559.14 MB)

Data and code for: Topic-aware Heterogeneous Graph Neural Network for Link Prediction

dataset
posted on 2022-09-28, 07:01, authored by XU, Siyong; YANG, Cheng; SHI, Chuan; FANG, Yuan; GUO, Yuxin; YANG, Tianchi; ZHANG, Luhao; HU, Maodi

This record contains the data and code for the CIKM 2021 paper “Topic-aware Heterogeneous Graph Neural Network for Link Prediction”.


Heterogeneous graphs (HGs), consisting of multiple types of nodes and links, can characterize a variety of real-world complex systems. Recently, heterogeneous graph neural networks (HGNNs), a powerful graph embedding method for aggregating heterogeneous structure and attribute information, have attracted considerable attention. Although HGNNs can capture rich semantics that reveal different aspects of nodes, they still operate at a coarse-grained level that simply exploits structural characteristics. In fact, the rich unstructured text content of nodes also carries latent but more fine-grained semantics arising from multi-facet topic-aware factors, which fundamentally explain why nodes of different types connect and form a specific heterogeneous structure. However, little effort has been devoted to factorizing them.

In this paper, we propose a Topic-aware Heterogeneous Graph Neural Network, named THGNN, to hierarchically mine topic-aware semantics for learning multi-facet node representations for link prediction in HGs. Specifically, our model applies an alternating two-step aggregation mechanism consisting of intra-metapath decomposition and inter-metapath mergence, which aggregates rich heterogeneous information according to the inferential topic-aware factors while preserving hierarchical semantics. Furthermore, a topic prior guidance module is designed to maintain the quality of the multi-facet topic-aware embeddings by relying on global knowledge from the unstructured text content in HGs, which helps improve both performance and interpretability. Experimental results on three real-world HGs demonstrate that the proposed model outperforms state-of-the-art methods on the link prediction task and show the potential interpretability of the learnt multi-facet topic-aware representations.
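
To make the two-step aggregation idea more concrete, the following is a minimal toy sketch in plain Python. It is not the authors' implementation (that is in THGNN-main.zip) and makes several assumptions not stated in this record: each node holds K facet vectors (one per topic factor), neighbours reached along one metapath are combined per facet with dot-product softmax weights, and the per-metapath results are merged by a plain mean. The function names `intra_metapath` and `inter_metapath` and the metapath labels are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def intra_metapath(target, neighbors):
    """Aggregate the neighbours of one metapath separately for each facet.

    target:    list of K facet vectors for the target node
    neighbors: list of nodes, each a list of K facet vectors
    Assumption: weights are a softmax over facet-wise dot products.
    """
    out = []
    for k, t_k in enumerate(target):
        weights = softmax([dot(t_k, n[k]) for n in neighbors])
        dim = len(t_k)
        out.append([
            sum(w * n[k][d] for w, n in zip(weights, neighbors))
            for d in range(dim)
        ])
    return out


def inter_metapath(per_metapath):
    """Merge the per-metapath results facet by facet.

    Assumption: a plain mean stands in for whatever merge the model uses.
    """
    num = len(per_metapath)
    num_facets = len(per_metapath[0])
    dim = len(per_metapath[0][0])
    return [
        [sum(mp[k][d] for mp in per_metapath) / num for d in range(dim)]
        for k in range(num_facets)
    ]


# Toy data: K = 2 facets, 2-dimensional vectors, two hypothetical metapaths.
target = [[1.0, 0.0], [0.0, 1.0]]
metapath_neighbors = {
    "metapath_1": [[[1.0, 0.0], [0.0, 2.0]], [[1.0, 0.0], [0.0, 2.0]]],
    "metapath_2": [[[3.0, 0.0], [0.0, 4.0]]],
}

per_metapath = [intra_metapath(target, nbrs) for nbrs in metapath_neighbors.values()]
merged = inter_metapath(per_metapath)  # K facet vectors for the target node
```

The point of the sketch is only the shape of the computation: facets are never mixed during aggregation, so the output still has one vector per topic factor.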

 

Confidential or personally identifiable information

  • I confirm that the uploaded data has no confidential or personally identifiable information.

    School of Computing and Information Systems
