Code and datasets for the AAAI 2021 paper "Learning to Pre-train Graph Neural Networks"
This record contains the datasets and code needed to reproduce the results of the published paper. Please refer to the live GitHub repository (https://github.com/rootlu/L2P-GNN) for more information. A copy of the GitHub repository is archived here and can be downloaded from this page.
Graph neural networks (GNNs) have become the de facto standard for representation learning on graphs, deriving effective node representations by recursively aggregating information from graph neighborhoods. While GNNs can be trained from scratch, pre-training GNNs to learn transferable knowledge for downstream tasks has recently been demonstrated to improve the state of the art. However, conventional GNN pre-training methods follow a two-step paradigm: 1) pre-training on abundant unlabeled data and 2) fine-tuning on downstream labeled data, between which there exists a significant gap due to the divergence of optimization objectives in the two steps. In this paper, we conduct an analysis to show the divergence between pre-training and fine-tuning, and to alleviate such divergence, we propose L2P-GNN, a self-supervised pre-training strategy for GNNs. The key insight is that L2P-GNN attempts to learn how to fine-tune during the pre-training process in the form of transferable prior knowledge. To encode both local and global information into the prior, L2P-GNN is further designed with a dual adaptation mechanism at both node and graph levels. Finally, we conduct a systematic empirical study on the pre-training of various GNN models, using both a public collection of protein graphs and a new compilation of bibliographic graphs for pre-training. Experimental results show that L2P-GNN is capable of learning effective and transferable prior knowledge that yields powerful representations for downstream tasks. (Code and datasets are available at https://github.com/rootlu/L2P-GNN.)
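The idea of learning how to fine-tune during pre-training can be illustrated with a generic meta-learning loop. The sketch below is not the code from the repository and does not involve GNNs or the dual node-level and graph-level adaptation; it assumes a MAML-style formulation on a hypothetical toy problem (a scalar model y = w * x, with each task defined by its own slope). All function names are illustrative. Each pre-training step simulates one fine-tuning step on a support set, then updates the shared prior w so that the adapted parameter performs well on a held-out query set.

```python
import random


def mse_grad(w, xs, ys):
    """Gradient of the mean squared error of y_hat = w * x with respect to w."""
    return 2.0 * sum(x * (w * x - y) for x, y in zip(xs, ys)) / len(xs)


def adapt(w, xs, ys, alpha):
    """Simulated fine-tuning: one gradient step on a task's support set."""
    return w - alpha * mse_grad(w, xs, ys)


def meta_step(w, tasks, alpha, beta):
    """Update the prior w so that one adaptation step works well on each task."""
    outer_grad = 0.0
    for xs_s, ys_s, xs_q, ys_q in tasks:
        w_adapted = adapt(w, xs_s, ys_s, alpha)
        # Chain rule through the inner step: d(w_adapted)/dw = 1 - 2 * alpha * mean(x^2).
        d_adapt = 1.0 - 2.0 * alpha * sum(x * x for x in xs_s) / len(xs_s)
        # Query loss is evaluated at the adapted parameter, not at the prior.
        outer_grad += mse_grad(w_adapted, xs_q, ys_q) * d_adapt
    return w - beta * outer_grad / len(tasks)


def sample_task(rng, n=5):
    """Hypothetical task: fit y = a * x, with the slope a drawn from [1, 3]."""
    a = rng.uniform(1.0, 3.0)
    xs_s = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    xs_q = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    return xs_s, [a * x for x in xs_s], xs_q, [a * x for x in xs_q]


def pretrain(steps=200, alpha=0.1, beta=0.1, seed=0):
    """Run the meta-learning loop and return the learned prior."""
    rng = random.Random(seed)
    w = 0.0
    for _ in range(steps):
        tasks = [sample_task(rng) for _ in range(4)]
        w = meta_step(w, tasks, alpha, beta)
    return w


if __name__ == "__main__":
    print("learned prior:", pretrain())
```

In this toy setting the pre-training objective and the fine-tuning procedure are optimized together, which is the gap-closing intuition the abstract describes; the method in the paper applies the idea to GNN parameters with self-supervised objectives rather than to a scalar regression.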