Citation: Qin Zheyun, Lu Xiankai, Xi Xiaoming, Ren Chunxiao, Nie Xiushan, Yin Yilong. Self-Supervised Graph Topology-Imbalance Learning Based on Random Walk Paths[J]. Journal of Computer Research and Development, 2025, 62(4): 863-875. DOI: 10.7544/issn1000-1239.202330813
Topological imbalance in graphs, which arises from the non-uniform and asymmetric distribution of nodes in the topological space, significantly hampers the performance of graph neural networks. Existing research focuses predominantly on labeled nodes and pays relatively little attention to unlabeled ones. To address this challenge, we propose a self-supervised learning method based on random walk paths that tackles the problems accompanying topological imbalance, including the constraints of the homophily assumption, topological distance decay, and annotation attenuation. Our method introduces multi-hop paths within the subgraph neighborhood of each node to comprehensively capture the relationships and local features among nodes. First, through an aggregation strategy between paths, we learn both homophilous and heterophilous features within multi-hop paths, thereby preserving not only the nodes' original attributes but also their initial structural connections in the random walk sequences. Second, by combining a multi-path subgraph-sample aggregation strategy with a structured contrastive loss, we maximize the consistency of the intrinsic features of local subgraphs belonging to the same node, enhancing the expressive power of the learned graph representations. Experimental results validate the effectiveness and generalization of our method across various imbalance scenarios. This research provides a new approach and perspective for addressing topology-imbalance problems.
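The two components described above, path aggregation over random walks and a contrastive objective between path-based subgraph views of the same node, can be illustrated with a minimal PyTorch sketch. This is not the authors' implementation: all names (sample_walks, PathEncoder, nt_xent) and parameters (num_walks, walk_len, tau) are hypothetical, a plain GRU stands in for the path aggregator, and an InfoNCE-style loss stands in for the structured contrastive loss.

```python
# Illustrative sketch only, under the assumptions stated above.
import torch
import torch.nn as nn
import torch.nn.functional as F


def sample_walks(adj_list, start, num_walks=4, walk_len=5):
    """Uniform random walks from `start`.

    adj_list: list of neighbor-index lists, one per node.
    Returns a (num_walks, walk_len) LongTensor of node indices.
    """
    walks = []
    for _ in range(num_walks):
        node, walk = start, [start]
        for _ in range(walk_len - 1):
            nbrs = adj_list[node]
            if nbrs:  # stay in place at isolated nodes
                node = nbrs[torch.randint(len(nbrs), (1,)).item()]
            walk.append(node)
        walks.append(walk)
    return torch.tensor(walks, dtype=torch.long)


class PathEncoder(nn.Module):
    """Encodes each walk with a GRU, then pools across walks into one subgraph embedding."""

    def __init__(self, in_dim, hid_dim):
        super().__init__()
        self.gru = nn.GRU(in_dim, hid_dim, batch_first=True)
        self.proj = nn.Linear(hid_dim, hid_dim)

    def forward(self, x, walks):          # x: (N, in_dim), walks: (W, L)
        seq = x[walks]                    # (W, L, in_dim): features along each path
        _, h = self.gru(seq)              # h: (1, W, hid_dim), last hidden state per path
        z = h.squeeze(0).mean(dim=0)      # aggregate the W path encodings
        return self.proj(z)               # embedding of the walk-induced local subgraph


def nt_xent(z1, z2, tau=0.5):
    """Contrastive loss: two walk-sampled views of the same node are positives,
    other nodes in the batch serve as negatives. z1, z2: (B, hid_dim)."""
    z1, z2 = F.normalize(z1, dim=-1), F.normalize(z2, dim=-1)
    logits = z1 @ z2.t() / tau
    labels = torch.arange(z1.size(0), device=z1.device)
    return F.cross_entropy(logits, labels)
```

In this sketch, two independently sampled walk sets around the same node would be encoded by PathEncoder to produce the two views passed to nt_xent, so the objective pulls together local subgraph representations of the same node without using any labels.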