ISSN 1000-1239 CN 11-1777/TP

计算机研究与发展 ›› 2021, Vol. 58 ›› Issue (8): 1624-1641.doi: 10.7544/issn1000-1239.2021.20210317

所属专题: 2021人工智能前沿进展专题

• 人工智能 • 上一篇    下一篇

基于非递减时序随机游走的动态异质网络嵌入

郭佳雯1,2,白淇介1,2,林铸天1,宋春瑶1,2,袁晓洁1,2   

  1. 1(南开大学网络空间安全学院 天津 300350);2(天津市网络与数据安全技术重点实验室(南开大学) 天津 300350) (guojiawen@dbis.nankai.edu.cn)
  • 出版日期: 2021-08-01
  • 基金资助: 
    国家自然科学基金项目(61772289,U1836109,U1936205,U1936105,62077031);江苏省大数据安全与智能处理重点实验室开放基金项目(BDSIP1902)

Dynamic Heterogeneous Network Embedding Based on Non-Decreasing Temporal Random Walk

Guo Jiawen1,2, Bai Qijie1,2, Lin Zhutian1, Song Chunyao1,2, Yuan Xiaojie1,2   

  1. 1(College of Cyber Science, Nankai University, Tianjin 300350);2(Tianjin Key Laboratory of Network and Data Security Technology(Nankai University), Tianjin 300350)
  • Online: 2021-08-01
  • Supported by: 
    This work was supported by the National Natural Science Foundation of China (61772289, U1836109, U1936205, U1936105, 62077031) and the Open Fund Project of Jiangsu Key Laboratory of Big Data Security and Intelligent Processing (BDSIP1902).

摘要: 网络嵌入是将高维网络映射到低维向量空间的一种表示学习方法.目前,人们对动态同质网络嵌入和静态异质信息网络嵌入已经开展了一些研究,但动态异质网络上的嵌入研究仍然较少.如果直接应用静态网络嵌入或动态同质网络嵌入方法来解决动态异质网络嵌入问题,会由于忽略网络的动态或异质特性而导致严重的信息丢失.因此,提出一种基于时间和类别约束随机游走的动态异质网络嵌入方法TNDE.该方法引入类别约束,能够解决动态异质网络中由于异质特性带来的语义信息保留问题.不同于其他动态网络中的时序随机游走,该方法采用非递减的时间约束来增量式地进行随机游走,能够解决网络同时具备动态和异质特性而引入的强语义局部结构上的边时间戳一致的挑战,避免游走时出现时间戳陷入的问题.通过对实时变化的增量游走和嵌入学习,TNDE提供了一种高效的在线表示学习算法.在3个真实数据集上的实验结果表明:该方法在不同特性的网络中具有良好的通用性.与目前最先进方法相比,能够得到下游链路预测和节点分类任务中2.4%~92.7%的准确度提升,显著提高了嵌入质量,并在保证良好嵌入质量的前提下,缩短算法运行时间12.5%~99.91%.

关键词: 动态网络, 异质信息网络, 网络嵌入, 增量学习, 随机游走

Abstract: Network embedding is an important work as a representation learning method for mapping high-dimensional networks to low-dimensional vector spaces. Some researches have been carried out on dynamic homogeneous network embedding and static network embedding. But there are still fewer studies for embedding on dynamic heterogeneous information networks (DHINs). If we directly apply static network embedding methods or dynamic homogeneous network embedding methods to solve the DHIN embedding problem, it will lead to serious information loss due to ignoring the dynamic or heterogeneous properties of the network. Therefore, we propose a DHIN embedding method called TNDE, which is based on time- and category-constrained random walk. The method adopts category constraints to solve the problem of preserving semantic information in DHINs. Moreover, unlike the temporal random walk in other dynamic network embedding methods, TNDE uses non-decreasing temporal constraints to incrementally perform random walk. It can solve the problem that edges on local structures with strong semantics have the same timestamps due to the simultaneous existence of dynamic and heterogeneous properties in DHIN and avoid being trapped in the same timestamps during walking. TNDE provides an efficient online representation learning algorithm by adopting incremental walking and incremental representation learning for real-time changes. Experimental results on three real datasets show that TNDE has good generality in networks with different characteristics and significantly improves embedding quality, which outperforms state-of-the-art methods by 2.4%~92.7% in downstream link prediction and node classification tasks. Moreover, TNDE reduces the algorithm runtime by 12.5%~99.91% with good embedding quality.

Key words: dynamic network, heterogeneous information network, network embedding, incremental learning, random walk

中图分类号: