ISSN 1000-1239 CN 11-1777/TP

计算机研究与发展 ›› 2015, Vol. 52 ›› Issue (12): 2824-2833.doi: 10.7544/issn1000-1239.2015.20140801

• 人工智能 • 上一篇    下一篇



  1. (天津大学计算机科学与技术学院 天津 300072) (天津市认知计算与应用重点实验室 天津 300072) (
  • 出版日期: 2015-12-01
  • 基金资助: 

A Random Walk Based Iterative Weighted Algorithm for Sub-Graph Query

Zhang Xiaochi, Yu Hua, Gong Xiujun   

  1. (School of Computer Science and Technology, Tianjin University, Tianjin 300072) (Tianjin Key Laboratory of Cognitive Computing and Application, Tianjin 300072)
  • Online: 2015-12-01

摘要: 作为经典的NP完全问题之一,子图查询算法近年来在社交网络、生物分子网络等复杂系统分析中引起研究人员的极大关注.结点相似性计算和目标图约简是子图查询算法中提高查询准确率和降低计算复杂性的2种常用手段.针对复杂生物分子网络之间的子图查询问题,提出了一种基于半Markov随机游走的迭代加权子图查询算法.在结点相似性计算中,设计了基于半Markov游走模型的集成结点本身相似性、结构相似性及邻居结点相似性的综合度量方法;同时,在目标图约简过程中,通过迭代递减目标图中低相似性结点,以降低目标图的规模.对多个真实蛋白质网络查询的实验结果表明,算法在精度和时间复杂性方面都有明显提高.

关键词: 半Markov游走模型, 迭代加权, 蛋白质相互作用, 生物分子网络, 子图查询

Abstract: Nowadays, technological development in measuring molecular interactions has led to an increasing number of large-scale biological molecular networks. Identifying conserved and stable functional modules from such networks helps not only to disclose the function of inner components, but also to understand their relationships systematically in complex systems. As one of classical NP-complete problems, the sub-graph query problem is gaining research efforts in analyzing such behaviors from the fields of social networks, biological networks, and so on. Calculating node similarities and reducing the sizes of target graphs are two common means for improving query precisions and reducing computational complexity in the study of sub-graph algorithms. For the problem of querying sub-graphs among complex protein interaction networks, this paper presents a sub-graph query algorithm based on semi-Markov random walk model. A comprehensive similarity measurement based on semi-Markov random walk model is designed to integrate the similarities of nodes, structures and their neighbors. Meanwhile, an iterative procedure is applied to reduce the size of targeted graph by removing nodes in a cluster with lower similarities by calculating the global correspondence score. The experimental results on multiple real protein query networks demonstrate that the proposed algorithm improves its performance in both query precisions and computational complexity.

Key words: semi-Markov random walk model, iterative weighting, protein-protein interaction, biological molecule network, sub-graph query