Abstract:
Nowadays, technological development in measuring molecular interactions has led to an increasing number of large-scale biological molecular networks. Identifying conserved and stable functional modules from such networks helps not only to disclose the function of inner components, but also to understand their relationships systematically in complex systems. As one of classical NP-complete problems, the sub-graph query problem is gaining research efforts in analyzing such behaviors from the fields of social networks, biological networks, and so on. Calculating node similarities and reducing the sizes of target graphs are two common means for improving query precisions and reducing computational complexity in the study of sub-graph algorithms. For the problem of querying sub-graphs among complex protein interaction networks, this paper presents a sub-graph query algorithm based on semi-Markov random walk model. A comprehensive similarity measurement based on semi-Markov random walk model is designed to integrate the similarities of nodes, structures and their neighbors. Meanwhile, an iterative procedure is applied to reduce the size of targeted graph by removing nodes in a cluster with lower similarities by calculating the global correspondence score. The experimental results on multiple real protein query networks demonstrate that the proposed algorithm improves its performance in both query precisions and computational complexity.