ISSN 1000-1239 CN 11-1777/TP

Journal of Computer Research and Development ›› 2014, Vol. 51 ›› Issue (10): 2178-2186. doi: 10.7544/issn1000-1239.2014.20130538

• Artificial Intelligence •

A Protein Complex Modularity Function and Its Identification Algorithm

Guo Maozu, Dai Qiguo, Xu Liqiu, Liu Xiaoyan

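  1. (School of Computer Science and Technology, Harbin Institute of Technology, Harbin 150001) (maozuguo@hit.edu.cn)
  • Published: 2014-10-01
  • Funding: 
    National Natural Science Foundation of China (60975035, 61273291); Shanxi Province Research Funding Project for Returned Overseas Scholars (2012-008); Open Fund for Provincial and Ministerial Research Institutions of the Civil Aviation University of China (CAAC-ITRB-201305)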

On Protein Complexes Identifying Algorithm Based on the Novel Modularity Function

Guo Maozu, Dai Qiguo, Xu Liqiu, Liu Xiaoyan   

  1. (School of Computer Science and Technology, Harbin Institute of Technology, Harbin 150001)
  • Online: 2014-10-01

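Summary: Protein complexes are important for studying cellular activity. As new biological experimental techniques continue to emerge, a large number of protein-protein interaction networks have been produced. Identifying protein complexes by clustering protein-protein interaction networks is a current research focus. However, the performance of most existing protein complex identification algorithms is unsatisfactory. To address this, a protein complex modularity function (PQ) is proposed, and on that basis the BMM algorithm (based on protein complexes modularity function for merging modules) is proposed. BMM first identifies some dense subgraphs in the network as initial modules, then merges these initial modules according to the PQ function, and finally obtains protein complexes of relatively high quality. The identified complexes are compared with two datasets of known protein complexes, and the results show that BMM has good identification performance. In addition, compared with other recent identification algorithms, BMM achieves higher identification accuracy.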

Keywords: protein complex, protein-protein interaction, protein complex modularity function, initial module, BMM algorithm

Abstract: Proteins often interact with each other to form complexes that carry out their biological functions, so identifying these complexes is important for understanding cellular activity. In recent years, with the rapid development of new biological experimental technologies, a large number of protein-protein interaction (PPI) networks have been generated. Identifying protein complexes by clustering proteins in PPI networks is a hot spot in current bioinformatics research. Over the last decade, many clustering methods, based mainly on graph partitioning or on community detection techniques from social network analysis, have been proposed to recognize protein complexes in PPI networks. However, the performance of most previously developed detection methods is not ideal. In particular, they cannot identify overlapping complexes, although biological studies have found that protein complexes often overlap. Therefore, this paper proposes a modularity function (Q function) for protein complexes, called the PQ function, to identify overlapping complexes in PPI networks. Based on PQ, a new algorithm for identifying protein complexes, BMM (the algorithm based on protein complexes modularity function for merging modules), is presented. First, BMM finds some dense subgraphs to serve as initial modules. Then, these initial modules are merged by maximizing the modularity function PQ. Finally, several high-quality protein complexes are obtained. Comparing these complexes with two datasets of known protein complexes suggests that BMM performs well. In addition, BMM is more accurate than other recent algorithms.
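
The two-stage flow described in the abstract (find dense initial modules, then merge them under a scoring function) can be illustrated with a minimal sketch. The abstract does not define PQ or the criterion for a dense subgraph, so this sketch makes two loudly stated assumptions: triangles stand in for the initial modules, and plain edge density stands in for the merge score. It is not the PQ function and not the BMM algorithm itself, and all function names and the `min_density` parameter are hypothetical.

```python
from itertools import combinations


def build_adjacency(edges):
    """Build an undirected adjacency map {node: set(neighbours)}."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def density(nodes, adj):
    """Fraction of possible edges inside `nodes` that are present.

    ASSUMPTION: placeholder merge score, NOT the paper's PQ function.
    """
    n = len(nodes)
    if n < 2:
        return 0.0
    present = sum(1 for u, v in combinations(nodes, 2) if v in adj[u])
    return present / (n * (n - 1) / 2)


def seed_modules(adj):
    """ASSUMPTION: use every triangle as an initial 'dense' module."""
    seeds = set()
    for u in adj:
        for v, w in combinations(sorted(adj[u]), 2):
            if w in adj[v]:
                seeds.add(frozenset((u, v, w)))
    return seeds


def merge_modules(adj, min_density=0.7):
    """Greedily merge overlapping modules while the union stays dense.

    Modules that are never merged may still share proteins, so the
    result can contain overlapping modules.
    """
    modules = seed_modules(adj)
    while True:
        best = None
        # Sort for a deterministic scan order over module pairs.
        for a, b in combinations(sorted(modules, key=sorted), 2):
            if not (a & b):
                continue  # only consider modules that share a node
            score = density(a | b, adj)
            if score >= min_density and (best is None or score > best[0]):
                best = (score, a, b)
        if best is None:
            return modules
        _, a, b = best
        modules -= {a, b}
        modules.add(a | b)
```

On a toy network made of a four-node clique and a triangle that share one node, the sketch returns two modules that overlap at the shared node, which is the kind of overlapping output the abstract says earlier methods could not produce.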

Key words: protein complex, protein-protein interaction (PPI), protein complexes modularity function, initial module, based on protein complexes modularity function for merging modules (BMM) algorithm
