Abstract:
Protein-protein interaction (PPI) networks are widely present in complex biological networks. The topological features of PPI networks play an important role in analyzing the functional modules in networks. Some graph clustering methods have been successfully used to complex networks to detect protein complexes in PPI networks. Traditional graph clustering algorithms in PPI analyzing methods primarily focus on hard clustering for a network, while, nowadays soft clustering algorithms to find overlapped clusters have become one of the hotspots of current research. Existing soft clustering algorithms pay less attention on small-scale non-dense clusters, while some small-scale non-dense clusters often have important biological meaning in PPI networks. A measuring method of the association strength of edges is developed based on node neighborhoods in networks, and then a soft clustering algorithm named flow-simulation graph clustering (F-GCL) on the basis of flow simulation is presented to detect complexes in a PPI network. Experiments show that the proposed soft clustering algorithm F-GCL can simultaneously find out overlapping clusters and small-scale non-dense clusters without improving the running time. Compared with MCODE(molecular complex detection), MCL(Markov clustering), RNSC(restricted neighborhood search clustering) and CPM(clique percolation method) algorithms on six Saccharomyces cerevisiae PPI networks, the algorithm F-GCL shows considerable or better performance on three evaluating indicators: F-measure, Accuracy and Separation.