Abstract:
Overlapping community detection has become one of the most active research topics in network science and has attracted researchers from a wide range of backgrounds. This paper proposes a novel semi-supervised local expansion method (SLEM) for detecting overlapping communities more effectively in real-world networks. The method uses not only the topology of the network but also the attribute information of a subset of its vertices. Inspired by semi-supervised clustering with constraints in machine learning, SLEM first uses the attribute information of these vertices to derive pairwise constraints, which are then used to modify the topology of the original network. Next, a seeding method based on vertex degree centrality selects seeds as initial communities. These seeds are then expanded into local communities by a greedy strategy, which yields partially connected, close-knit communities. Finally, similarities between communities are computed using a community distance measure, and near-duplicate communities are merged. Because it exploits more network information than traditional unsupervised community detection methods, SLEM can produce communities of higher structural quality. Experimental results on both synthetic benchmark networks and real-world networks show that SLEM achieves better results than state-of-the-art local expansion methods on networks with different degrees of sparsity.
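The four stages summarized above (constraint-based rewiring, degree-based seeding, greedy expansion, merging of near-duplicates) can be illustrated with a minimal Python sketch. It is not the SLEM implementation: the abstract does not specify the concrete rules, so every choice below is an assumption made for illustration. Specifically, the sketch assumes that vertices with equal labels are must-linked and those with different labels are cannot-linked, that must-link pairs gain an edge and cannot-link pairs lose one, that seeds are high-degree vertices not adjacent to an earlier seed, that expansion maximizes the share of a community's degree that stays internal, and that Jaccard similarity stands in for the community distance measure. All function names and the threshold value are hypothetical.

```python
from itertools import combinations

def build_graph(edges):
    """Undirected graph as a dict mapping each vertex to its neighbor set."""
    g = {}
    for u, v in edges:
        g.setdefault(u, set()).add(v)
        g.setdefault(v, set()).add(u)
    return g

def constraints_from_labels(labels):
    """Assumed rule: same label -> must-link, different label -> cannot-link."""
    must, cannot = [], []
    for u, v in combinations(sorted(labels), 2):
        (must if labels[u] == labels[v] else cannot).append((u, v))
    return must, cannot

def apply_constraints(g, must_link, cannot_link):
    """Assumed rule: add an edge per must-link pair, drop one per cannot-link pair."""
    h = {v: set(nbrs) for v, nbrs in g.items()}
    for u, v in must_link:
        h.setdefault(u, set()).add(v)
        h.setdefault(v, set()).add(u)
    for u, v in cannot_link:
        h.get(u, set()).discard(v)
        h.get(v, set()).discard(u)
    return h

def select_seeds(g):
    """Visit vertices by decreasing degree; keep one if no earlier seed is a neighbor."""
    seeds = []
    for v in sorted(g, key=lambda x: (-len(g[x]), x)):
        if not any(s in g[v] for s in seeds):
            seeds.append(v)
    return seeds

def fitness(g, community):
    """Share of the community's total degree that stays inside the community."""
    total = sum(len(g[v]) for v in community)
    internal = sum(len(g[v] & community) for v in community)
    return internal / total if total else 0.0

def expand(g, seed):
    """Greedily add the neighboring vertex that most improves fitness, until none does."""
    community = {seed}
    while True:
        frontier = sorted(set().union(*(g[v] for v in community)) - community)
        if not frontier:
            return community
        best = max(frontier, key=lambda v: fitness(g, community | {v}))
        if fitness(g, community | {best}) <= fitness(g, community):
            return community
        community.add(best)

def merge_near_duplicates(communities, threshold=0.7):
    """Single pass: merge a community into an earlier one if Jaccard >= threshold."""
    merged = []
    for c in communities:
        for m in merged:
            if len(c & m) / len(c | m) >= threshold:
                m |= c
                break
        else:
            merged.append(set(c))
    return merged

def slem_sketch(g, labels, threshold=0.7):
    must, cannot = constraints_from_labels(labels)
    h = apply_constraints(g, must, cannot)
    communities = [expand(h, s) for s in select_seeds(h)]
    return merge_near_duplicates(communities, threshold)

# Toy graph: two 5-cliques bridged by vertex 5, with one missing edge (0, 1)
# and one noisy edge (2, 8) that the label-derived constraints repair.
edges = [e for e in combinations(range(5), 2) if e != (0, 1)]
edges += list(combinations(range(6, 11), 2))
edges += [(5, 3), (5, 4), (5, 6), (5, 7), (2, 8)]
communities = slem_sketch(build_graph(edges), {0: "a", 1: "a", 2: "a", 8: "b"})
print([sorted(c) for c in communities])
```

On this toy graph the sketch returns two communities, one per clique, and both contain the bridging vertex 5, which is the kind of overlap the method targets. The labels on vertices 0, 1, 2 and 8 restore the missing intra-clique edge and remove the noisy inter-clique edge before seeding begins.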