ISSN 1000-1239 CN 11-1777/TP

Journal of Computer Research and Development ›› 2015, Vol. 52 ›› Issue (2): 499-511. doi: 10.7544/issn1000-1239.2015.20131246

• Information Processing •

A Semantic Overlapping Community Detecting Algorithm in Social Networks Based on Random Walk

Xin Yu1, Yang Jing1, Xie Zhiqiang2

  1(College of Computer Science and Technology, Harbin Engineering University, Harbin 150001); 2(College of Computer Science and Technology, Harbin University of Science and Technology, Harbin 150080) (yangjing@hrbeu.edu.cn)
  • Published: 2015-02-01
  • Online: 2015-02-01
  • Supported by:
    the National Natural Science Foundation of China (61370083, 61370086, 61073043, 61073041); the Specialized Research Fund for the Doctoral Program of Higher Education of China (20112304110011, 20122304110012)

Abstract: A semantic social network (SSN) is a new kind of complex network composed of information nodes and social relationships, so overlapping community detection in semantic social networks is a new direction for traditional community detection research. To address this problem, an overlapping community detection algorithm for semantic social networks is proposed based on random walk. First, the algorithm builds a semantic space on the basis of latent Dirichlet allocation (LDA), which gives a quantitative mapping from each node's semantic information into the semantic space. Second, taking the information entropy of a node in the semantic space as the weight of its semantic information and the node's degree distribution ratio as the weight of its relationships, it establishes a node semantic influence model and a weighted adjacency matrix for the semantic social network. Third, with the semantic influence model and the weighted adjacency matrix as parameters, it proposes an improved random walk strategy for overlapping community detection in semantic social networks, together with a semantic modularity model that can measure the results of semantic community detection. Finally, experimental analysis verifies the effectiveness and feasibility of the proposed algorithm and of the semantic modularity model.

Key words: random walk, community detection, semantic social network, latent Dirichlet allocation (LDA), semantic modularity
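
The abstract names the ingredients of the method (node entropy in the semantic space, a degree ratio, a weighted adjacency matrix, a random walk) but does not give their formulas. The following is therefore only a rough sketch of how such quantities might be computed, not the authors' algorithm. Everything specific here is an assumption made for illustration: the toy graph, the hand-written topic distributions standing in for LDA output, the use of Shannon entropy, the definition of the degree ratio as degree over total degree, and the choice of cosine similarity as the edge weight.

```python
import math

def entropy(p):
    """Shannon entropy (in nats) of a discrete distribution."""
    return -sum(x * math.log(x) for x in p if x > 0)

def degree_ratio(adj):
    """Each node's degree divided by the total degree (assumed definition)."""
    degrees = [sum(row) for row in adj]
    total = sum(degrees)
    return [d / total for d in degrees]

def cosine(p, q):
    """Cosine similarity of two topic vectors."""
    dot = sum(a * b for a, b in zip(p, q))
    norm = math.sqrt(sum(a * a for a in p)) * math.sqrt(sum(b * b for b in q))
    return dot / norm if norm else 0.0

def transition_matrix(adj, topics):
    """Row-normalised weighted adjacency matrix.

    The edge weight is the topic similarity of the two endpoints; this
    weighting is an assumption, not the weighting defined in the paper.
    """
    n = len(adj)
    rows = []
    for i in range(n):
        row = [adj[i][j] * cosine(topics[i], topics[j]) for j in range(n)]
        s = sum(row)
        rows.append([w / s for w in row] if s else row)  # zero rows stay zero
    return rows

def walk_step(dist, P):
    """One random walk step: multiply the distribution by P."""
    n = len(P)
    return [sum(dist[i] * P[i][j] for i in range(n)) for j in range(n)]

# Toy undirected graph with 4 nodes (hypothetical data).
adj = [
    [0, 1, 1, 0],
    [1, 0, 1, 0],
    [1, 1, 0, 1],
    [0, 0, 1, 0],
]
# Hypothetical per-node topic distributions, standing in for LDA output.
topics = [[0.9, 0.1], [0.8, 0.2], [0.5, 0.5], [0.1, 0.9]]

node_entropy = [entropy(t) for t in topics]   # semantic weight per node
node_ratio = degree_ratio(adj)                # relationship weight per node
P = transition_matrix(adj, topics)

dist = [1.0, 0.0, 0.0, 0.0]                   # walker starts at node 0
for _ in range(3):
    dist = walk_step(dist, P)
print(node_entropy, node_ratio, dist)
```

In this sketch, node 2 has the most mixed topic distribution and so the highest entropy, and it also has the largest degree ratio. How the paper combines these two weights into its semantic influence model, and how the walk is turned into overlapping communities, is defined in the full text and is not reproduced here.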
