ISSN 1000-1239 CN 11-1777/TP

Journal of Computer Research and Development ›› 2015, Vol. 52 ›› Issue (7): 1510-1521. doi: 10.7544/issn1000-1239.2015.20140308

• Artificial Intelligence •

An Overlapping Semantic Community Detection Algorithm Based on Local Semantic Clustering

Xin Yu1, Yang Jing1, Tang Chuheng2, Ge Siqiao2

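  1. 1(College of Computer Science and Technology, Harbin Engineering University, Harbin 150001); 2(School of Electrical Engineering and Automation, Harbin Institute of Technology, Harbin 150001) (xinyu@hrbeu.edu.cn)
  • Published: 2015-07-01
  • Funding: National Natural Science Foundation of China (61370083, 61370086, 61073043, 61073041); Specialized Research Fund for the Doctoral Program of Higher Education of China (20112304110011, 20122304110012)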

An Overlapping Semantic Community Detection Algorithm Based on Local Semantic Cluster

Xin Yu1, Yang Jing1, Tang Chuheng2, Ge Siqiao2   

  1. 1(College of Computer Science and Technology, Harbin Engineering University, Harbin 150001);2(School of Electrical Engineering and Automation, Harbin Institute of Technology, Harbin 150001)
  • Online: 2015-07-01

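Abstract: A semantic social network is a new type of complex network composed of information nodes and social relations, so traditional community detection algorithms, which mine only the adjacency relations between nodes, cannot effectively handle overlapping community detection in semantic social networks. To address this problem, an overlapping community detection algorithm for semantic social networks based on local semantic clustering is proposed. The algorithm: 1) takes the LDA (latent Dirichlet allocation) model as the semantic information model and uses Gibbs sampling to establish a quantitative mapping from node semantic information to the semantic space; 2) uses the relative entropy between the semantic coordinates of nodes as the measure of semantic similarity and builds the node similarity matrix; 3) based on the local small-world property of social networks, proposes the S-fitness model of local community structure in semantic social networks and builds the local semantic clustering (LSC) algorithm on it; 4) proposes a semantic modularity model for measuring semantic community detection results, and verifies the effectiveness and feasibility of the algorithm and the semantic modularity model through experimental analysis.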

Key words: semantic social network, overlapping community detection, LDA model, relative entropy, Gibbs sampling, local semantic clustering
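
Step 1) of the abstract maps each node's text to coordinates in a semantic (topic) space using LDA fitted by Gibbs sampling. The abstract gives no implementation details, so the following is only a minimal sketch of a standard collapsed Gibbs sampler for LDA, not the authors' code. The function name `lda_gibbs`, the hyperparameters `alpha` and `beta`, the iteration count, and the encoding of each node's text as a list of word ids are all assumptions made for illustration.

```python
import numpy as np


def lda_gibbs(docs, n_topics, vocab_size, alpha=0.1, beta=0.01, n_iter=200, seed=0):
    """Collapsed Gibbs sampling for LDA (illustrative sketch).

    docs: list of lists of word ids, one list per node (assumed encoding).
    Returns theta, an array of shape (len(docs), n_topics) whose rows are
    the per-node topic distributions ("semantic coordinates").
    """
    rng = np.random.default_rng(seed)
    n_dk = np.zeros((len(docs), n_topics))   # topic counts per document
    n_kw = np.zeros((n_topics, vocab_size))  # word counts per topic
    n_k = np.zeros(n_topics)                 # total words per topic

    # Random initial topic assignment for every word occurrence.
    z = []
    for d, doc in enumerate(docs):
        z_d = rng.integers(n_topics, size=len(doc))
        z.append(z_d)
        for w, k in zip(doc, z_d):
            n_dk[d, k] += 1
            n_kw[k, w] += 1
            n_k[k] += 1

    for _ in range(n_iter):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove the current assignment from the counts.
                k = z[d][i]
                n_dk[d, k] -= 1
                n_kw[k, w] -= 1
                n_k[k] -= 1
                # Conditional distribution of the topic given all other assignments.
                p = (n_dk[d] + alpha) * (n_kw[:, w] + beta) / (n_k + vocab_size * beta)
                k = rng.choice(n_topics, p=p / p.sum())
                # Record the newly sampled assignment.
                z[d][i] = k
                n_dk[d, k] += 1
                n_kw[k, w] += 1
                n_k[k] += 1

    # Posterior mean estimate of each document's topic distribution.
    return (n_dk + alpha) / (n_dk.sum(axis=1, keepdims=True) + n_topics * alpha)
```

Each row of the returned matrix sums to 1 and can serve as the semantic coordinate of the corresponding node in the later steps.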

Abstract: The semantic social network (SSN) is a new kind of complex network that combines information nodes with social relations, so traditional community detection algorithms, which rely only on node adjacency, cannot handle overlapping community detection in SSNs effectively. To solve this problem, an overlapping community detection method for semantic social networks is proposed based on the local semantic cluster (LSC). First, with latent Dirichlet allocation (LDA) as the semantic model, the algorithm uses Gibbs sampling to establish a quantitative mapping from the semantic information of nodes into the semantic space. Second, it builds the similarity matrix of the SSN, using the relative entropy between semantic coordinates as the measure of similarity between nodes. Third, based on the local small-world property of social networks, it proposes the S-fitness model of local community structure in SSNs and builds the LSC method on that model. Finally, it proposes a semantic modularity model for evaluating the detected community structure of an SSN, and the effectiveness and feasibility of the algorithm and of the semantic modularity model are verified by experimental analysis.

Key words: semantic social network (SSN), overlapping community structure detection, latent Dirichlet allocation (LDA), relative entropy, Gibbs sampling, local semantic cluster (LSC)
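
The second step of the abstract measures node similarity by the relative entropy (Kullback-Leibler divergence) between semantic coordinates. The sketch below illustrates one way this could be done and is not the paper's formula: the symmetrization of the divergence, the `exp(-d)` mapping from divergence to similarity, the smoothing constant `eps`, and the function names are all assumptions, since the abstract does not say how the divergence is turned into a similarity.

```python
import numpy as np


def kl_divergence(p, q, eps=1e-12):
    """Relative entropy D(p || q) between two discrete distributions."""
    p = np.asarray(p, dtype=float) + eps  # eps avoids log(0); an assumption
    q = np.asarray(q, dtype=float) + eps
    p /= p.sum()
    q /= q.sum()
    return float(np.sum(p * np.log(p / q)))


def similarity_matrix(theta):
    """Pairwise similarity between rows of theta (per-node topic distributions).

    Assumed mapping: symmetrized KL divergence d, then similarity exp(-d),
    so identical distributions score 1 and distant ones approach 0.
    """
    n = len(theta)
    sim = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            d = 0.5 * (kl_divergence(theta[i], theta[j])
                       + kl_divergence(theta[j], theta[i]))
            sim[i, j] = np.exp(-d)
    return sim


# Three hypothetical nodes: the first two share a dominant topic.
theta = [[0.7, 0.2, 0.1],
         [0.6, 0.3, 0.1],
         [0.1, 0.1, 0.8]]
sim = similarity_matrix(theta)
```

Symmetrizing is a design choice made here because relative entropy itself is not symmetric, while a similarity matrix used for clustering usually is.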
