    Luo Sheng, Miao Duoqian, Zhang Zhifei, Zhang Yuanjian, Hu Shengdan. A Link Prediction Model Based on Hierarchical Information Granular Representation for Attributed Graphs[J]. Journal of Computer Research and Development, 2019, 56(3): 623-634. DOI: 10.7544/issn1000-1239.2019.20170961

    A Link Prediction Model Based on Hierarchical Information Granular Representation for Attributed Graphs

    • Abstract: As network graph data with node attributes accumulate, node attributes and link relations grow increasingly complex, which poses a series of challenges for link prediction in complex networks. The data from these different sources are inconsistent: the latent links induced by node attributes do not always agree with the links observed in the network topology. This inconsistency directly affects the accuracy and precision of link prediction between node pairs. To handle multi-source inconsistency effectively and fuse heterogeneous data, we draw on granular computing and model the original data through a multi-granularity representation at different levels. Based on these representations, we search for the optimal granular-layer structure and eliminate the inherent inconsistency of the data as far as possible. First, we define the representation of the data at different levels of granularity and the relations between granular layers. Second, we construct a log-likelihood statistical model for the observed link data and regularize it with constraints derived from the characteristics of the different granular layers. Finally, we train the statistical model on the multi-source data and use the learned model to predict the link probability between node pairs. Experiments show that, compared with existing link prediction models, the granular representation greatly reduces the inconsistency among multi-source data and effectively improves the accuracy of link prediction on attributed graphs.
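    The abstract states the three steps without equations, so the following is only a minimal sketch under stated assumptions, not the authors' model. It assumes that a granular layer can be imitated by bucketing numeric attributes at a given width, that each layer contributes one similarity feature per node pair, and that the log-likelihood model is a plain logistic model with an L2 penalty standing in for the granular constraints. All names (`pair_features`, `levels`, `train`, the toy `attrs` and `edges`) are hypothetical.

```python
import math
from itertools import combinations


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def pair_features(x_u, x_v, levels):
    # Assumption: a "granular layer" is a bucket width. At each layer an
    # attribute a is coarsened to floor(a / width); the feature is the
    # fraction of attributes whose granules coincide for the two nodes.
    feats = []
    for width in levels:
        same = sum(
            1 for a, b in zip(x_u, x_v)
            if math.floor(a / width) == math.floor(b / width)
        )
        feats.append(same / len(x_u))
    return feats


def log_likelihood(weights, bias, data, l2):
    # Bernoulli log-likelihood of observed links minus an L2 penalty
    # (a stand-in for the paper's granular constraints).
    ll = 0.0
    for feats, y in data:
        p = sigmoid(bias + sum(w * f for w, f in zip(weights, feats)))
        p = min(max(p, 1e-12), 1.0 - 1e-12)
        ll += y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return ll - 0.5 * l2 * sum(w * w for w in weights)


def train(data, n_feats, l2=0.1, lr=0.5, epochs=500):
    # Batch gradient ascent on the penalized log-likelihood.
    weights, bias = [0.0] * n_feats, 0.0
    n = len(data)
    for _ in range(epochs):
        grad_w = [-l2 * w for w in weights]
        grad_b = 0.0
        for feats, y in data:
            err = y - sigmoid(bias + sum(w * f for w, f in zip(weights, feats)))
            for k, f in enumerate(feats):
                grad_w[k] += err * f
            grad_b += err
        weights = [w + lr * g / n for w, g in zip(weights, grad_w)]
        bias += lr * grad_b / n
    return weights, bias


# Toy attributed graph (fabricated for illustration only).
attrs = {
    0: [0.1, 0.2], 1: [0.3, 0.4], 2: [0.6, 0.7],
    3: [2.1, 2.2], 4: [2.4, 2.3], 5: [2.7, 2.8],
}
edges = {(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5)}
levels = [0.5, 1.0]  # fine and coarse bucket widths
l2 = 0.1

data = []
for u, v in combinations(sorted(attrs), 2):
    y = 1 if (u, v) in edges else 0
    data.append((pair_features(attrs[u], attrs[v], levels), y))

initial_ll = log_likelihood([0.0] * len(levels), 0.0, data, l2)
weights, bias = train(data, len(levels), l2=l2)
final_ll = log_likelihood(weights, bias, data, l2)


def predict(u, v):
    # Link probability for a node pair under the trained sketch model.
    feats = pair_features(attrs[u], attrs[v], levels)
    return sigmoid(bias + sum(w * f for w, f in zip(weights, feats)))
```

    In this toy setting nodes 0 and 2 fall into different fine granules but the same coarse granule, so only the coarse layer supports their observed link. This loosely illustrates why a coarser layer can reconcile attribute data with topology, although the paper's actual layer construction and constraints may differ.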

       
