Abstract:
With the accumulation of the network graph data coupled with node attributes, the relations between node attributes and node linkages become more and more complex, which brings a lot of challenges to the task of the link prediction in complex network. The main reason is the inconsistency existing in the different source data, that is, the relations between the latent linkages which are implied by the node attributes and the observed linkages from network topological structure, respectively. This phenomenon directly affects the correctness and accuracy of link predictions. In order to effectively deal with multi-source data inconsistency and fuse the heterogeneous data, with the idea of granular computing and data multi-layer granular representation, we model the original data at different levels of granular representation. According to the data granular representation, we ultimately eliminate data inherent inconsistencies by finding the optimal granular structure. In this paper, we firstly define the data granular representation and the relation between different level granular; Then, we construct a log-likelihood model of the data, and place a lot of constraints decided by the granular relations to regularize the model; At last, we use the trained model to perform the link probability between nodes. Experiments show that, multi-source data can ultimately reduce the inconsistency by granular representation, and the statistic model regulated by these granular relations outperforms the state-of-the-art methods, and effectively improves the accuracy of the link prediction in the attributed graph.