ISSN 1000-1239 CN 11-1777/TP

Journal of Computer Research and Development, 2017, Vol. 54, Issue (8): 1795-1803. doi: 10.7544/issn1000-1239.2017.20170172

Special Issue: 2017 Special Topic on Frontier Advances in Artificial Intelligence


Link Prediction Method Based on Clustering and Decision Tree

Yang Niya1, Peng Tao1,2, Liu Lu1   

  1. College of Computer Science and Technology, Jilin University, Changchun 130012
  2. Key Laboratory of Symbol Computation and Knowledge Engineering (Jilin University), Ministry of Education, Changchun 130012
  • Online:2017-08-01

Abstract: Link prediction is one of the fundamental problems in data mining. Because of network complexity and data diversity, link prediction across different types of data in heterogeneous networks has become increasingly complicated. To address link prediction in bi-typed heterogeneous information networks, this paper proposes a method based on clustering and decision trees, called CDTLinks. Objects of one type are treated as the features of objects of the other type, and the two types are then clustered separately. Three heuristic rules are proposed to construct decision trees for bi-typed heterogeneous networks, and the branch of the tree with the highest information gain is selected. Finally, whether a link exists between two nodes is judged from the clustering result and the decision tree model. In addition, we define the concept of potential link nodes and introduce the number of layers, which reduces running time and improves accuracy. The proposed CDTLinks method is validated on the DBLP and AMiner datasets. The experimental results show that the CDTLinks model can perform link prediction effectively in bi-typed heterogeneous networks.
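
The abstract states that the tree branch with the highest information gain is selected, but it does not spell out the features or the three heuristic rules. The following is only a minimal sketch of the standard information-gain computation, not the CDTLinks implementation. The feature names (`same_cluster`, `shared_neighbor`), the toy candidate pairs, and the helper names are all hypothetical and chosen purely for illustration.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    counts = Counter(labels)
    return -sum((c / total) * log2(c / total) for c in counts.values())


def information_gain(rows, labels, feature):
    """Entropy reduction obtained by splitting the samples on `feature`."""
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(row[feature], []).append(label)
    total = len(labels)
    # Weighted average entropy of the subsets produced by the split.
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


def best_feature(rows, labels, features):
    """Return the feature whose split yields the highest information gain."""
    return max(features, key=lambda f: information_gain(rows, labels, f))


# Hypothetical candidate node pairs (assumed features, not from the paper).
rows = [
    {"same_cluster": True, "shared_neighbor": True},
    {"same_cluster": True, "shared_neighbor": False},
    {"same_cluster": False, "shared_neighbor": True},
    {"same_cluster": False, "shared_neighbor": False},
]
labels = [1, 1, 0, 0]  # 1 = link exists, 0 = no link

chosen = best_feature(rows, labels, ["same_cluster", "shared_neighbor"])
```

In this toy data the link label coincides with `same_cluster`, so that feature removes all uncertainty while `shared_neighbor` removes none; real data would rarely separate this cleanly.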

Key words: link prediction, clustering, decision tree, heterogeneous information network, heuristic rules
