ISSN 1000-1239 CN 11-1777/TP

Journal of Computer Research and Development ›› 2016, Vol. 53 ›› Issue (8): 1729-1739.doi: 10.7544/issn1000-1239.2016.20160178

Special Issue: 2016数据挖掘前沿技术专题

Previous Articles     Next Articles

Tensor Representation Based Dynamic Outlier Detection Method in Heterogeneous Network

Liu Lu1, Zuo Wanli1,2,Peng Tao1,2   

  1. 1(College of Computer Science and Technology, Jilin University, Changchun 130012);2(Key Laboratory of Symbol Computation and Knowledge Engineering (Jilin University), Ministry of Education, Changchun 130012)
  • Online:2016-08-01

Abstract: Mining rich semantic information hidden in heterogeneous information network is an important task in data mining. The value, data distribution and generation mechanism of outliers are all different from that of normal data. It is of great significance of analyzing its generation mechanism or even eliminating outliers. Outlier detection in homogeneous information network has been studied and explored for a long time. However, few of them are aiming at dynamic outlier detection in heterogeneous networks. Many issues need to be settled. Due to the dynamics of the heterogeneous information network, normal data may become outliers over time. This paper proposes a dynamic tensor representation based outlier detection method, called TRBOutlier. It constructs tensor index tree according to the high order data represented by tensor. The features are added to direct item set and indirect item set respectively when searching the tensor index tree. Meanwhile, we describe a clustering method based on the correlation of short texts to judge whether the objects in datasets change their original clusters and then detect outliers dynamically. This model can keep the semantic relationship in heterogeneous networks as much as possible in the case of fully reducing the time and space complexity. The experimental results show that our proposed method can detect outliers dynamically in heterogeneous information network effectively and efficiently.

Key words: dynamic outlier detection, heterogeneous information network, tensor representation, tensor index tree, clustering

CLC Number: