Distance-Based Outlier Detection on Uncertain Data
-
Abstract
Outlier detection is a valuable technique in many applications, such as network intrusion detection and event detection in wireless sensor networks (WSNs). It has been well studied on deterministic databases, but it is a new task on emerging uncertain databases. The uncertain data model describes many real applications, such as wireless sensor networks, data integration, and data mining, more faithfully, which further enhances the feasibility of such applications. In this paper, a new definition of an outlier on uncertain data is given. Based on it, efficient filtering approaches for outlier detection are proposed: a basic filtering approach, called b-RFA, and an improved filtering approach, called o-RFA. In addition, a probability approach, called DPA, is proposed to efficiently detect outliers on uncertain databases. b-RFA uses a property of non-outliers to reduce the number of detections. o-RFA improves on b-RFA by mining and exploiting the data distribution. DPA identifies a recursion rule in the probability computation and greatly improves the efficiency of detection for a single data object. Finally, experimental results show that the proposed approaches efficiently prune the candidates, reduce the corresponding search space, and improve the performance of query processing on uncertain data.
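To give a sense of the kind of recursion a probability approach such as DPA can exploit, the following is a minimal sketch and not the paper's algorithm. It rests on assumptions that the abstract does not state: each neighbor within the distance threshold of an object exists independently with a known probability, and the object is treated as an outlier when the probability of having at least k existing neighbors falls below a threshold. The names `prob_fewer_than_k` and `is_outlier`, and the parameters `k` and `threshold`, are hypothetical.

```python
from typing import Sequence


def prob_fewer_than_k(probs: Sequence[float], k: int) -> float:
    """Probability that fewer than k of the independent neighbors exist.

    Assumption (not from the source): neighbors exist independently,
    each with its own existence probability in `probs`.
    """
    if k <= 0:
        return 0.0
    # dp[j] = probability that exactly j of the neighbors seen so far exist.
    # Only counts below k are tracked; the rest of the mass is dropped.
    dp = [0.0] * k
    dp[0] = 1.0
    for p in probs:
        # Update from high j to low j so dp[j - 1] is still the old value.
        for j in range(k - 1, 0, -1):
            dp[j] = dp[j] * (1.0 - p) + dp[j - 1] * p
        dp[0] *= 1.0 - p
    return sum(dp)


def is_outlier(probs: Sequence[float], k: int, threshold: float) -> bool:
    """Hypothetical rule: outlier if P(at least k neighbors exist) < threshold."""
    return 1.0 - prob_fewer_than_k(probs, k) < threshold
```

The recursion costs O(n·k) for n candidate neighbors, whereas enumerating every combination of existing and missing neighbors grows exponentially in n.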