Distributed Aggregations for Two Queries over Uncertain Data
-
Graphical Abstract
-
Abstract
The technique of querying uncertain data is playing an increasingly important role in the fields of military affairs, finance, and telecom. Actually, a lot of uncertain data is generated in distributed systems such as wireless sensor networks, distributed Web servers and P2P systems. Collecting all the uncertain data from such a system to perform centralized queries will lead to immense communication cost, time delay and storage cost. Also, most centralized query methods are not applicable to distributed systems due to the features of uncertain data. In this paper, distributed uncertain Max (Min) and Top-k queries are defined, and designed distributed aggregation algorithms are designed based on several filtering strategies. The algorithms compute the upper bound of the candidate probability for each data entry using the uncertain ranges and probability distributions of the data, and filter the data entries as much as possible, which have no chance to influence the query result. Also, the algorithms merge and keep a relatively small amount of necessary data in the transmission, to compute the final results. The correctness of the filtering strategies has been proved. Simulations show that the algorithms can process the queries correctly and reduce communication cost efficiently with various data and system circumstances.
-
-