Abstract:
Many stream-oriented systems are inherently geographically distributed, which makes distributed processing a promising route towards a more effective and adaptive data stream processing model. Aggregation over data streams is an important class of continuous operators in distributed processing. Because aggregation queries must be computed continuously and their results transmitted continuously, this model incurs significant communication overhead, and continually transmitting a large number of rapid data streams can be impractical or expensive. To reduce this overhead, a new approximate incremental aggregation technique is proposed, with provable guarantees on the approximation error. A new structure called the VSB-tree is introduced, which can effectively incorporate and store the aggregates of all child stations and can incrementally transmit changes in the aggregate value to the parent station. Theoretical analysis and experimental results show the feasibility and effectiveness of the algorithm.
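The general idea of approximate incremental aggregation can be illustrated with a minimal sketch. This is not the VSB-tree, whose structure the abstract does not describe. It assumes a SUM aggregate and a hypothetical per-station threshold `epsilon`, and the `Station` class and all of its names are invented for illustration. Each station forwards only the accumulated change in its aggregate to its parent, and only once that change exceeds `epsilon`, so under these assumptions the error at the root is at most `epsilon` times the number of non-root stations.

```python
class Station:
    """Hypothetical station keeping a running SUM over itself and its children."""

    def __init__(self, epsilon, parent=None):
        self.epsilon = epsilon      # assumed per-station error threshold
        self.parent = parent        # None for the root station
        self.current = 0.0          # aggregate as known at this station
        self.last_sent = 0.0        # aggregate value last reported upward
        self.messages_sent = 0

    def apply(self, delta):
        """Apply a local update or a change reported by a child."""
        self.current += delta
        drift = self.current - self.last_sent
        # Transmit only the change, and only when it exceeds the threshold.
        if self.parent is not None and abs(drift) > self.epsilon:
            self.last_sent = self.current
            self.messages_sent += 1
            self.parent.apply(drift)


# Usage: two child stations reporting to one root.
root = Station(epsilon=5)
a = Station(epsilon=5, parent=root)
b = Station(epsilon=5, parent=root)

for _ in range(10):
    a.apply(2)   # ten updates of 2 at station a
    b.apply(3)   # ten updates of 3 at station b

true_total = 10 * 2 + 10 * 3
total_messages = a.messages_sent + b.messages_sent
print(root.current, true_total, total_messages)
```

In this toy run the root's estimate stays within the assumed bound of `2 * epsilon` of the true sum while using fewer messages than the 20 that forwarding every update would require. The trade-off between `epsilon` and message count is the design choice such a scheme exposes.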