Incremental Aggregation over Distributed Data Streams

Wang Yongli (王永利); Xu Hongbing (徐宏炳); Dong Yisheng (董逸生); Qian Jiangbo (钱江波); Liu Xuejun (刘学军)

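1(Department of Computer Science and Technology, Southeast University, Nanjing 210096) 2(Department of Common Computer Teaching, Jiamusi University, Jiamusi 154007) (wyl_seu@126.com)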

Publication history
- Published: 2006-03-14

Algorithms for Incremental Aggregation over Distributed Data Stream

1(Department of Computer Science and Technology, Southeast University, Nanjing 210096) 2(Department of Common Computer Teaching, Jiamusi University, Jiamusi 154007)

Abstract

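Abstract: Distributed processing is the mainstream technique in data stream management, and aggregation is an important type of continuous query in distributed data stream systems. In a distributed data stream environment, aggregate values must be computed continuously and transmitted continuously across the network, so the communication overhead of the system is very high. To effectively reduce the volume of stream data transmitted over the network, an approximate incremental aggregation algorithm is proposed (approximately incremental aggregate over distributed data stream, AIADDS). The algorithm computes the aggregate value at each site in the network incrementally and sends the change in the aggregate to other sites only when that change exceeds a given threshold, which significantly reduces the amount of data transmitted over the network. The VSB-tree, the core of the algorithm, efficiently merges and stores the aggregate values received from child sites and incrementally transmits the change in the aggregate to its parent site. Theoretical analysis and experimental results show that the algorithm is effective.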
- data stream /
- incremental aggregation query /
- distributed system /
- VSB-tree
Abstract: Many stream-oriented systems are inherently geographically distributed, so distributed processing is a promising route towards a more effective and adaptive data stream processing model. Aggregation over data streams is an important class of continuous operators for distributed processing. Because aggregation queries must be computed continuously and their results transmitted continuously, this model incurs significant communication overhead, and the continual transmission of a large number of rapid data streams can be impractical or expensive. To reduce this overhead, a new approximate incremental aggregation technique is proposed, with provable guarantees on the approximation error. A new structure called the VSB-tree is introduced, which can effectively merge and store the aggregate values of all child stations and incrementally transmit changes in the aggregate value to the parent station. Theoretical analysis and experimental results show the feasibility and effectiveness of the algorithm.
- data stream /
- incremental aggregation query /
- distributed system /
- VSB-tree
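
The threshold-based incremental transmission summarized in the abstract can be illustrated with a minimal sketch. It assumes a SUM aggregate and a simple parent/child hierarchy. The `Site` class, its fields, and the per-site `threshold` are hypothetical names chosen for illustration. The sketch does not reproduce the VSB-tree or the AIADDS algorithm itself, whose details the abstract does not give.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Optional


@dataclass
class Site:
    """One site in a hypothetical aggregation hierarchy (not the VSB-tree)."""

    name: str
    threshold: float                 # assumed per-site error budget
    parent: Optional[Site] = None
    local_sum: float = 0.0           # exact running SUM seen at this site
    reported_sum: float = 0.0        # value the parent currently knows about
    messages_sent: int = 0

    def update(self, delta: float) -> None:
        """Apply a change (a new tuple or a child's delta) incrementally."""
        self.local_sum += delta
        pending = self.local_sum - self.reported_sum
        # Forward only the accumulated change, and only once it exceeds
        # the threshold; smaller changes stay local.
        if self.parent is not None and abs(pending) > self.threshold:
            self.reported_sum = self.local_sum
            self.messages_sent += 1
            self.parent.update(pending)


# Usage: two leaf sites report to one root.
root = Site("root", threshold=0.0)
a = Site("a", threshold=5.0, parent=root)
b = Site("b", threshold=5.0, parent=root)

for value in [2, 2, 2, 2]:
    a.update(value)
for value in [1, 1, 1]:
    b.update(value)

true_sum = a.local_sum + b.local_sum
print(root.local_sum, true_sum, a.messages_sent, b.messages_sent)
```

In this sketch each site withholds at most `threshold` of unreported change, so the root's value differs from the exact sum by at most the sum of its children's thresholds, while seven stream updates cause only one message to the root.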