    Continuous Dynamic Skyline Queries over Data Stream

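    • Abstract: Skyline queries compute the points of a large data set that are optimal with respect to multiple criteria. Skyline computation is one of the most basic query operations over data streams and matters to many online applications, especially in mobile computing environments, network monitoring, communication networks, and sensor networks. Unlike most traditional skyline research, this work studies constrained skyline and dynamic skyline computation over data streams. Tuples are stored in a grid index, and the GBDS algorithm is proposed to compute and maintain the dynamic skyline. Defining an influence region for each query minimizes the number of tuples that must be processed when a tuple arrives or expires. Theoretical analysis and experimental results demonstrate the effectiveness of the proposed method.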

       

      Abstract: Skyline queries retrieve interesting points from a large data set according to multiple criteria. Skyline computation is an essential query over data streams and is important for many online applications, including mobile environments, network monitoring, communication, sensor networks, and stock market trading. The problem of skyline computation has attracted considerable research attention. Unlike most popular skyline processing methods, this paper focuses on constrained skyline and dynamic skyline processing over data streams. Instead of computing the skyline over the whole data set, this kind of skyline query only needs to process part of the data set, and there may be thousands of such queries in the system. To handle the random additions and deletions of tuples over a data stream, we employ a grid-based index to store the tuples and put forward an algorithm that computes and maintains the skyline set on top of it. Taking advantage of the grid index, we define an influence area for every query to minimize the number of cells that need to be processed when new tuples arrive and old tuples expire. Only tuples in the cells that belong to the influence area are processed. Tuples outside the influence area are ignored, which saves CPU time. Theoretical analysis and experimental evidence show the efficiency of the proposed approaches.
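The idea of restricting work to an influence area can be illustrated with a small Python sketch. This is not the paper's algorithm, whose details the abstract does not give. It assumes the commonly used definition of dynamic dominance (tuple a dominates tuple b with respect to query point q if |a_i - q_i| <= |b_i - q_i| in every dimension and is strictly smaller in at least one). It also assumes a simple pruning rule: a grid cell lies outside the influence area when some current skyline tuple dominates the cell's per-dimension lower bound on distance to q. The names `GridSkyline`, `cell_min_dist`, and the uniform cell `width` are hypothetical, and tuples are assumed to be distinct.

```python
from collections import defaultdict


def transform(p, q):
    """Map p into the query's distance space: per-dimension |p_i - q_i|."""
    return tuple(abs(a - b) for a, b in zip(p, q))


def dominates(a, b):
    """True if a is no worse than b everywhere and strictly better somewhere."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def cell_of(p, width):
    """Grid cell (integer coordinates) that contains point p."""
    return tuple(int(x // width) for x in p)


def cell_min_dist(cell, q, width):
    """Lower bound on the transformed coordinates of any point in the cell."""
    bound = []
    for c, qi in zip(cell, q):
        lo, hi = c * width, (c + 1) * width
        bound.append(0 if lo <= qi <= hi else min(abs(lo - qi), abs(hi - qi)))
    return tuple(bound)


class GridSkyline:
    """Maintains the dynamic skyline of one query point over a grid (sketch)."""

    def __init__(self, q, width):
        self.q, self.width = q, width
        self.grid = defaultdict(set)  # cell -> tuples currently in the window
        self.skyline = set()

    def _influences(self, cell):
        # Assumed rule: a cell is outside the influence area if some skyline
        # tuple dominates the cell's lower bound, hence every tuple inside it.
        bound = cell_min_dist(cell, self.q, self.width)
        return not any(dominates(transform(s, self.q), bound) for s in self.skyline)

    def insert(self, p):
        cell = cell_of(p, self.width)
        self.grid[cell].add(p)
        if not self._influences(cell):
            return  # arrival outside the influence area: no skyline work
        tp = transform(p, self.q)
        if any(dominates(transform(s, self.q), tp) for s in self.skyline):
            return
        self.skyline = {s for s in self.skyline
                        if not dominates(tp, transform(s, self.q))}
        self.skyline.add(p)

    def expire(self, p):
        self.grid[cell_of(p, self.width)].discard(p)
        if p not in self.skyline:
            return  # expiry of a non-skyline tuple cannot change the result
        self.skyline.discard(p)
        # Only tuples in cells inside the (now larger) influence area can
        # enter the skyline; all other cells are skipped.
        pool = set(self.skyline)
        for cell, pts in self.grid.items():
            if pts and self._influences(cell):
                pool |= pts
        self.skyline = {t for t in pool
                        if not any(dominates(transform(u, self.q), transform(t, self.q))
                                   for u in pool)}


g = GridSkyline(q=(5, 5), width=2)
for p in [(4, 4), (9, 9), (5, 8)]:
    g.insert(p)
print(sorted(g.skyline))  # [(4, 4), (5, 8)]
```

In this run the arrival of (9, 9) is dropped after a single cell-level test, because (4, 4) already dominates every point of its cell. A system serving thousands of queries would keep one such skyline per query over a shared grid; the sketch keeps a single query for brevity.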

       
