高级检索

    滑动窗口数据流上多极值查询资源共享策略研究

    Resource Sharing in Continuous Extreme Values Monitoring on Sliding Windows

    • 摘要: 为了提高在同一数据流上同时计算多个连续极值查询(MAX或MIN)时的处理能力,对查询间资源共享技术进行了研究.提出了一种称为“关键点集”的裁剪策略,系统仅需保存少量数据即可满足所有查询的需要.发掘多个查询间的相似性和可共享的计算存储资源,提出了一个多极值查询处理算法MCEQP.采用链表结构实现的该算法,当一个新数据到达时最多需要O(M+K)时间即可更新全部K个查询的结果,其中M为关键点集包含数据的个数. MCEQP采用触发器驱动的方式,只在某些特定时刻才需要计算因数据失效引起的查询结果变化,更新K个查询结果所需时间为O(K).理论分析和实验证明,对于滑动窗口数据流上的多个极值查询,MCEQP算法在降低存储开销和提高性能方面均优于现有的通用方法.

       

      Abstract: The problem of resource sharing in continuous extreme values monitoring (MAX or MIN) over sliding windows is considered. Firstly, an effective pruning technique called key points (KP) is developed to minimize the number of elements to be kept for all queries. It can be shown that on average the cardinality of KP satisfies M=O(logN), where N is the number of points contained in the widest window. Then an efficient algorithm called MCEQP (multi-continuous extreme queries processing algorithm) is proposed for continuously monitoring K queries with different sliding window widths. The main idea of MCEQP is to handle queries collectively by exploiting similarities and sharing resources such as computation and memory, which is more efficient than handling them separetely. The linklist-implemented instance of MCEQP can update all K results in O(M+K) time when a new tuple arrives, where M is the cardinality of KP. A trigger based technique is provided to avoid frequent but unnecessary process for data expirations, and only on some special time instances, O(K) time is needed to update K results. The dynamic registration and removal of queries are also supported by MCEQP in O(M+K) time. Theoretical analysis and experimental evidences show the efficiency of the proposed approach both on storage reduction and efficiency improvement.

       

    /

    返回文章
    返回