高级检索
    张 慧, 郑吉平, 韩秋廷. BTreeU-Topk:基于二叉树的不确定数据上的Top-k查询算法[J]. 计算机研究与发展, 2012, 49(10): 2095-2105.
    引用本文: 张 慧, 郑吉平, 韩秋廷. BTreeU-Topk:基于二叉树的不确定数据上的Top-k查询算法[J]. 计算机研究与发展, 2012, 49(10): 2095-2105.
    Zhang Hui, Zheng Jiping, Han Qiuting. BTreeU-Topk: Binary-Tree Based Top-k Query Algorithms on Uncertain Data[J]. Journal of Computer Research and Development, 2012, 49(10): 2095-2105.
    Citation: Zhang Hui, Zheng Jiping, Han Qiuting. BTreeU-Topk: Binary-Tree Based Top-k Query Algorithms on Uncertain Data[J]. Journal of Computer Research and Development, 2012, 49(10): 2095-2105.

    BTreeU-Topk:基于二叉树的不确定数据上的Top-k查询算法

    BTreeU-Topk: Binary-Tree Based Top-k Query Algorithms on Uncertain Data

    • 摘要: 应用需求的发展衍生各种查询类型,Top-k查询是交互环境下一种重要查询类型.由于数据的不确定性,传统数据上的Top-k查询技术和方法不能直接应用于不确定数据查询.在已有不确定数据上Top-k查询算法的基础上,提出基于二叉树的不确定数据上Top-k查询算法BTreeU-Topk;为了提高算法执行效率,对二叉树进行修剪操作进而提出BTreeOPTU-Topk和BTreePU-Topk算法.实验结果表明,BTreeU-Topk,BTreeOPTU-Topk以及BTreePU-Topk算法在不同数据分布以及k值增长时均优于现有算法.

       

      Abstract: Development of applications in uncertain data management area derives various kinds of queries. Among these, Top-k query has attracted considerable research attention which is an important query type and returns the most important k objects based on some characteristic values. Because of the uncertainty of application data, traditional Top-k methods and techniques could not be directly applied to query on uncertain data. In this paper, a new Top-k algorithm based on binary-tree named BTreeU-Topk is provided under existing Top-k algorithms on uncertain data. BTreeU-Topk algorithm utilizes binary-tree to simply representation of possible worlds space. To further improve the efficiency of the proposed algorithm, pruning technique is adopted and the BTreeOPTU-Topk and BTreePU-Topk algorithms are proposed. BTreeOPTU-Topk algorithm adopts optimized queue to prune branches not in final results while BTreePU-Topk algorithm utilizes pruning rule. Experimental results show that the proposed BTreeU-Topk, BTreeOPTU-Topk and BTreePU-Topk algorithms are superior to existing algorithms in two aspects: different data distributions and increase of k values. In addition, the quantity of maintained states is reduced by our proposed algorithms.

       

    /

    返回文章
    返回