高级检索

    DCuckoo:基于片内摘要的高性能散列表

    DCuckoo: An Efficient Hash Table with On-Chip Summary

    • 摘要: 散列表(Hash table)由于其支持高效的记录更新与检索操作,在计算机相关的各个领域中有着广泛的应用.但散列表有2个明显的缺点:冲突和低效的内存利用.最小完美散列使用N个位置存储N条记录,解决了冲突和空间效率的问题,但该算法不支持增量的更新.目标是设计一种高效的散列表,能够支持高速查询、最坏情况可以保证的高速更新、高效的空间使用以及动态的容量改变.结合 Cuckoo 散列和 d-left 散列的实现,提出了一个新的散列表设计方案——DCuckoo.DCuckoo 使用多级子表并应用了 Cuckoo 散列中移动已有元素的机制以提高装载率,且只保留了最末级子表的指针以减少空间浪费.为了进一步优化查询性能,DCuckoo 在片内内存中使用指纹和位图作为摘要,在查询时先匹配指纹,以减少对片外内存的访问次数.对 DCuckoo 进行了一系列实验,与其他5种散列表进行比较,发现 DCuckoo 达到了设计目标,并且在各项指标上均好于已有的散列表设计.

       

      Abstract: Hash tables are extensively used in many computer-related areas because of their efficiency in query and insertion operations. However, Hash tables have two disadvantages: collisions and memory inefficiency. To solve these two disadvantages, minimal perfect Hash table uses N locations to store N incoming elements. However, MPHT doesn't support incremental updates. Therefore, in this paper, combining Cuckoo hashing and d-left hashing, we propose a novel Hash table architecture called DCuckoo, which ensures fast query speed, fast update speed in worst cases, efficient utilization of memory and dynamic capacity change. In DCuckoo, multiple sub-tables and Cuckoo hashing's mechanism of transferring existing elements are used to improve the load factor. Pointers except for ones in the last sub-table are eliminated for less wasted space. Also, in order to optimize the query performance, fingerprints and bitmaps are used as a summary in on-chip memory to reduce off-chip memory accesses. The bucket will be probed only if the corresponding fingerprint is matched in on-chip memory. We conduct a series of experiments to compare the performance of DCuckoo and other five Hash table schemas. Results demonstrate that DCuckoo eliminates shortcomings of both Cuckoo hashing and d-left hashing, hence DCuckoo achieves the four design goals.

       

    /

    返回文章
    返回