ISSN 1000-1239 CN 11-1777/TP

计算机研究与发展 ›› 2016, Vol. 53 ›› Issue (2): 443-448.doi: 10.7544/issn1000-1239.2016.20148040

• 系统结构 • 上一篇    下一篇

基于频繁序列挖掘的预取算法研究与实现

王芳1,2,王培群2,朱春节1   

  1. 1(武汉光电国家实验室(华中科技大学) 武汉 430074); 2(华中科技大学计算机科学与技术学院 武汉 430074) (wangpeiqun@hust.edu.cn)
  • 出版日期: 2016-02-01
  • 基金资助: 
    国家“八六三”高技术研究发展计划基金项目(2013AA013203);华中科技大学自主创新研究基金项目(2014QN010)

Study and Implementation of Frequent Sequences Mining Based Prefetching Algorithm

Wang Fang1,2, Wang Peiqun2, Zhu Chunjie1   

  1. 1(Wuhan National Laboratory for Optoetectronics (Huazhong University of Science and Technology), Wuhan 430074); 2(School of Computer Science and Technology, Huazhong University of Science and Technology, Wuhan 430074)
  • Online: 2016-02-01

摘要: 预取作为一种提升存储系统性能的有效手段被广泛使用,然而传统的预取算法大多基于顺序性访问特征的探测,这使得它们在非顺序数据访问环境下很难奏效,甚至可能因为预取准确率较低而对存储系统的性能带来负面影响.而基于频繁序列挖掘的预取算法则能够通过分析数据的访问行为找出潜在规律,从而能在非顺序访问模式下也取得一定的性能提升.同时,为了应对某些缓存受限的应用场景,如嵌入式系统,预取算法通过提高分析的准确率减少预取可能对缓存带来的不利影响.新提出的预取算法基于频繁序列挖掘技术,并使用字典树组织预取规则,通过多步匹配和子树分割技术精细地控制规则的使用,提升预取的准确率,从而使得预取算法能够有效提升存储系统的性能.

关键词: 频繁序列挖掘, 预取算法, 字典树, 多步匹配, 子树分割

Abstract: Prefetching technology is widely used as an efficient means to improve the performance of storage systems. However, traditional prefetching algorithms are mostly based on detecting sequential access features, which makes them hard to work in the environment with less or no sequential access features. Whats worse, the storage system may even suffer from negative effects with poor prefetching accuracy. Whereas the proposed prefetching algorithm based on frequent sequences mining can make some contributions to the storage system in such environment by analyzing the behavior of the data accessing to find the potential rules. Meanwhile, in some application scenarios where the cache capacity may be limited, such as the embedded system, the proposed prefetching algorithm improves the prefetching accuracy to avoid some adverse impacts which may be caused by prefetching. The new proposed prefetching algorithm is based on the frequent sequences mining technology, and the prefetching rules derived from the mined frequent sequences are organized in a Trie tree. To improve the accuracy of the prefetching, the multistep matching technology and the subtree partitioning technology are introduced, which can subtly control the using of prefetching rules, so that the prefetching algorithm with relatively high prefetching accuracy can efficiently improve the performance of the storage system.

Key words: frequent sequences mining, prefetching algorithm, Trie tree, multistep matching, subtree partitioning

中图分类号: