
    数据网格中请求呈现分组特性的副本管理策略研究

    Replication Strategies in Data Grid Systems with Clustered Demands

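    • Abstract (translated from the Chinese): In data grids, data usage patterns affect system performance. According to test results from some real systems, data requests exhibit clustering properties. To study the relationship between request distribution and replica distribution when data requests exhibit clustering properties, a model of replication strategies in data grids is first defined, and the optimal strategy that minimizes average access latency under clustered requests is then studied. The model is solved using the Lagrange multiplier method and the bisection method, yielding an optimal downloading replication strategy for the clustered request pattern. Simulation experiments compare the performance of the optimal strategy with that of the uniform replication, proportional replication, square-root replication, and LRU caching strategies. The results show that the optimal strategy requires the least wide area network bandwidth and has the lowest average access latency.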

       

      Abstract: In data grid systems, data usage patterns play an important role in system performance. According to recent traces from real systems, data requests and replica distributions exhibit clustering properties. This paper considers the relationship between request distribution and replica distribution in a data grid where requests exhibit clustering properties. First, a formal model of replication strategies in a federated data grid system is given; the performance metrics include cumulative hit ratio and average access latency. The paper then investigates the optimal way to replicate data so as to minimize average access latency when requests exhibit clustering properties. Under this objective, the more popular a file is in a subgrid, the more replicas of it should be created in that subgrid; furthermore, when requests are uniformly distributed across the system, replicas should be uniformly distributed as well. The optimization model is solved using the Lagrange multiplier method together with the bisection method, yielding an optimal downloading replication strategy for clustered demands. The performance of this strategy is compared through simulation with that of the uniform, proportional, and square-root replication strategies and the LRU caching strategy. The simulation results validate the effectiveness of the optimal strategy: compared with these popular strategies, it requires the least wide area network bandwidth and achieves the lowest average access latency.
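The abstract names the solution technique (Lagrange multipliers plus bisection) but does not state the model itself. The sketch below illustrates only that technique on a deliberately simpler, unclustered toy problem; it is not the paper's model or its optimal strategy. Everything here is an assumption made for illustration: the latency proxy `sum(q_i / r_i)`, the per-file bounds `r_min` and `r_max`, the Zipf-like popularity, and all function names. Under these assumptions the stationarity condition of the Lagrangian gives `r_i = sqrt(q_i / lam)`, clipped to the bounds, and the multiplier `lam` is found by bisection so that the replica counts meet the storage budget.

```python
import math

def cost(q, r):
    """Assumed latency proxy: sum of q_i / r_i (not the paper's metric)."""
    return sum(qi / ri for qi, ri in zip(q, r))

def uniform(q, budget):
    """Same number of replicas for every file."""
    return [budget / len(q)] * len(q)

def proportional(q, budget):
    """Replicas proportional to request probability."""
    total = sum(q)
    return [budget * qi / total for qi in q]

def square_root(q, budget):
    """Replicas proportional to the square root of request probability."""
    total = sum(math.sqrt(qi) for qi in q)
    return [budget * math.sqrt(qi) / total for qi in q]

def lagrange_bisection(q, budget, r_min=1.0, r_max=10.0, iters=200):
    """Minimise cost(q, r) subject to sum(r) == budget, r_min <= r_i <= r_max.

    Assumes len(q) * r_min <= budget <= len(q) * r_max and all q_i > 0.
    """
    def replicas(lam):
        # Stationary point of the Lagrangian, clipped to the bounds.
        return [min(r_max, max(r_min, math.sqrt(qi / lam))) for qi in q]

    # The total replica count decreases as lam grows, so bisect on lam.
    lo, hi = 1e-12, max(q) / r_min ** 2  # at hi, every r_i equals r_min
    for _ in range(iters):
        mid = (lo + hi) / 2
        if sum(replicas(mid)) > budget:
            lo = mid  # too many replicas: increase the multiplier
        else:
            hi = mid
    return replicas((lo + hi) / 2)

def zipf(n):
    """Zipf-like request probabilities over n files (hypothetical workload)."""
    weights = [1.0 / k for k in range(1, n + 1)]
    total = sum(weights)
    return [w / total for w in weights]

if __name__ == "__main__":
    q, budget = zipf(5), 15.0
    for name, alloc in [("uniform", uniform), ("proportional", proportional),
                        ("square root", square_root),
                        ("bisection", lagrange_bisection)]:
        r = alloc(q, budget)
        print(f"{name:>12}: cost={cost(q, r):.4f}")
```

In this toy setting, uniform and proportional allocation give the same cost, and the bisection result coincides with square-root allocation whenever no bound is active. The paper reports that its own optimal strategy for clustered demands outperforms square-root replication, so its model necessarily differs from the one assumed here.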

       
