ISSN 1000-1239 CN 11-1777/TP

Journal of Computer Research and Development, 2016, Vol. 53, Issue (2): 431-442. doi: 10.7544/issn1000-1239.2016.20148327

A Dynamic Replica Management Mechanism Based on File Support Degree

Xiao Zhongzheng1, Chen Ningjiang1, Jia Jionghao1, Zhang Wenbo2   

  1(School of Computer and Electronic Information, Guangxi University, Nanning 530004); 2(Technology Center of Software Engineering, Institute of Software, Chinese Academy of Sciences, Beijing 100190)
  • Online: 2016-02-01

Abstract: Replication-based management is an important fault tolerance mechanism in large-scale distributed storage systems. To meet the demand for dynamic replication management in such systems, a file popularity index named file support degree and its computation model are proposed. Within this model, each file's parameters are collected periodically. The model combines the self-correlation of the file support degree, the file hits in the previous collection cycle, the accessed data volume and the file's grade, so that it accurately reflects each file's replication requirement. To adapt to variable system load, the model adjusts its parameters dynamically, so that replication decisions reflect the real system status. Building on this model, algorithms for load balancing, replication adjustment and replication clearing are designed. To prevent any single data storage node from being overloaded, a load-balance strategy for data storage nodes is proposed. In this strategy, data storage nodes are divided into 3 groups: a holding group, an acceptable group and a begging group. The system runs 2 periodic procedures: a replication adjusting procedure and a replication clearing procedure. In the replication adjusting procedure, the top P files are replicated to data storage nodes selected according to the load-balance strategy. The replication clearing procedure runs on a longer period, because many adjusting procedures are needed before the begging group becomes empty. Experiments show that this dynamic replication management mechanism is effective.

Key words: distributed storage, dynamic replication management, load balancing, file support degree, fault tolerance
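
The abstract names the inputs of the file support degree (its own previous value, hits in the previous collection cycle, accessed data volume and file grade) but does not give the formula. The Python sketch below is only an illustration of how such an index could be computed and used to pick the top P files. The weighted-sum form, the weight values, the normalization by cycle totals, and the names `FileStats`, `support_degree` and `top_p` are all assumptions, not the paper's model.

```python
from dataclasses import dataclass
import heapq


@dataclass
class FileStats:
    name: str
    prev_support: float  # support degree computed in the previous cycle
    hits: int            # accesses during the previous collection cycle
    bytes_read: int      # data volume accessed during that cycle
    grade: float         # file grade, assumed here to lie in [0, 1]


def support_degree(f, total_hits, total_bytes, w=(0.4, 0.3, 0.2, 0.1)):
    """Hypothetical weighted sum of the four inputs named in the abstract.

    The weights w are placeholders; the paper adjusts its parameters
    dynamically with system load, which this sketch does not model.
    """
    hit_share = f.hits / total_hits if total_hits else 0.0
    vol_share = f.bytes_read / total_bytes if total_bytes else 0.0
    return (w[0] * f.prev_support + w[1] * hit_share
            + w[2] * vol_share + w[3] * f.grade)


def top_p(files, p):
    """Return the p files with the highest support degree, best first."""
    total_hits = sum(f.hits for f in files)
    total_bytes = sum(f.bytes_read for f in files)
    scored = [(support_degree(f, total_hits, total_bytes), f.name)
              for f in files]
    return heapq.nlargest(p, scored)


if __name__ == "__main__":
    files = [
        FileStats("a", 0.5, 10, 100, 0.5),
        FileStats("b", 0.1, 90, 900, 0.2),
        FileStats("c", 0.0, 0, 0, 0.0),
    ]
    for score, name in top_p(files, 2):
        print(name, round(score, 3))
```

In a full system, the files returned by `top_p` would then be replicated to nodes chosen by the load-balance strategy; node grouping is left out here because the abstract does not define the criteria for the three groups.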

