Abstract:
Because it offers high concurrency, high scalability, and high cost-effectiveness, cluster storage has become an important way to build large data centers. As a system grows, its failures increase dramatically, and improving system reliability to ensure continuous data access has become a key issue for cluster storage. The cluster RAID system extends traditional RAID technology to the cluster storage environment and provides an effective solution to the reliability problem. This paper proposes an analytic reliability model of a cluster RAID5 storage system based on the Markov model, in order to quantify the impact of parameters such as the storage topology, the node failure rate, the disk failure rate, and the rebuild speed on cluster RAID system reliability. The analysis shows three things. First, multi-tier cluster RAID5 has an advantage over single-tier cluster RAID5 and is better suited to building a cluster RAID system. Second, increasing the rebuild speed improves cluster RAID5 system reliability by almost the same factor. Third, for the same system reliability, a higher rebuild speed reduces the need for a large node MTTF (mean time to failure): a tenfold increase in rebuild speed can reduce the required node MTTF to as little as one-seventh of its original value.
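As a rough illustration of the kind of Markov analysis described above, the sketch below uses the textbook three-state model of a single RAID5 array (healthy, degraded, data loss). It is not the cluster model proposed in this paper. The function name `raid5_mttdl`, the 8-disk group size, the 1,000,000-hour disk MTTF, and the rebuild times are all hypothetical values chosen for illustration.

```python
def raid5_mttdl(n_disks, disk_mttf_h, rebuild_h):
    """Mean time to data loss (hours) for a single RAID5 array.

    Textbook 3-state Markov chain (an assumption, not the paper's model):
      healthy  -> degraded at rate n * lam
      degraded -> healthy  at rate mu (rebuild completes)
      degraded -> loss     at rate (n - 1) * lam
    Solving the expected-absorption-time equations from the healthy
    state gives the closed form returned below.
    """
    lam = 1.0 / disk_mttf_h  # per-disk failure rate (1/hours)
    mu = 1.0 / rebuild_h     # rebuild rate (1/hours)
    n = n_disks
    return ((2 * n - 1) * lam + mu) / (n * (n - 1) * lam ** 2)


# Hypothetical parameters: 8 disks, 1,000,000 h disk MTTF.
base = raid5_mttdl(8, 1_000_000.0, 24.0)  # 24 h rebuild
fast = raid5_mttdl(8, 1_000_000.0, 2.4)   # 10x faster rebuild

# Improvement factor in MTTDL from a 10x faster rebuild.
print(fast / base)
```

Under these assumptions the printed ratio is close to 10, which agrees with the observation that reliability improves almost in proportion to rebuild speed when rebuilds are much faster than failures.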