    云计算系统可靠性研究综述

    Reliability in Cloud Computing System: A Review

    • Abstract: As a new computing paradigm, cloud computing has attracted extensive attention from both academia and industry. Built on resource virtualization technology, it provides users with infrastructure, platform, and software services on demand in a “pay-as-you-go” manner. Because it also offers highly scalable computing resources, more and more enterprises and organizations choose cloud computing platforms to deploy their scientific or commercial applications. However, as the number of cloud users grows, cloud data centers are expanding rapidly and their architectures are becoming increasingly complex, so runtime failures occur frequently and cause heavy losses. Ensuring reliability in such large-scale, architecturally complex cloud computing systems has therefore become a highly challenging problem. This paper first summarizes the common failures in cloud computing systems, introduces several methods for evaluating the reliability of cloud computing, and describes in detail the key fault management techniques currently used to improve reliability. Since fault management techniques inevitably increase the energy consumption of cloud systems, the paper then reviews current research on the trade-off between reliability and energy consumption in cloud computing. Finally, it lists the major challenges in current research on cloud computing reliability.

       
