    Citation: Wang Haifeng, Chen Qingkui. Multi-Indices Self-Approximate Optimal Power Consumption Control Model of GPU Clusters[J]. Journal of Computer Research and Development, 2015, 52(1): 105-115. DOI: 10.7544/issn1000-1239.2015.20131195

    Multi-Indices Self-Approximate Optimal Power Consumption Control Model of GPU Clusters

    Abstract: GPU clusters are an important class of high-performance parallel computing system for real-time processing of large-scale stream data, where computing speed, power consumption, and reliability all carry demanding requirements. These three indices constrain one another, so the optimal trade-off among them has to be found dynamically during real-time computation, which makes real-time multi-index optimization in GPU clusters a challenging problem. To consider the three indices together, the maximum entropy function method is used to combine them into a single comprehensive performance index. An adaptive control model is then built on model predictive control theory. It dynamically adjusts the power states of the cluster nodes as the workload varies, cutting redundant energy consumption and holding the cluster's power consumption under a specified set point while guaranteeing computing speed and reliability. In comparison experiments against a baseline control model that does not consider reliability, the proposed model shows better control stability and robustness, making it well suited to power management of GPU clusters that process large-scale stream data in real time.
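    As an illustration of the aggregation step only: the maximum entropy function is commonly written as the log-sum-exp aggregate F_p = (1/p) ln(sum_i exp(p * f_i)), a smooth upper bound on max_i f_i that tightens as p grows. The abstract does not give the paper's exact formulation, weights, or normalization, so the function name, the parameter p, and the sample cost values in this sketch are assumptions, not the authors' model.

```python
import math


def max_entropy_aggregate(costs, p=20.0):
    """Smooth approximation of max(costs): (1/p) * ln(sum(exp(p * c))).

    Satisfies max(costs) <= result <= max(costs) + ln(len(costs)) / p.
    """
    m = max(costs)  # shift by the maximum so exp() cannot overflow
    return m + math.log(sum(math.exp(p * (c - m)) for c in costs)) / p


# Hypothetical normalized costs in [0, 1], lower is better.
# These are made-up illustration values, not measurements from the paper.
costs = {"speed": 0.30, "power": 0.55, "reliability": 0.20}
score = max_entropy_aggregate(list(costs.values()))
print(f"comprehensive index: {score:.4f}")
```

    A controller could then minimize this single differentiable score instead of juggling three separate objectives; the worst of the three indices dominates the result, which is the property that makes the aggregate useful for balancing them.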

       
