ISSN 1000-1239 CN 11-1777/TP

• Article •

A Hybrid Fault Diagnosis Model for Distributed Application Management

1. (School of Computer Science and Engineering, Beihang University, Beijing 100191) (lych@buaa.edu.cn)
• Published: 2010-03-15

A Hybrid Fault Diagnosis Model in Distributed Application Management

Li Yunchun and Qin Xianlong

1. (School of Computer Science and Engineering, Beihang University, Beijing 100191)
• Online: 2010-03-15

Abstract: Fault management is a key research topic in distributed application management. Because distributed applications are dynamic and complex, traditional methods cannot meet the needs of fault management. Autonomic computing offers a solution by enabling system self-management, which is basically divided into two procedures: self-awareness and self-adaptation. This paper mainly deals with realizing system self-awareness based on fault diagnosis. First, a hybrid fault diagnosis model is proposed after analyzing fault propagation in distributed application management. According to this model, the fault diagnosis process is divided into two steps: application service fault diagnosis and network service fault diagnosis. Second, because observations of network faults are uncertain and inaccurate, the fault diagnosis model is mapped to a Bayesian network to carry out uncertainty reasoning. Finally, because exact inference in a Bayesian network is computationally complex, the original inference algorithm is improved to diagnose the root cause on a multi-layer Bayesian network corresponding to the multi-layer FPM model. Experiments show that the improved algorithm accelerates the inference procedure.
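
As an illustration of the uncertainty reasoning described above, the following minimal sketch maps a two-layer fault propagation model to a Bayesian network and diagnoses the most probable root cause by exact enumeration. It is not the paper's algorithm or its improved variant. The fault names, prior probabilities, propagation strengths, leak term, and the noisy-OR combination rule are all assumptions chosen for illustration.

```python
from itertools import product

# Hypothetical prior fault probabilities for network-layer services.
PRIORS = {"link": 0.01, "dns": 0.02}

# Hypothetical propagation strengths: P(symptom | only this cause is faulty).
EDGES = {
    "web_timeout": {"link": 0.9, "dns": 0.7},
    "mail_timeout": {"link": 0.8},
}
LEAK = 0.001  # assumed chance a symptom appears with no modeled cause


def symptom_prob(symptom, state):
    """Noisy-OR: probability the symptom is present given the cause states."""
    p_absent = 1.0 - LEAK
    for cause, strength in EDGES[symptom].items():
        if state[cause]:
            p_absent *= 1.0 - strength
    return 1.0 - p_absent


def posterior(observed):
    """Exact inference by enumerating every joint assignment of the causes."""
    causes = list(PRIORS)
    total = 0.0
    marginals = {c: 0.0 for c in causes}
    for values in product([False, True], repeat=len(causes)):
        state = dict(zip(causes, values))
        weight = 1.0
        # Prior probability of this combination of faulty/healthy causes.
        for c in causes:
            weight *= PRIORS[c] if state[c] else 1.0 - PRIORS[c]
        # Likelihood of the observed application-layer symptoms.
        for symptom, present in observed.items():
            p = symptom_prob(symptom, state)
            weight *= p if present else 1.0 - p
        total += weight
        for c in causes:
            if state[c]:
                marginals[c] += weight
    return {c: marginals[c] / total for c in causes}


result = posterior({"web_timeout": True, "mail_timeout": True})
root_cause = max(result, key=result.get)
print(root_cause, result)
```

The enumeration visits 2^n cause assignments for n candidate faults, which shows why exact inference becomes costly as the network grows and why a faster inference procedure is of interest.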