Abstract:
In large-scale datacenter networks, link fault detection is an important means of guaranteeing network connectivity and the performance of large-scale online applications. Currently, link fault detection is provided by middleboxes or switches. With the development of software-defined networking (SDN) and network function virtualization (NFV), many network functions are being decoupled from hardware devices and deployed in the cloud as services. However, existing link fault detection methods face several challenges, including long detection times, high bandwidth usage, and server overload. To tackle these challenges, we first analyze existing work on link fault detection. We then propose the concept of a probe matrix and a link fault detection method based on probe matrix optimization. We also design a service framework that combines the link fault detection controller with the SDN controller. Finally, simulation results show that the proposed method significantly outperforms existing work in detection period, bandwidth usage, and server CPU usage, with tolerable computational overhead for probe matrix optimization.