ISSN 1000-1239 CN 11-1777/TP

Journal of Computer Research and Development, 2018, Vol. 55, Issue (4): 815-830. doi: 10.7544/issn1000-1239.2018.20160970

• Information Security •

Anomaly Detection Algorithm for Data Center Networks Based on the Link State Database

许刚1,2,王展1,臧大伟1,安学军1   

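  1(High Performance Computer Research Center, Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190); 2(University of Chinese Academy of Sciences, Beijing 100049) (xugang10@ict.ac.cn)
  • Published: 2018-04-01
  • Supported by: 
    the National Key Research and Development Program of China (2016YFB0200300); the National Natural Science Foundation of China (61572464); the Strategic Priority Research Program of the Chinese Academy of Sciences (XDB24050200)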

Anomaly Detection Algorithm of Data Center Network Based on LSDB

Xu Gang1,2, Wang Zhan1, Zang Dawei1, An Xuejun1   

  1(High Performance Computer Research Center, Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190); 2(University of Chinese Academy of Sciences, Beijing 100049)
  • Online: 2018-04-01

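Abstract: Routes inside Internet data centers (IDCs) currently change frequently because of network attacks, network configuration, and other causes. However, because effective monitoring software is lacking, routing anomalies and route flapping are hard to detect and faults are hard to locate. When a data center service suffers a network failure, the failure point cannot be confirmed, which prolongs repair time, degrades user experience, and reduces operating revenue. After analyzing the network architecture, communication protocols, and route calculation principles of current mainstream data centers, this paper proposes LSAP, a data center network anomaly detection method based on the link state database (LSDB). The method collects the LSDB, uses an improved routing algorithm to compute the routes of the whole network and build a routing information base (RIB), and compares LSDB snapshots with RIB snapshots to accurately correlate link changes with route changes, thereby discovering link anomalies and routing anomalies and locating faults. LSAP computes routing tables in real time on a big data analysis platform and can process over one hundred million routing entries within seconds, which meets the analysis speed required by current data centers. In a trial deployment in a data center network, LSAP quickly discovered topology changes, reconstructed routing tables, statistically analyzed all route changes, and detected routing anomalies and routing attacks before the services running on the network noticed them. It requires few changes to the network, and because it collects data passively it does not affect the stability of the network itself, which makes it suitable for data centers with high stability requirements.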

Keywords: data center network, link state database, routing table, routing anomaly, fast fault location

Abstract: At present, routing tables in data center networks change frequently because of network attacks, network configuration, and similar causes. However, for lack of effective monitoring software, routing anomalies and route flapping are difficult to detect and faults are difficult to locate. When a data center network failure occurs, the failure point cannot be identified, which extends repair time, degrades user experience, and reduces operating income. This paper analyzes the network architecture, communication protocols, and route calculation principles of current mainstream data centers, and then proposes LSAP, a data center network anomaly detection method based on the link state database (LSDB). LSAP collects the LSDB and uses an improved routing algorithm to calculate the routes of the whole network, forming a routing information base (RIB). By comparing LSDB snapshots with RIB snapshots, it correlates link changes with route changes, finds abnormal links and abnormal routes, and locates faults. Built on a big data analysis platform, LSAP computes routing tables in real time and processes over one hundred million routing entries within seconds, which meets the analysis speed required by current data centers. In a trial deployment in a data center network, LSAP quickly discovered topology changes, reconstructed routing tables, and produced a statistical analysis of all route changes. It requires few changes to the network and, because it only collects data passively, does not affect network stability, so it is suitable for data centers with high stability requirements.
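
The snapshot comparison described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' LSAP implementation: the LSDB is assumed to be a plain `{router: {neighbor: cost}}` dictionary (real link state advertisements carry far more fields), plain Dijkstra stands in for the paper's improved routing algorithm, and the names `compute_rib`, `links`, and `diff` are hypothetical.

```python
import heapq


def compute_rib(lsdb):
    """Build a toy RIB {(src, dst): (cost, next_hop)} from an assumed LSDB.

    lsdb is a hypothetical snapshot of the form {router: {neighbor: cost}}.
    Plain Dijkstra is run from every router; this is a stand-in, not the
    improved routing algorithm the paper refers to.
    """
    rib = {}
    for src in lsdb:
        best = {}                    # dst -> (cost, next_hop), settled nodes
        heap = [(0, src, src)]       # (path cost, node, first hop from src)
        while heap:
            cost, node, hop = heapq.heappop(heap)
            if node in best:
                continue             # already settled via a cheaper path
            best[node] = (cost, hop)
            for nbr, weight in lsdb.get(node, {}).items():
                if nbr not in best:
                    # Leaving src, the first hop is the neighbor itself.
                    first_hop = nbr if node == src else hop
                    heapq.heappush(heap, (cost + weight, nbr, first_hop))
        for dst, entry in best.items():
            if dst != src:
                rib[(src, dst)] = entry
    return rib


def links(lsdb):
    """Flatten an LSDB snapshot into {(router, neighbor): cost}."""
    return {(a, b): w for a, nbrs in lsdb.items() for b, w in nbrs.items()}


def diff(old, new):
    """Return (added, removed, changed) entries between two dict snapshots."""
    added = {k: new[k] for k in new.keys() - old.keys()}
    removed = {k: old[k] for k in old.keys() - new.keys()}
    changed = {k: (old[k], new[k])
               for k in old.keys() & new.keys() if old[k] != new[k]}
    return added, removed, changed


if __name__ == "__main__":
    # Two hypothetical LSDB snapshots: the B-C link disappears in the second.
    before = {"A": {"B": 1, "C": 5}, "B": {"A": 1, "C": 1}, "C": {"A": 5, "B": 1}}
    after = {"A": {"B": 1, "C": 5}, "B": {"A": 1}, "C": {"A": 5}}

    _, lost_links, _ = diff(links(before), links(after))
    _, _, moved_routes = diff(compute_rib(before), compute_rib(after))

    print("links removed:", sorted(lost_links))
    for key in sorted(moved_routes):
        print("route changed:", key, moved_routes[key])
```

Pairing the link diff with the route diff taken over the same two snapshots is what allows a route change to be attributed to a specific link event, which is the correlation step the abstract describes.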

Key words: data center network, link state database (LSDB), routing table, routing anomaly, fault location

CLC Number: