Abstract:
With the bandwidth of backbone network link increasing geometrically, mining the frequent items over network flows promptly and accurately is important for network management and network security. Inspired by SS counting method, an integrated weighted frequent items mining algorithm IWFIM over network flows, whose pruning strategy is subject to the constraints of time and flow length, is proposed according to the property of flows. The weight of each flow item is endowed by time and flow length and the item with the minimum weight is deleted during the operation of pruning for IWFIM. Then, based on IWFIM, another frequent items mining algorithm CBF_IWFIM with the capability of combining the advantages of hashing method and counting method is proposed according to the property of heavy-tailed distribution of flows. The improved counting Blooming filter is used to filter the majority of small flows without saving flows’ information and IWFIM is introduced to identify the frequent items afterwards for CBF_IWFIM. The experiments over real network traffic show that CBF_IWFIM and IWFIM are very space-saving and precise, and they can achieve much more reasonable measurement accuracy than other three frequent items mining algorithms like SS. Even in the situation of consuming one-third of the space cost in other three algorithms, the two algorithms CBF_IWFIM and IWFIM still perform better than other three algorithms like SS.