Abstract:
Peer-to-peer (P2P) overlay networks are inherently distributed systems and have attracted increasing attention. P2P technology is now applied in file sharing, streaming media, instant messaging, and other fields, and P2P traffic accounts for more than 60% of Internet traffic. To manage and control this traffic more effectively, a P2P traffic identification model needs to be studied in depth. First, a machine learning model based on the wavelet support vector machine (ML-WSVM) is proposed to identify known and unknown P2P traffic. In the ML-WSVM model, the wavelet is combined with the support vector machine by replacing the existing support vector machine kernel functions with a wavelet basis function that satisfies the wavelet framework and the conditions of the Mercer theorem. The proposed model thus exploits both the multi-scale features of the wavelet and the classification strengths of the support vector machine. Then, an improved sequential minimal optimization (SMO) algorithm based on a loss function is proposed to solve for the optimal hyperplane of the ML-WSVM model. Finally, theoretical analysis and experimental results show that the ML-WSVM model can greatly improve the accuracy and efficiency of P2P traffic identification, particularly for encrypted packets.
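The idea of using a wavelet basis function as the kernel can be illustrated with a minimal sketch. The abstract does not state which mother wavelet is used, so the Morlet-type form h(u) = cos(1.75u) exp(-u^2/2) below is an assumption taken from the general wavelet-kernel literature; the function name wavelet_kernel, the dilation parameter a, and the synthetic feature matrix are likewise hypothetical. The sketch builds the Gram matrix and checks numerically that it is positive semidefinite, which is what the Mercer condition requires.

```python
import numpy as np

def wavelet_kernel(X, Y, a=1.0):
    """Translation-invariant wavelet kernel (assumed Morlet-type form).

    K(x, y) = prod_i h((x_i - y_i) / a), with h(u) = cos(1.75 u) * exp(-u^2 / 2).
    The mother wavelet and the dilation `a` are illustrative assumptions.
    """
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)
    # Pairwise scaled differences, shape (n_x, n_y, n_features).
    U = (X[:, None, :] - Y[None, :, :]) / a
    # Evaluate the mother wavelet per feature, then multiply across features.
    return np.prod(np.cos(1.75 * U) * np.exp(-0.5 * U ** 2), axis=2)

# Synthetic stand-in for per-flow traffic features (not real data).
rng = np.random.default_rng(0)
X = rng.normal(size=(20, 3))

K = wavelet_kernel(X, X, a=2.0)

# Mercer check: the Gram matrix should be symmetric with no
# eigenvalue below zero (up to floating-point error).
min_eig = np.linalg.eigvalsh(K).min()
```

A function of this shape, taking two sample matrices and returning their Gram matrix, could be supplied as a callable kernel to an off-the-shelf SVM solver such as scikit-learn's SVC. That would use the library's standard solver, not the improved SMO algorithm proposed here.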