Abstract:
As smart terminals and social networks become increasingly integrated into people’s daily lives, identifying user behavior in social software plays a growing role in network management, network environment supervision, and market research. Social software commonly uses end-to-end encryption protocols for data transmission, and existing methods usually extract statistical features of the encrypted data for behavior identification. However, these methods have unstable identification performance and require large amounts of data, which limits their practicality. We propose a user behavior identification method for social software based on encrypted traffic. First, stable control flow data are identified in the encrypted traffic, and the payload length sequence of the control service packets is extracted. Then, two neural network models are designed to automatically extract features from the control flow payload length sequences and identify user behavior at a fine granularity. Finally, experiments are conducted with WhatsApp as an example: the precision, recall, and F1-score of both neural network models for recognizing WhatsApp user behavior exceed 96%. An experimental comparison with similar work demonstrates the stability of the method’s identification performance. In addition, the method achieves high identification precision with only a few control packets, which is highly relevant to the study of real-time behavior identification.
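
To make the idea of a control flow payload length sequence concrete, the following is a minimal sketch in Python, not the authors' implementation. It assumes packets have already been parsed into hypothetical `(flow_id, payload_length)` records and that the control flow has already been identified. The function name `payload_length_sequence`, the fixed length `seq_len`, the zero padding, and the choice to skip empty-payload packets are all illustrative assumptions.

```python
from typing import Iterable, List, Tuple

# Hypothetical packet record: (flow_id, payload_length_in_bytes).
Packet = Tuple[str, int]


def payload_length_sequence(
    packets: Iterable[Packet],
    control_flow_id: str,
    seq_len: int = 8,
    pad_value: int = 0,
) -> List[int]:
    """Build a fixed-length payload length sequence for one control flow.

    Assumptions (not from the source): packets with an empty payload are
    skipped, long sequences are truncated, and short ones are padded.
    """
    lengths = [
        length
        for flow_id, length in packets
        if flow_id == control_flow_id and length > 0
    ]
    lengths = lengths[:seq_len]  # keep only the first seq_len packets
    return lengths + [pad_value] * (seq_len - len(lengths))


# Toy trace mixing a control flow ("ctrl") with a media flow ("media").
packets = [
    ("ctrl", 66), ("media", 1400), ("ctrl", 0),
    ("ctrl", 128), ("media", 1380), ("ctrl", 72),
]
sequence = payload_length_sequence(packets, "ctrl", seq_len=5)
print(sequence)
```

A sequence of this form could then serve as the input vector to a classifier. Keeping `seq_len` small reflects the abstract's point that only a few control packets are needed.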