Abstract:
Malware classification is a key problem in malicious code analysis and intrusion detection. Existing malware classification approaches suffer from low efficiency and poor accuracy because raw behavior analysis data is large in scale, contains substantial noise, and is subject to interference from random factors. To address these issues, this paper takes malware behavior reports as raw data and analyzes malware behavior characteristics, operation similarity, the interference of random factors, and noisy behavior data. It then proposes a parameter valid window model for system calls, which improves the ability of operation sequences to describe behavior similarity. On this basis, the paper presents a malware classification approach based on the naive Bayes machine learning model and the parameter valid window. In addition, an automatic malware behavior classifier prototype called MalwareFilter is designed and implemented. In a case study, we evaluate the prototype using system call sequence reports generated from real malware. The experimental results show that our approach is effective, and that the parameter valid window improves both the performance and the accuracy of training and classification.